Commit Graph

137 Commits

Author SHA1 Message Date
Joseph Myers
b3f27d8150 Add narrowing fma functions
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.

The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well.  As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma.  The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.

The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf).  Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.

This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done).  (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)

Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float).  The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-22 21:25:31 +00:00
Joseph Myers
4b6574a6f6 Redirect fma calls to __fma in libm
include/math.h has a mechanism to redirect internal calls to various
libm functions, that can often be inlined by the compiler, to call
non-exported __* names for those functions in the case when the calls
aren't inlined, with the redirection being disabled when
NO_MATH_REDIRECT.  Add fma to the functions to which this mechanism is
applied.

At present, libm-internal fma calls (generally to __builtin_fma*
functions) are only done when it's known the call will be inlined,
with alternative code not relying on an fma operation being used in
the caller otherwise.  This patch is in preparation for adding the TS
18661 / C2X narrowing fma functions to glibc; it will be natural for
the narrowing function implementations to call the underlying fma
functions unconditionally, with this either being inlined or resulting
in an __fma* call.  (Using two levels of round-to-odd computation like
that, in the case where there isn't an fma hardware instruction, isn't
optimal but is certainly a lot simpler for the initial implementation
than writing different narrowing fma implementations for all the
various pairs of formats.)

Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch (using
<https://sourceware.org/pipermail/libc-alpha/2021-September/130991.html>
to fix installed library stripping in build-many-glibcs.py).  Also
tested for x86_64.
2021-09-15 22:57:35 +00:00
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Adhemerval Zanella
3b5ebe85aa sparc: Use atomic compiler builtins on sparc
This patch removes the arch-specific atomic instruction, relying on
compiler builtins.  The __sparc32_atomic_locks support is removed
and a configure check is added to check if compiler uses libatomic
to implement CAS.

It also removes the sparc specific sem_* and pthread_barrier_*
implementations.  It in turn allows buidling against a LEON3/LEON4
sparcv8 target, although it will still be incompatible with generic
sparcv9.

Checked on sparcv9-linux-gnu and sparc64-linux-gnu.  I also checked
with build against sparcv8-linux-gnu with -mcpu=leon3.

Tested-by: Andreas Larsson <andreas@gaisler.com>
2019-11-27 10:31:13 -03:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
Adhemerval Zanella
1e372ded4f Refactor hp-timing rtld usage
This patch refactor how hp-timing is used on loader code for statistics
report.  The HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed and
HP_TIMING_INLINE is used instead to check for hp-timing avaliability.
For alpha, which only defines HP_SMALL_TIMING_AVAIL, the HP_TIMING_INLINE
is set iff for IS_IN(rtld).

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all afected ABIs.

	* benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with
	HP_TIMING_INLINE.
	* nptl/descr.h: Likewise.
	* elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF,
	RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define.
	(dl_start_final_info, _dl_start_final, dl_main, print_statistics):
	Abstract hp-timing usage with RTLD_* macros.
	* sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld).
	(HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove.
	* sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL,
	HP_TIMING_NONAVAIL): Likewise.
	* sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL,
	HP_SMALL_TIMING_AVAIL): Likewise.
	* sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
	Likewise.
	* sysdeps/generic/hp-timing-common.h: Update comment with
	HP_TIMING_AVAIL removal.
2019-03-22 17:30:44 -03:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Joseph Myers
81dca813cc Use copysign functions not __copysign functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __copysign functions to call
the corresponding copysign names instead, with asm redirection to
__copysign when the calls are not inlined (all cases are inlined
except for IBM long double for powerpc soft-float / e500v1).  This
eliminates the need for an inline function defining __copysign in
terms of __builtin_copysign.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT]
	(MATH_REDIRECT_BINARY_ARGS): New macro.
	[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
	&& !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT.
	* sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/alpha/fpu/s_copysignf.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_copysign.c: Likewise.
	* sysdeps/ieee754/float128/s_copysignf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_copysignf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
	Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise.
	* sysdeps/riscv/rvd/s_copysign.c: Likewise.
	* sysdeps/riscv/rvf/s_copysignf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c:
	Likewise.
	* sysdeps/generic/math_private_calls.h
	[!__MATH_DECLARING_LONG_DOUBLE || !NO_LONG_DOUBLE] (__copysign):
	Do not declare and define as an inline function.
	* math/divtc3.c (__divtc3): Use copysign functions instead of
	__copysign variants.
	* math/multc3.c (__multc3): Likewise.
	* sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise.
	* sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise.
	* sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
	Likewise.
	* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
	(__ieee754_yn): Likewise.
	* sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise.
	* sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise.
	* sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise.
	* sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise.
	* sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise.
	(__sin): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint):
	Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln):
	Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn):
	Likewise.
	* sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
	(__ieee754_ynf): Likewise.
	* sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise.
	* sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise.
	* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
	(__ieee754_ynl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise.
	* sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
	(__ieee754_ynl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
	(__ieee754_ynl)
	* sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise.
	* sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise.
	* sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise.
	* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
2018-09-27 20:04:48 +00:00
Joseph Myers
f29b6f17e4 Use rint functions not __rint functions in glibc libm.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __rint functions to call the
corresponding rint names instead, with asm redirection to __rint when
the calls are not inlined.  The x86_64 math_private.h is removed as no
longer useful after this patch.

This patch is relative to a tree with my floor patch
<https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied,
and much the same considerations arise regarding possibly replacing an
IFUNC call with a direct inline expansion.

Tested for x86_64, and with build-many-glibcs.py.

	* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
	__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect
	using MATH_REDIRECT.
	* sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before
	header inclusion.
	* sysdeps/aarch64/fpu/s_rintf.c: Likewise.
	* sysdeps/alpha/fpu/s_rint.c: Likewise.
	* sysdeps/alpha/fpu/s_rintf.c: Likewise.
	* sysdeps/i386/fpu/s_rintl.c: Likewise.
	* sysdeps/ieee754/dbl-64/s_rint.c: Likewise.
	* sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise.
	* sysdeps/ieee754/float128/s_rintf128.c: Likewise.
	* sysdeps/ieee754/flt-32/s_rintf.c: Likewise.
	* sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
	* sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise.
	* sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise.
	* sysdeps/powerpc/fpu/s_rint.c: Likewise.
	* sysdeps/powerpc/fpu/s_rintf.c: Likewise.
	* sysdeps/riscv/rv64/rvd/s_rint.c: Likewise.
	* sysdeps/riscv/rvf/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/math_private.h: Remove file.
	* math/e_scalb.c (invalid_fn): Use rint functions instead of
	__rint variants.
	* math/e_scalbf.c (invalid_fn): Likewise.
	* math/e_scalbl.c (invalid_fn): Likewise.
	* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
	Likewise.
	* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
	Likewise.
	* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
	* sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise.
	* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
	Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise.
	* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
2018-09-14 13:10:39 +00:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Patrick McGehearty
e6a1c5dc77 sparc: M7 optimized memset/bzero
Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Performance tests run on Sparc S7 using new code and old niagara4 code.

Optimizations for memset also apply to bzero as they share code.

For memset/bzero, performance comparison with niagara4 code:
For memset nonzero data,
  256-1023 bytes - 60-90% gain (in cache); 5% gain (out of cache)
  1K+ bytes - 80-260% gain (in cache); 40-80% gain (out of cache)
For memset zero data (and bzero),
  256-1023 bytes - 80-120% gain (in cache), 0% gain (out of cache)
  1024+ bytes - 2-4x gain (in cache), 10-35% gain (out of cache)

Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.

	Patrick McGehearty <patrick.mcgehearty@oracle.com>
	Adhemerval Zanella  <adhemerval.zanella@linaro.org>

	* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
	(sysdeps_routines): Add memset-niagara7.
	* sysdeps/sparc/sparc64/multiarch/Makefile (sysdes_rotuines):
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memset-niagara7.S: New
	file.
	* sysdeps/sparc/sparc64/multiarch/memset-niagara7.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add __bzero_niagara7 and __memset_niagara7.
	* sysdeps/sparc/sparc64/multiarch/ifunc-memset.h (IFUNC_SELECTOR):
	Add niagara7 option.
	* NEWS: Mention sparc m7 optimized memcpy, mempcpy, memmove, and
	memset.
2017-12-14 08:48:19 -02:00
Patrick McGehearty
1b6e07f8e0 sparc: M7 optimized memcpy/mempcpy/memmove
Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Performance tests run on Sparc S7 using new code and old niagara4 code.

Optimizations for memcpy also apply to mempcpy and memmove
where they share code. Optimizations for memset also apply
to bzero as they share code.

For memcpy/mempcpy/memmove, performance comparison with niagara4 code:
Long word aligned data
  0-127 bytes - minimal changes
  128-1023 bytes - 7-30% gain
  1024+ bytes - 1-7% gain (in cache); 30-100% gain (out of cache)
Word aligned data
  0-127 bytes - 50%+ gain
  128-1023 bytes - 10-200% gain
  1024+ bytes - 0-15% gain (in cache); 5-50% gain (out of cache)
Unaligned data
  0-127 bytes - 0-70%+ gain
  128-447 bytes - 40-80%+ gain
  448-511 bytes - 1-3% loss
  512-4096 bytes - 2-3% gain (in cache); 0-20% gain (out of cache)
  4096+ bytes - ± 3% (in cache); 20-50% gain (out of cache)

Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.

	Patrick McGehearty  <patrick.mcgehearty@oracle.com>
	Adhemerval Zanella  <adhemerval.zanella@linaro.org>

	* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
	(sysdeps_routines): Add memcpy-memmove-niagara7 and memmove-ultra1.
	* sysdeps/sparc/sparc64/multiarch/Makefile (sysdeps_routines):
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy-memmove-niagara7.S:
	New file.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memmove-ultra1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/rtld-memmove.c: Likewise.
	* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add __memcpy_niagara7, __mempcpy_niagara7,
	and __memmove_niagara7.
	* sysdeps/sparc/sparc64/multiarch/ifunc-memcpy.h (IFUNC_SELECTOR):
	Add niagara7 option.
	* sysdeps/sparc/sparc64/multiarch/memmove.c: New file.
	* sysdeps/sparc/sparc64/multiarch/ifunc-memmove.h: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy-memmove-niagara7.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memmove-ultra1.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/rtld-memmove.c: Likewise.
2017-12-14 08:47:38 -02:00
Jose E. Marchesi
767a26d6a0 sparc: assembly version of memmove for ultra1+
Tested in sparcv9-*-* and sparc64-*-* targets in both non-multi-arch and
multi-arch configurations.

	* sysdeps/sparc/sparc32/sparcv9/memmove.S: New file.
	* sysdeps/sparc/sparc32/sparcv9/rtld-memmove.c: Likewise.
	* sysdeps/sparc/sparc64/memmove.S: Likewise.
	* sysdeps/sparc/sparc64/rtld-memmove.c: Likewise.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-14 08:47:09 -02:00
Adhemerval Zanella
5b7bd9756c sparc: Fix sparv9 multiarch build
Fix build caused by 5b4e5e7869.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Fix build
	due redirect macro.
2017-12-01 15:47:23 -02:00
Adhemerval Zanella
2a14526bfa sparc: refactor cpu_relax to C
* sysdeps/sparc/sparc64/cpu_relax.c: New file.
	* sysdeps/sparc/sparc32/sparcv9/cpu_relax.c: Likewise.
	* sysdeps/sparc/sparc64/cpu_relax.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/cpu_relax.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-01 11:51:10 -02:00
Adhemerval Zanella
1c051a9b09 sparc: refactor sparc32 nearbyint{f} selector to C
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_nearbyint{f}-generic.S).

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_nearbyintf-generic and
	s_nearbyint-generic.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-generic.S:
	New file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-generic.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf.c:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint.S: Remove
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf.S:
	Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-01 11:51:08 -02:00
Adhemerval Zanella
dbeb74ef84 sparc: refactor sparc32 rint{f} selector to C
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_rint{f}-generic.S).

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_rintf-generic and s_rint-generic.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint-generic.S: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf-generic.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-01 11:49:36 -02:00
Adhemerval Zanella
fa7ded9612 sparc: refactor sparc32 llrint{f} selector to C
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_llrint{f}-generic.S).

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_llrintf-generic and s_llrint-generic.
	* sysdeps/sparc/sparcv9/fpu/multiarch/s_llrint-generic.S: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf-generic.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-01 11:49:19 -02:00
Adhemerval Zanella
e240cf0e0e sparc: refactor sparc32 fabs{f} selector to C
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_fabs{f}-generic.S).

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	(libm-sysdep_routines): Add s_fabsf-generic and s_fabs-generic.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs-generic.S: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf-generic.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-01 11:48:58 -02:00
Adhemerval Zanella
5b4e5e7869 sparc: refactor sparc32 copysign selector to C
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_copysign{f}-generic.S).

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	(sysdep_calls): New rule.
	(sysdep_routines): Use sysdep_calls as base.
	(libm-sysdep_routines): Add generic rule for symbols shared with
	libc.  Add s_copysign-generic and s_copysign-generic objects.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign-generic.S:
	New file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf-generic.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-01 11:48:39 -02:00
Joseph Myers
18305fba55 Remove SPARC lllrint aliases.
The sparc32/sparcv9/fpu/multiarch implementations of llrint / llrintf
have aliases lllrint / lllrintf.  No such function is exported from or
used in libm and these aliases should not be there; I expect they
arose accidentally in the course of converting a 64-bit implementation
(where lrint and llrint can be aliases) to a 32-bit llrint
implementation.  This patch removes those spurious aliases.

Tested (compilation only) with build-many-glibcs.py for
sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.S
	(__lllrint): Remove alias.
	(lllrint): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.S
	(__lllrintf): Likewise.
	(lllrintf): Likewise.
2017-11-30 00:43:28 +00:00
Joseph Myers
3e5efdbdbe Use libm_alias_float for sparc.
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes sparc libm function implementations use
libm_alias_float to define function aliases.

Tested with build-many-glibcs.py for all its sparc configurations that
installed stripped shared libraries are unchanged by the patch.

	* sysdeps/sparc/sparc32/fpu/s_copysignf.S: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/fpu/s_fabsf.S: Include
	<libm-alias-float.h>.
	(fabsf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.S:
	Include <libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf.S: Include
	<libm-alias-float.h>.
	(fabsf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.c: Include
	<libm-alias-float.h>.
	(fdimf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaf.c: Include
	<libm-alias-float.h>.
	(fmaf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf.S:
	Include <libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.S: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_llrintf.S: Include
	<libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_lrintf.S: Include
	<libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S: Include
	<libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_rintf.S: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaf.c: Include
	<libm-alias-float.h>.
	(fmaf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_lrintf.c: Include
	<libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
	(llrintf): Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_nearbyintf.c: Include
	<libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/s_copysignf.S: Include
	<libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/s_fabsf.c: Include
	<libm-alias-float.h>.
	(fabsf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/s_lrintf.S: Include
	<libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
	(llrintf): Likewise.
	* sysdeps/sparc/sparc64/fpu/s_nearbyintf.S: Include
	<libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/sparc/sparc64/fpu/s_rintf.S: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
2017-11-30 00:30:40 +00:00
Joseph Myers
875cd54855 Use libm_alias_double for sparc.
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes sparc libm function implementations use
libm_alias_double to define function aliases (with consequent
simplification where compat symbol handling is now done by those
macros rather than locally in architecture-specific code).

Tested with build-many-glibcs.py for all its sparc configurations that
installed stripped shared libraries are unchanged by the patch.

	* sysdeps/sparc/sparc32/fpu/s_copysign.S: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/fpu/s_fabs.S: Include
	<libm-alias-double.h>.
	(fabs): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S:
	Include <libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include
	<libm-alias-double.h>.
	(fabs): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c: Include
	<libm-alias-double.h>.
	(fdim): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c: Include
	<libm-alias-double.h>.
	(fma): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint.S:
	Include <libm-alias-double.h>.
	(nearbyint): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.S: Include
	<libm-alias-double.h>.
	(rint): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fabs.S: Include
	<libm-alias-double.h>.
	(fabs): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_llrint.S: Include
	<libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S: Include
	<libm-alias-double.h>.
	(nearbyint): Define using libm_alias_double.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_rint.S: Include
	<libm-alias-double.h>.
	(rint): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Include
	<libm-alias-double.h>.
	(ceil): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Include
	<libm-alias-double.h>.
	(floor): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fma.c: Include
	<libm-alias-double.h>.
	(fma): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_lrint.c: Include
	<libm-alias-double.h>.
	(lrint): Define using libm_alias_double.
	(llrint): Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_nearbyint.c: Include
	<libm-alias-double.h>.
	(nearbyint): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Include
	<libm-alias-double.h>.
	(rint): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Include
	<libm-alias-double.h>.
	(trunc): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/s_copysign.S: Include
	<libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/s_fabs.c: Include
	<libm-alias-double.h>.
	(fabs): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/s_lrint.S: Include
	<libm-alias-double.h>.
	(lrint): Define using libm_alias_double.
	(llrint): Likewise.
	* sysdeps/sparc/sparc64/fpu/s_nearbyint.S: Include
	<libm-alias-double.h>.
	(nearbyint): Define using libm_alias_double.
	* sysdeps/sparc/sparc64/fpu/s_rint.S: Include
	<libm-alias-double.h>.
	(rint): Define using libm_alias_double.
2017-11-29 23:40:07 +00:00
Joseph Myers
cf4ebc27fe Fix missing sparcv9 --disable-multi-arch fabsl compat symbol (bug 22229).
The --disable-multi-arch case of sparcv9 libm is missing a fabsl
compat symbol for when long double had the same ABI as double.  This
patch adds the missing compat symbol to this implementation.  As my
fix for other instances of this missing compat symbol postdates the
last release, I'm considering this as being part of bug 22229 that was
missing from my previous fix rather than as a separate issue, and so
as not needing a new bug report in Bugzilla.

Tested (compilation only) with build-many-glibcs.py for
sparcv9-linux-gnu --disable-multi-arch.

	[BZ #22229]
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fabs.S: Include
	<math_ldbl_opt.h>.
	(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
2017-11-29 23:09:03 +00:00
Adhemerval Zanella
a55430cb0e sparc: Assume VIS3 support
This patch assumes VIS3 support by binutils, which is supported since
version 2.22.  This leads to some code simplification, mostly on
multiarch build where there is only one variant instead of previously
two (whether binutils supports VIS3 instructions or not).

For multiarch files where HAVE_AS_VIS3_SUPPORT was checked and
the default implementation was built with a different name, a new
file with (implementation with -generic appended) is added.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

	* config.h.in (HAVE_AS_VIS3_SUPPORT): Remove check for VIS3 support.
	* sysdeps/sparc/configure.ac (HAVE_AS_VIS3_SUPPORT): Likewise.
	* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c: Likewise.
	* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.c: Likewise.
	* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c: Likewise.
	* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fmaf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fma.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaf.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise.
	* sysdeps/sparc/sparc-ifunc.h [!HAVE_AS_VIS3_SUPPORT]
	(SPARC_ASM_VIS3_IFUNC, SPARC_ASM_VIS3_VIS2_IFUNC): Remove macros.
	* sysdeps/sparc/sparc32/sparcv9/Makefile [$(have-as-vis3) != yes]
	(ASFLAGS.o, ASFLAGS-.os, ASFLAGS-.op, ASFLAGS-.oS): Remove rules.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	($(have-as-vis3) == yes): Remove conditional.
	* sysdeps/sparc/sparc64/Makefile (($(have-as-vis3) == yes)):
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-generic.c: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf-generic.c: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma-generic.c: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaf-generic.c: New
	file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fma-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaf-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-generic.c: New file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-generic.c: New file.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-11-29 08:18:00 -02:00
Adhemerval Zanella
6905656404 sparc: Implement memset/bzero ifunc selection in C
This patch refactor the SPARC64 ifunc selector to a C implementation.
No functional change is expected, including ifunc resolution rules.

Checked on sparc64-linux-gnu, sparcv9-linux-gnu and x86_64-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
	[$(subdir) = string] (sysdep_routines): Add memset-ultra1.
	* sysdeps/sparc/sparc64/multiarch/Makefile [$(subdir) = string]
	(sysdep_routines): Add memset-ultra1.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memset.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c: Likewise.
	* sysdeps/sparc/sparc64/multiarch/ifunc-memset.h: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memset-ultra1.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memset.c: Likewise.
	* sysdeps/sparc/sparc64/multiarch/bzero.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memset.S: Remove file.
	* sysdeps/sparc/sparc64/multiarch/memset.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-11-28 08:53:15 -02:00
Adhemerval Zanella
88684de7a6 sparc: Implement memcpy/mempcpy ifunc selection in C
This patch refactor the SPARC64 ifunc selector to a C implementation.
The x86_64 implementation is used as default, which resulted in common
definitions (ifunc-init.h) used on both architectures.  No functional
change is expected, including ifunc resolution rules.

Checked on sparc64-linux-gnu, sparcv9-linux-gnu and x86_64-linux-gnu.

	* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy-ultra1.S: New
	file.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/mempcpy.c: Likewise.
	* sysdeps/sparc/sparc64/multiarch/ifunc-memcpy.h: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy-ultra1.S: Likewise.
	* sysdeps/sparc/sparc64/multiarch/memcpy.c: Likewise.
	* sysdeps/sparc/sparc64/multiarch/mempcpy.c: Likewise.
	* sysdeps/sparc/sparc-ifunc.h (sparc_libc_ifunc_redirected): New
	macro.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
	[$(subdir) = string] (sysdep_routines): Add memcpy-ultra1.
	* sysdeps/sparc/sparc64/multiarch/Makefile [$(subdir) = string]
	(sysdep_routines): Add memcpy-ultra1.
	* sysdeps/sparc/sparc64/multiarch/memcpy.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy.S: Likewise.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-11-28 08:53:15 -02:00
Joseph Myers
32d372d548 Restore sparc32 copysignl, fabsl, fmal compat symbols (bug 22229).
32-bit SPARC libm should have compat symbols for copysignl
(GLIBC_2.0), fabsl (GLIBC_2.0), fmal (GLIBC_2.1), pointing to the
double functions; they were present in glibc 2.8, for example, but are
now missing, probably when optimized SPARC function implementations
were added without appropriate compat symbol handling.  The same
applies to copysignl in libc.  This patch restores those compat
symbols.

Tested with build-many-glibcs.py for sparcv9-linux-gnu.

	[BZ #22229]
	* sysdeps/sparc/sparc32/fpu/s_copysign.S: Include
	<math_ldbl_opt.h>
	(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
	and libc.
	* sysdeps/sparc/sparc32/fpu/s_fabs.S: Include <math_ldbl_opt.h>.
	(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
	* sysdeps/sparc/sparc32/fpu/s_fma.c: Include <math_ldbl_opt.h>.
	(fmal): Define as compat symbol at version GLIBC_2_1 for libm.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S:
	Include <math_ldbl_opt.h>
	(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
	and libc.
	(compat_symbol): Undefine and redefine.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include
	<math_ldbl_opt.h>
	(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
	(compat_symbol): Undefine and redefine.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c
	[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h>.
	[HAVE_AS_VIS3_SUPPORT] (fmal): Define as compat symbol at version
	GLIBC_2_1 for libm.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Add
	GLIBC_2.0 copysignl symbol.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
	GLIBC_2.0 copysignl and fabsl and GLIBC_2.1 fmal symbols.
2017-10-04 16:46:05 +00:00
Joseph Myers
620ff9eea6 Define and use libm_alias_double.
Continuing the process of setting up common macros for libm function
aliases, with a view to using them to define _FloatN / _FloatNx
aliases in future, this patch adds a libm_alias_double macro and uses
it in the type-generic templates.

This macro handles defining aliases for double, and for long double in
the NO_LONG_DOUBLE case.  It also handles defining compat symbols for
long double = double for architectures that changed their long double
format.  By so doing, it eliminates the need for the
M_LIBM_NEED_COMPAT and declare_mgen_libm_compat macros; the single
declare_mgen_alias call in each template now suffices to define all
required compat symbols.  When used for more double functions (not
based on type-generic templates), I expect it will eliminate the need
for most ldbl-opt wrappers for such functions.

A few special cases are needed.  __clog10l is a public symbol (for
historical reasons) so needs to be given appropriate compat versions
for architectures that changed their long double format, but is not
defined as an alias using the normal macros since __clog10* are *not*
public symbols for _FloatN / _FloatNx types.  For scalbn, scalbln and
log1p, the changes adding errno setting support for those functions
left compat symbols pointing directly to the non-errno-setting
implementations.  There is no requirement for the compat symbols not
to set errno; that just made for the simplest patches at that time.
Now, with these common macros, it's natural to redirect the compat
symbols to the errno-setting wrappers, which I intend to do in a
separate patch.

Tested for x86_64, and with build-many-glibcs.py.  For ldbl-opt
platforms the stripped libm.so binaries are changed (disassembly
unchanged) because the details of how the clog10l compat symbol is
created mean it ceases to be weak as it was before; for other
platforms, stripped libm.so binaries are unchanged.

2017-09-13  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/generic/libm-alias-double.h: New file.
	* sysdeps/ieee754/ldbl-opt/libm-alias-double.h: Likewise.
	* sysdeps/generic/math-type-macros-double.h: Include
	<libm-alias-double.h>.
	[declare_mgen_alias] (declare_mgen_alias): Define to use
	libm_alias_double.
	* sysdeps/generic/math-type-macros.h [!M_LIBM_NEED_COMPAT]
	(M_LIBM_NEED_COMPAT): Remove macro.
	[!M_LIBM_NEED_COMPAT] (declare_mgen_libm_compat): Likewise.
	* sysdeps/ieee754/ldbl-opt/math-type-macros-double.h: Remove.
	* math/cabs_template.c [M_LIBM_NEED_COMPAT]: Remove conditional
	code.
	* math/carg_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/cimag_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/conj_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/creal_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_cacos_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_cacosh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_casin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_casinh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_catan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_catanh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_ccos_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_ccosh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_cexp_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_clog10_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_clog_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_cpow_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_cproj_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_csin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_csinh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_csqrt_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_ctan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_ctanh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_fdim_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_fmax_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_fmin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/s_nan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* math/w_ilogb_template.c [M_LIBM_NEED_COMPAT]: Likewise.
	* sysdeps/ieee754/ldbl-opt/s_clog10.c: New file.
	* sysdeps/ieee754/ldbl-opt/s_ldexp.c (M_LIBM_NEED_COMPAT): Remove
	macro.
	(declare_mgen_alias): New macro.
	* sysdeps/ieee754/ldbl-opt/w_log1p.c: New file.
	* sysdeps/ieee754/ldbl-opt/w_scalbln.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.c
	(M_LIBM_NEED_COMPAT): Remove macro.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c
	[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h> and
	<first-versions.h>.
	[HAVE_AS_VIS3_SUPPORT && LONG_DOUBLE_COMPAT (libm,
	FIRST_VERSION_libm_fdiml)]: Define fdiml as compat symbol.
2017-09-13 01:13:30 +00:00
Joseph Myers
831bbd5527 Remove SPARC sqrt wrappers (bug 21973).
This patch removes the SPARC-specific wrappers for sqrt and sqrtf.

These wrappers, by adding architecture-specific uses of _LIB_VERSION
and __kernel_standard, unnecessarily complicate cleanups of libm error
handling.  They also do not serve a useful optimization purpose.  GCC
knows about sqrt as a built-in function, and can generate direct calls
to a hardware square root instruction, either on its own, in the
-fno-math-errno case, or together with an inline check for the
argument being negative and a call to the out-of-line sqrt function
for error handling only in that case (and has been able to do so for a
long time).  Thus in practice the wrapper will only be called only in
the case of negative arguments, which is not a case it is useful to
optimize for.

The removal of the wrappers also uncovers, and fixes, an old bug.
32-bit SPARC libm used (checked with glibc 2.8 binaries) to have a
sqrtl compat symbol, version GLIBC_2.0, for old binaries when sqrtl
was an alias of sqrt (I don't have pre-glibc-2.4 binaries for SPARC to
hand to check for the sqrtl symbol in those).  This disappeared,
probably with:

commit 8847f03770
Author: David S. Miller <davem@davemloft.net>
Date:   Tue Feb 28 22:37:58 2012 -0800

    Add sparc optimized sqrt{,f}.

Removing the wrappers brings back the generic ldbl-opt logic for
creating such compat symbols, and so restores the compat symbol that
should be there.  This could of course also be fixed in the wrappers -
but as noted above, the wrappers are optimizing a case it's not useful
to optimize, so the bug of the missing compat symbol serves to
illustrate the risks involved with the extra complexity of
architecture-specific function versions where not needed.

Tested with build-many-glibcs.py.

	[BZ #21973]
	* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Remove file.
	* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S : Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
	GLIBC_2.0 sqrtl symbol.
2017-08-21 17:46:34 +00:00
Joseph Myers
813378e9fe Obsolete matherr, _LIB_VERSION, libieee.a.
This patch obsoletes support for SVID libm error handling (the system
where a user-defined function matherr is called on a libm function
error; only enabled if you also set _LIB_VERSION = _SVID_ or
_LIB_VERSION = _XOPEN_) and the use of the _LIB_VERSION global
variable to control libm error handling.  matherr and _LIB_VERSION are
made into compat symbols, not supported for new ports or for static
linking.  The libieee.a object file (which sets _LIB_VERSION = _IEEE_,
so disabling errno setting for some functions) is also removed, and
all the related definitions are removed from math.h.

The manual already recommends against using matherr, and it's already
not supported for _Float128 functions (those use new wrappers that
don't support matherr, only errno) - this patch means that it becomes
possible to e.g. add sinf32 as an alias to sinf without that resulting
in undesired matherr support in sinf32 for existing glibc ports.
matherr support is not part of any standard supported by glibc (it was
removed in XPG4).

Because matherr is a function to be defined by the user, of course
user programs defining such a function will still continue to link; it
just quietly won't be used.  If they try to write to the library's
copy of _LIB_VERSION to enable SVID error handling, however, they will
get a link error (but if they define their own _LIB_VERSION variable,
they won't).

I expect the most likely case of build failures from this patch to be
programs with unconditional cargo-culted uses of -lieee (based on a
notion of "I want IEEE floating point", not any actual requirement for
that library).

Ideally, the new-port-or-static-linking case would use the new
wrappers used for _Float128.  This is not implemented in this patch,
because of the complication of architecture-specific (powerpc32 and
sparc) sqrt wrappers that use _LIB_VERSION and __kernel_standard
directly.  Thus, the old wrappers and __kernel_standard are still
built unconditionally, and _LIB_VERSION still exists in static libm.
But when the old wrappers and __kernel_standard are built in the
non-compat case, _LIB_VERSION and matherr are defined as macros so
code to support those features isn't actually built into static libm
or new ports' shared libm after this patch.

I intend to move to the new wrappers for static libm and new ports in
followup patches.  I believe the sqrt wrappers for powerpc32 and sparc
can reasonably be removed.  GCC already optimizes the normal case of
sqrt by generating code that uses a hardware instruction and only
calls the sqrt function if the argument was negative (if
-fno-math-errno, of course, it just uses the hardware instruction
without any check for negative argument being needed).  Thus those
wrappers will only actually get called in the case of negative
arguments, which is not a case it makes sense to optimize for.  But
even without removing the powerpc32 and sparc wrappers it should still
be possible to move to the new wrappers for static libm and new ports,
just without having those dubious architecture-specific optimizations
in static libm.

Everything said about matherr equally applies to matherrf and matherrl
(IA64-specific, undocumented), except that the structure of IA64 libm
means it won't be converted to using the new wrappers (it doesn't use
the old ones either, but its own error-handling code instead).

As with other tests of compat symbols, I expect test-matherr and
test-matherr-2 to need to become appropriately conditional once we
have a system for disabling such tests for ports too new to have the
relevant symbols.

Tested for x86_64 and x86, and with build-many-glibcs.py.

	* math/math.h [__USE_MISC] (_LIB_VERSION_TYPE): Remove.
	[__USE_MISC] (_LIB_VERSION): Likewise.
	[__USE_MISC] (struct exception): Likewise.
	[__USE_MISC] (matherr): Likewise.
	[__USE_MISC] (DOMAIN): Likewise.
	[__USE_MISC] (SING): Likewise.
	[__USE_MISC] (OVERFLOW): Likewise.
	[__USE_MISC] (UNDERFLOW): Likewise.
	[__USE_MISC] (TLOSS): Likewise.
	[__USE_MISC] (PLOSS): Likewise.
	[__USE_MISC] (HUGE): Likewise.
	[__USE_XOPEN] (MAXFLOAT): Define even if [__USE_MISC].
	* math/math-svid-compat.h: New file.
	* conform/linknamespace.pl (@whitelist): Remove matherr, matherrf
	and matherrl.
	* include/math.h [!_ISOMAC] (__matherr): Remove.
	* manual/arith.texi (FP Exceptions): Do not document matherr.
	* math/Makefile (tests): Change test-matherr to test-matherr-3.
	(tests-internal): New variable.
	(install-lib): Do not add libieee.a.
	(non-lib.a): Likewise.
	(extra-objs): Do not add libieee.a and ieee-math.o.
	(CPPFLAGS-s_lib_version.c): Remove variable.
	($(objpfx)libieee.a): Remove rule.
	($(addprefix $(objpfx), $(tests-internal)): Depend on $(libm).
	* math/ieee-math.c: Remove.
	* math/libm-test-support.c (matherr): Remove.
	* math/test-matherr.c: Use <support/test-driver.c>.  Add copyright
	and license notices.  Include <math-svid-compat.h> and
	<shlib-compat.h>.
	(matherr): Undefine as macro.  Use compat_symbol_reference.
	(_LIB_VERSION): Likewise.
	* math/test-matherr-2.c: New file.
	* math/test-matherr-3.c: Likewise.
	* sysdeps/generic/math_private.h (__kernel_standard): Remove
	declaration.
	(__kernel_standard_f): Likewise.
	(__kernel_standard_l): Likewise.
	* sysdeps/ieee754/s_lib_version.c: Do not include <math.h> or
	<math_private.h>.  Include <math-svid-compat.h>.
	(_LIB_VERSION): Undefine as macro.
	(_LIB_VERSION_INTERNAL): Always initialize to _POSIX_.  Define
	only if [LIBM_SVID_COMPAT || !defined SHARED].  If
	[LIBM_SVID_COMPAT], use compat_symbol.
	* sysdeps/ieee754/s_matherr.c: Do not include <math.h> or
	<math_private.h>.  Include <math-svid-compat.h>.
	(matherr): Undefine as macro.
	(__matherr): Define only if [LIBM_SVID_COMPAT].  Use
	compat_symbol.
	* sysdeps/ia64/fpu/libm_error.c: Include <math-svid-compat.h>.
	[_LIBC && LIBM_SVID_COMPAT] (matherrf): Use
	compat_symbol_reference.
	[_LIBC && LIBM_SVID_COMPAT] (matherrl): Likewise.
	[_LIBC && !LIBM_SVID_COMPAT] (matherrf): Define as macro.
	[_LIBC && !LIBM_SVID_COMPAT] (matherrl): Likewise.
	* sysdeps/ia64/fpu/libm_support.h: Include <math-svid-compat.h>.
	(MATHERR_D): Remove declaration.
	[!_LIBC] (_LIB_VERSION_TYPE): Likewise
	[!LIBM_BUILD] (_LIB_VERSIONIMF): Likewise.
	[LIBM_BUILD] (pmatherrf): Likewise.
	[LIBM_BUILD] (pmatherr): Likewise.
	[LIBM_BUILD] (pmatherrl): Likewise.
	(DOMAIN): Likewise.
	(SING): Likewise.
	(OVERFLOW): Likewise.
	(UNDERFLOW): Likewise.
	(TLOSS): Likewise.
	(PLOSS): Likewise.
	* sysdeps/ia64/fpu/s_matherrf.c: Include <math-svid-compat.h>.
	(__matherrf): Define only if [LIBM_SVID_COMPAT].  Use
	compat_symbol.
	* sysdeps/ia64/fpu/s_matherrl.c: Include <math-svid-compat.h>.
	(__matherrl): Define only if [LIBM_SVID_COMPAT].  Use
	compat_symbol.
	* math/lgamma-compat.h: Include <math-svid-compat.h>.
	* math/w_acos_compat.c: Likewise.
	* math/w_acosf_compat.c: Likewise.
	* math/w_acosh_compat.c: Likewise.
	* math/w_acoshf_compat.c: Likewise.
	* math/w_acoshl_compat.c: Likewise.
	* math/w_acosl_compat.c: Likewise.
	* math/w_asin_compat.c: Likewise.
	* math/w_asinf_compat.c: Likewise.
	* math/w_asinl_compat.c: Likewise.
	* math/w_atan2_compat.c: Likewise.
	* math/w_atan2f_compat.c: Likewise.
	* math/w_atan2l_compat.c: Likewise.
	* math/w_atanh_compat.c: Likewise.
	* math/w_atanhf_compat.c: Likewise.
	* math/w_atanhl_compat.c: Likewise.
	* math/w_cosh_compat.c: Likewise.
	* math/w_coshf_compat.c: Likewise.
	* math/w_coshl_compat.c: Likewise.
	* math/w_exp10_compat.c: Likewise.
	* math/w_exp10f_compat.c: Likewise.
	* math/w_exp10l_compat.c: Likewise.
	* math/w_exp2_compat.c: Likewise.
	* math/w_exp2f_compat.c: Likewise.
	* math/w_exp2l_compat.c: Likewise.
	* math/w_fmod_compat.c: Likewise.
	* math/w_fmodf_compat.c: Likewise.
	* math/w_fmodl_compat.c: Likewise.
	* math/w_hypot_compat.c: Likewise.
	* math/w_hypotf_compat.c: Likewise.
	* math/w_hypotl_compat.c: Likewise.
	* math/w_j0_compat.c: Likewise.
	* math/w_j0f_compat.c: Likewise.
	* math/w_j0l_compat.c: Likewise.
	* math/w_j1_compat.c: Likewise.
	* math/w_j1f_compat.c: Likewise.
	* math/w_j1l_compat.c: Likewise.
	* math/w_jn_compat.c: Likewise.
	* math/w_jnf_compat.c: Likewise.
	* math/w_jnl_compat.c: Likewise.
	* math/w_lgamma_main.c: Likewise.
	* math/w_lgamma_r_compat.c: Likewise.
	* math/w_lgammaf_main.c: Likewise.
	* math/w_lgammaf_r_compat.c: Likewise.
	* math/w_lgammal_main.c: Likewise.
	* math/w_lgammal_r_compat.c: Likewise.
	* math/w_log10_compat.c: Likewise.
	* math/w_log10f_compat.c: Likewise.
	* math/w_log10l_compat.c: Likewise.
	* math/w_log2_compat.c: Likewise.
	* math/w_log2f_compat.c: Likewise.
	* math/w_log2l_compat.c: Likewise.
	* math/w_log_compat.c: Likewise.
	* math/w_logf_compat.c: Likewise.
	* math/w_logl_compat.c: Likewise.
	* math/w_pow_compat.c: Likewise.
	* math/w_powf_compat.c: Likewise.
	* math/w_powl_compat.c: Likewise.
	* math/w_remainder_compat.c: Likewise.
	* math/w_remainderf_compat.c: Likewise.
	* math/w_remainderl_compat.c: Likewise.
	* math/w_scalb_compat.c: Likewise.
	* math/w_scalbf_compat.c: Likewise.
	* math/w_scalbl_compat.c: Likewise.
	* math/w_sinh_compat.c: Likewise.
	* math/w_sinhf_compat.c: Likewise.
	* math/w_sinhl_compat.c: Likewise.
	* math/w_sqrt_compat.c: Likewise.
	* math/w_sqrtf_compat.c: Likewise.
	* math/w_sqrtl_compat.c: Likewise.
	* math/w_tgamma_compat.c: Likewise.
	* math/w_tgammaf_compat.c: Likewise.
	* math/w_tgammal_compat.c: Likewise.
	* sysdeps/ieee754/dbl-64/w_exp_compat.c: Likewise.
	* sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise.
	* sysdeps/ieee754/k_standard.c: Likewise.
	* sysdeps/ieee754/k_standardf.c: Likewise.
	* sysdeps/ieee754/k_standardl.c: Likewise.
	* sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise.
	* sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt_compat.S: Likewise.
	* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrt_compat.S: Likewise.
	* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Likewise.
	* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
2017-08-21 17:45:10 +00:00
Stefan Liebler
12d2dd7060 Optimize generic spinlock code and use C11 like atomic macros.
This patch optimizes the generic spinlock code.

The type pthread_spinlock_t is a typedef to volatile int on all archs.
Passing a volatile pointer to the atomic macros which are not mapped to the
C11 atomic builtins can lead to extra stores and loads to stack if such
a macro creates a temporary variable by using "__typeof (*(mem)) tmp;".
Thus, those macros which are used by spinlock code - atomic_exchange_acquire,
atomic_load_relaxed, atomic_compare_exchange_weak - have to be adjusted.
According to the comment from  Szabolcs Nagy, the type of a cast expression is
unqualified (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_423.htm):
__typeof ((__typeof (*(mem)) *(mem)) tmp;
Thus from spinlock perspective the variable tmp is of type int instead of
type volatile int.  This patch adjusts those macros in include/atomic.h.
With this construct GCC >= 5 omits the extra stores and loads.

The atomic macros are replaced by the C11 like atomic macros and thus
the code is aligned to it.  The pthread_spin_unlock implementation is now
using release memory order instead of sequentially consistent memory order.
The issue with passed volatile int pointers applies to the C11 like atomic
macros as well as the ones used before.

I've added a glibc_likely hint to the first atomic exchange in
pthread_spin_lock in order to return immediately to the caller if the lock is
free.  Without the hint, there is an additional jump if the lock is free.

I've added the atomic_spin_nop macro within the loop of plain reads.
The plain reads are also realized by C11 like atomic_load_relaxed macro.

The new define ATOMIC_EXCHANGE_USES_CAS determines if the first try to acquire
the spinlock in pthread_spin_lock or pthread_spin_trylock is an exchange
or a CAS.  This is defined in atomic-machine.h for all architectures.

The define SPIN_LOCK_READS_BETWEEN_CMPXCHG is now removed.
There is no technical reason for throwing in a CAS every now and then,
and so far we have no evidence that it can improve performance.
If that would be the case, we have to adjust other spin-waiting loops
elsewhere, too!  Using a CAS loop without plain reads is not a good idea
on many targets and wasn't used by one.  Thus there is now no option to
do so.

Architectures are now using the generic spinlock automatically if they
do not provide an own implementation.  Thus the pthread_spin_lock.c files
in sysdeps folder are deleted.

ChangeLog:

	* NEWS: Mention new spinlock implementation.
	* include/atomic.h:
	(__atomic_val_bysize): Cast type to omit volatile qualifier.
	(atomic_exchange_acq): Likewise.
	(atomic_load_relaxed): Likewise.
	(ATOMIC_EXCHANGE_USES_CAS): Check definition.
	* nptl/pthread_spin_init.c (pthread_spin_init):
	Use atomic_store_relaxed.
	* nptl/pthread_spin_lock.c (pthread_spin_lock):
	Use C11-like atomic macros.
	* nptl/pthread_spin_trylock.c (pthread_spin_trylock):
	Likewise.
	* nptl/pthread_spin_unlock.c (pthread_spin_unlock):
	Use atomic_store_release.
	* sysdeps/aarch64/nptl/pthread_spin_lock.c: Delete File.
	* sysdeps/arm/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/hppa/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/m68k/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/microblaze/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/mips/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/nios2/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/aarch64/atomic-machine.h (ATOMIC_EXCHANGE_USES_CAS): Define.
	* sysdeps/alpha/atomic-machine.h: Likewise.
	* sysdeps/arm/atomic-machine.h: Likewise.
	* sysdeps/i386/atomic-machine.h: Likewise.
	* sysdeps/ia64/atomic-machine.h: Likewise.
	* sysdeps/m68k/coldfire/atomic-machine.h: Likewise.
	* sysdeps/m68k/m680x0/m68020/atomic-machine.h: Likewise.
	* sysdeps/microblaze/atomic-machine.h: Likewise.
	* sysdeps/mips/atomic-machine.h: Likewise.
	* sysdeps/powerpc/powerpc32/atomic-machine.h: Likewise.
	* sysdeps/powerpc/powerpc64/atomic-machine.h: Likewise.
	* sysdeps/s390/atomic-machine.h: Likewise.
	* sysdeps/sparc/sparc32/atomic-machine.h: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: Likewise.
	* sysdeps/sparc/sparc64/atomic-machine.h: Likewise.
	* sysdeps/tile/tilegx/atomic-machine.h: Likewise.
	* sysdeps/tile/tilepro/atomic-machine.h: Likewise.
	* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: Likewise.
	* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: Likewise.
	* sysdeps/unix/sysv/linux/sh/atomic-machine.h: Likewise.
	* sysdeps/x86_64/atomic-machine.h: Likewise.
2017-06-06 09:41:56 +02:00
Adhemerval Zanella
bdc543e338 sparc: Fix .udiv plt on libc
With the removal of divdi3 object from sparcv9-linux-gnu build, its
definition came from libgcc and its functions internall calls .udiv.
Since glibc also exports these symbols for compatibility reasons, it
will end up creating PLT calls internally in libc.so.

To avoid it, this patch uses the linker option --wrap to replace all
the internal libc.so .udiv calls to the wrapper __wrap_.udiv. Along
with strong alias in the udiv implementations, it makes linker do
local calls.

Checked on sparcv9-linux-gnu.

	* sysdeps/sparc/sparc32/Makefile (libc.so-gnulib): New rule.
	* sysdeps/sparc/sparc32/sparcv8/udiv.S (.udiv): Make a strong_alias
	to __wrap_.udiv.
	* sysdeps/sparc/sparc32/sparcv9/udiv.S (.udiv): Likewise.
	* sysdeps/sparc/sparc32/udiv.S (.udiv): Likewise.
2017-04-06 15:14:44 -03:00
David S. Miller
33d7e138ca sparc: Remove optimized math routines which cause testsuite failures.
famx{,f}/fmin{,f} and 32-bit lrint cause math testsuite failures
either because they generate incorrect results or they fail to signal
the proper exceptions.

	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmax-vis3.S: Remove file.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmax.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaxf-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaxf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmin-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fmin.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fminf-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_fminf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/Makefile
	(libm-sysdep_routines): Update.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fmax.S: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fmaxf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fmin.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fminf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_lrint.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_fmax.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_fmaxf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_fmin.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_fminf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmax-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmax.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaxf-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaxf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmin-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmin.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fminf-vis3.S:
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fminf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	(libm-sysdep_routines): Update.
2017-02-03 17:55:25 -08:00
Gabriel F. T. Gomes
f67d78192c Move wrappers to libm-compat-calls-auto
This commit moves one step towards the deprecation of wrappers that
use _LIB_VERSION / matherr / __kernel_standard functionality, by
adding the suffix '_compat' to their filenames and adjusting Makefiles
and #includes accordingly.

New template wrappers that do not use such functionality will be added
by future patches and will be first used by the float128 wrappers.
2017-01-04 16:25:04 -02:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Florian Weimer
8837917cf1 Remove remnants of .og patterns
This was used by --enable-omitfp, and the bulk of it was removed in this
commit:

commit bdeba1354b
Author: Ulrich Drepper <drepper@gmail.com>
Date:   Sat Jan 7 11:29:31 2012 -0500

    Remove --enable-omitfp support
2016-09-20 12:18:13 +02:00
Adhemerval Zanella
09cb278539 nptl: Consolidate sem_init implementations
Current sparc32 sem_init and default one only differ on sem.newsem.pad
initialization.  This patch removes sparc32 and sparc32v9 sem_init arch
specific implementation and set sparc32 to use nptl default one.
The default implementation sets the required sem.newsem.pad to 0 (which
is ununsed in other architectures).

I checked on i686 and a sparc32v9 build.

        * nptl/sem_init.c (sem_init): Init pad value to 0.
        * sysdeps/sparc/sparc32/sem_init.c: Remove file.
        * sysdeps/sparc/sparc32/sparcv9/sem_init.c: Likewise.
2016-09-15 16:31:50 -03:00
Adhemerval Zanella
49ad334ab1 nptl: Remove sparc sem_wait
This patch removes the sparc32 sem_wait.c implementation since it is
identical to default nptl one.  The sparcv9 is no longer required with
the removal.

Checked with a sparcv9 build.

	* sysdeps/sparc/sparc32/sem_wait.c: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/sem_wait.c: Likewise.
2016-09-15 11:14:30 -03:00
Adhemerval Zanella
980d25d53e nptl: Consolidate sem_open implementations
Current sparc32 sem_open and default one only differ on:

  1. Default one contains a 'futex_supports_pshared' check.
  2. sem.newsem.pad is initialized to zero.

This patch removes sparc32 and sparc32v9 sem_open arch specific
implementation and instead set sparc32 to use nptl default one.
Using 1. is fine since it should always evaluate 0 for Linux
(an optimized away by the compiler). Adding 2. to default
implementation should be ok since 'pad' field is used mainly
on sparc32 code.

I checked on i686 and checked a sparc32v9 build.

	* nptl/sem_open.c (sem_open): Init pad value to 0.
	* sysdeps/sparc/sparc32/sem_open.c: Remove file.
	* sysdeps/sparc/sparc32/sparcv9/sem_open.c: Likewise.
2016-09-15 11:13:10 -03:00
Paul E. Murphy
7b7c39450b Make common fdim implementation generic.
The only difference is the usage of math_narrow_eval when
building s_fdiml.c.  This should be harmless for long double,
but I did observe some code generation changes on m68k, but
lack the resources to test it.

Likewise, to more easily support overriding symbol generation,
the aliasing macros are always conditionally defined on their
absence to reduce boilerplate.

I also ran builds for i486, ppc64, sparcv9, aarch64,
s390x and observed no changes to s_fdim* objects.
2016-09-01 09:28:05 -05:00
Paul E. Murphy
d47d27d6c0 sparcv9: Restore fdiml@GLIBC_2.1
Use s_fdim.c from sysdeps/ieee754/ldbl-opt/ instead of
math/ to ensure a compat symbol for fdiml is created.
2016-08-29 11:54:45 -05:00
Aurelien Jarno
bf79a337ec sparc32/sparcv9: add a VIS3 version of fdim
sparc32 passes floating point values in the integer registers. VIS3
instructions gives access to the movwtos instruction to directly
transfer a value from an integer register to a floating point register.
Therefore it makes sense to provide a VIS3 version consisting in the
generic version compiled with -mvis3.

Changelog:
	* math/s_fdim.c: Avoid alias renamed.
	* math/s_fdimf.c: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
	Add s_fdimf-vis3, s_fdim-vis3.
	(CFLAGS-s_fdimf-vis3.c): New. Set to -Wa,-Av9d -mvis3.
	(CFLAGS-s_fdim-vis3.c): Likewise.
	sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.c: New file.
	sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c: Likewise.
2016-08-05 22:35:01 +02:00
Aurelien Jarno
8a9f4eb958 sparc: remove fdim sparc specific implementations
The fdim and fdimf functions on sparc do not fully follow the standard
and do not set errno to ERANGE when the result overflows. Since glibc
2.24 this causes the two following tests to fail:

  Failure: fdim (max_value, -max_value): errno set to 0, expected 34 (ERANGE)
  Failure: fdim_upward (max_value, -max_value): errno set to 0, expected 34 (ERANGE)

It happens that using GCC with the generic C code generates very similar
code to the sparc specific implementations. Therefore this patches
remove them. Note it might still worth adding a vis3 specific version of
fdim on sparc32/sparcv9, this is done in a following patch to ease
backporting.

Changelog:
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
	Remove s_fdimf-vis3, s_fdim-vis3.
	* sysdeps/sparc/sparc32/fpu/s_fdim.S: Delete file.
	* sysdeps/sparc/sparc32/fpu/s_fdimf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fdim.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_fdimf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_fdim.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_fdimf.S: Likewise.
2016-08-05 22:35:01 +02:00
Aurelien Jarno
9c8addbc1c sparc: build with -mvis on sparc32/sparcv9 and sparc64
When building for sparc32/sparcv9 or sparc64, we assume that VIS
instructions are available and use them in the sparc specific assembly
code. However we do not tell GCC to use such instructions, resulting in
slightly suboptimal code.

Fix that by passing -Wa,-Av9a -mvis to GCC.

Changelog:
	* sysdeps/sparc/sparc32/sparcv9/Makefile (sysdep-CFLAGS): Add -mvis.
	* sysdeps/sparc/sparc64/Makefile (sysdep-CFLAGS): New. Define to
	-Wa,-Av9a -mvis.
2016-08-05 22:35:01 +02:00
David S. Miller
3ef3f1b93f Fix sNaN handling in nearbyint on 32-bit sparc.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
	(__nearbyint_vis3): Don't check for sNaN before float register is
	loaded with the incoming argument.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
	(__nearbyintf_vis3): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint):
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
2016-08-02 21:00:21 -07:00
Aurelien Jarno
33ae5b17cd sparc: remove ceil, floor, trunc sparc specific implementations
The ceil, floor and trunc functions on sparc do not fully follow the
standard and trigger an inexact exception when presented a value which
is not an integer. Since glibc 2.24 this causes a few tests to fail,
for instance:

  testing double (without inline functions)
  Failure: ceil (lit_pi): Exception "Inexact" set
  Failure: ceil (-lit_pi): Exception "Inexact" set
  Failure: ceil (min_subnorm_value): Exception "Inexact" set
  Failure: ceil (min_value): Exception "Inexact" set
  Failure: ceil (0.1): Exception "Inexact" set
  Failure: ceil (0.25): Exception "Inexact" set
  Failure: ceil (0.625): Exception "Inexact" set
  Failure: ceil (-min_subnorm_value): Exception "Inexact" set
  Failure: ceil (-min_value): Exception "Inexact" set
  Failure: ceil (-0.1): Exception "Inexact" set
  Failure: ceil (-0.25): Exception "Inexact" set
  Failure: ceil (-0.625): Exception "Inexact" set

I tried to fix that by using the same strategy than used on other
architectures, that is by saving the FSR register at the beginning
and restoring it at the end of the function. When doing so I noticed
a comment that this operation might be very costly, so I decided to
do some benchmarks.

The benchmarks below represent the time required to run each of the
function 60 millions of times with different input value. I have done
that in the basic V9 code, the VIS2 code, and using the default C
implementation of the libc, for both sparc32 and sparc64, on a Niagara
T1 based machine and an UltraSparc IIIi. Given I don't have access to a
more recent machine), I haven't been able to test the VIS3 version. Also
it should be noted that it doesn't make sense to do this benchmark for
V8 or earlier as in that case we use the default C implementation. The
results are available in the table below, the "+ fix" version correspond
to the one saving and restoring the FSR.

  Niagara T1 / sparc32
  --------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9          19.10    22.48    19.10    22.48    16.59    19.27
  V9 + fix    19.77    23.34    19.77    23.33    17.27    20.12
  VIS2        16.87    19.62    16.87    19.62
  VIS2 + fix  17.55    20.47    17.55    20.47
  C impl      11.39    13.80    11.40    13.80    10.88    10.84

  Niagara T1 / sparc64
  --------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9          18.14    22.23    18.14    22.23    15.64    19.02
  V9 + fix    18.82    23.08    18.82    23.08    16.32    19.87
  VIS2        15.92    19.37    15.92    19.37
  VIS2 + fix  16.59    20.22    16.59    20.22
  C impl      11.39    13.60    11.39    15.36    10.88    12.65

  UltraSparc IIIi / sparc32
  -------------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9           4.81     7.09     6.61    11.64     4.91     7.05
  V9 + fix     7.20    10.42     7.14    10.54     6.76     9.47
  VIS2         4.81     7.03     4.76     7.13
  VIS2 + fix   6.76     9.51     6.71     9.63
  C impl       3.88     8.62     3.90     9.45     3.57     6.62

  UltraSparc IIIi / sparc64
  -------------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9           3.48     4.39     3.48     4.41     3.01     3.85
  V9 + fix     4.76     5.90     4.76     5.90     4.86     6.26
  VIS2         2.95     3.61     2.95     3.61
  VIS2 + fix   4.24     5.37     4.30     7.97
  C impl       3.63     4.89     3.62     6.38     3.33     4.03

The first thing that should be noted is that the C implementation is
always faster on the Niagara T1 based machine. On the UltraSparc IIIi
the float version on sparc32 is also faster.

Coming back about the fix saving and restoring the FSR, it appears
it has a big impact as expected. In that case the C implementation is
always faster than the fixed implementations.

This patch therefore removes the sparc specific implementations in
favor of the generic ones.

Changelog:
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
	[$(subdir) = math] (libm-sysdep_routines): Remove.
	[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
	Remove s_ceilf-vis3, s_ceil-vis3, s_floorf-vis3, s_floor-vis3,
	s_truncf-vis3, s_trunc-vis3.
	* sysdeps/sparc/sparc64/fpu/multiarch/Makefile: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil-vis2.S: Delete
	file.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf-vis2.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor-vis2.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf-vis2.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_trunc-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_trunc.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_truncf-vis3.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_truncf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_ceil.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_ceilf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_floor.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_floorf.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_trunc.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_truncf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis2.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis2.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis2.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis2.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-vis3.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_ceil.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_ceilf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_floor.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_floorf.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_trunc.S: Likewise.
	* sysdeps/sparc/sparc64/fpu/s_truncf.S: Likewise.
2016-08-02 02:07:20 +02:00
Aurelien Jarno
2cbec36566 SPARC: fix nearbyint on sNaN input
nearbyint and nearbyintf should not trigger inexact exceptions, but
should still trigger an invalid exception for a sNaN input.

The SPARC specific implementations of these functions save the FSR at
the beginning of the function and restore it at the end to not trigger
an inexact exception. This however doesn't work for an sNaN input which
need to trigger an invalid exception. Fix that by adding a fcmp
instruction using the input value before saving FSR, so that an invalid
exception is triggered for a sNaN input.

This fixes the math/test-nearbyint-except test on SPARC.

Changelog:
	* sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint): Trigger an
	invalid exception for a sNaN input.
	* sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
	* sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
	(__nearbyint_vis3): Likewise
	* sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
	(__nearbyintf_vis3): Likewise
	* sparc/sparc64/fpu/s_nearbyint.S (__nearbyint): Likewise.
	* sparc/sparc64/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
	* sparc/sparc64/fpu/multiarch/s_nearbyint-vis3.S (__nearbyint_vis3):
	Likewise.
	* sparc/sparc64/fpu/multiarch/s_nearbyintf-vis3.S (__nearbyintf_vis3):
	Likewise.
2016-07-01 16:36:41 +02:00