Commit Graph

447 Commits

Author SHA1 Message Date
Paul Pluzhnikov
50d004c91c Update ulps with "make regen-ulps" on AMD Ryzen 7 1800X.
2018-05-30  Paul Pluzhnikov  <ppluzhnikov@google.com>

	* sysdeps/x86_64/fpu/libm-test-ulps (log_vlen8_avx2): Update for
	AMD Ryzen 7 1800X.
2018-05-30 09:17:47 -07:00
Wilco Dijkstra
19a8b9a300 [PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs
This series of patches removes the slow patchs from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).

ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most
of the range with mpsin and mpcos.  The number of incorrectly rounded results
(ie. ULP >0.5) is at most ~2750 per million inputs between 0.125 and 0.5,
the average is ~850 per million between 0 and PI.

Tested on AArch64 and x86_64 with no regressions.

The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction.  Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.

	* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
	inputs.
	(__cos): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.
2018-04-03 16:52:16 +01:00
Wilco Dijkstra
700593fdd7 Remove all target specific __ieee754_sqrt(f/l) inlines
Remove the now unused target specific__ieee754_sqrt(f/l) inlines.
Also remove inlines of sqrt which are for really old GCC versions.
Removing these is desirable, under the general principle of leaving
such inlining to the compiler rather than trying to do it in installed
headers, especially when only very old compilers are affected.

Note that removing inlines for __ieee754_sqrt disables inlining in the
sqrt wrapper functions.  Given the sqrt function will typically only be
called for negative arguments, it doesn't matter whether the inlining
happens or not.

	* sysdeps/aarch64/fpu/math_private.h (__ieee754_sqrt): Remove.
	(__ieee754_sqrtf): Remove.
	* sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Remove.
	(__ieee754_sqrtf): Remove.
	* sysdeps/generic/math-type-macros.h (M_SQRT): Use sqrt.
	* sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove.
	* sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrt): Remove.
	(__ieee754_sqrtf): Remove.
	* sysdeps/s390/fpu/bits/mathinline.h: Remove file.
	* sysdeps/sparc/fpu/bits/mathinline.h (sqrt) Remove.
	(sqrtf): Remove.
	(sqrtl): Remove.
	(__ieee754_sqrt): Remove.
	(__ieee754_sqrtf): Remove.
	(__ieee754_sqrtl): Remove.
	* sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove.
	* sysdeps/x86/fpu/math_private.h (__ieee754_sqrt): Remove.
	* sysdeps/x86_64/fpu/math_private.h (__ieee754_sqrt): Remove.
	(__ieee754_sqrtf): Remove.
	(__ieee754_sqrtl): Remove.
2018-03-15 19:21:36 +00:00
Wilco Dijkstra
610ee1fc93 Remove mplog and mpexp
Remove the now unused mplog and mpexp files.

	* math/Makefile: Remove mpexp.c and mplog.c
	* sysdeps/i386/fpu/mpexp.c: Delete file.
	* sysdeps/i386/fpu/mplog.c: Likewise.
	* sysdeps/ia64/fpu/mpexp.c: Likewise.
	* sysdeps/ia64/fpu/mplog.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog.
	* sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function.
	* sysdeps/ieee754/dbl-64/mpexp.c: Delete file.
	* sysdeps/ieee754/dbl-64/mplog.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/mplog.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog*.
	* sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file.
	* sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.
2018-02-15 12:41:05 +00:00
Szabolcs Nagy
de800d8305 Remove slow paths from exp
Remove the __slowexp code, so exp is no longer correctly rounded.  The
result is computed to about 70 bits precision so the worst case ulp
error is about 0.500007 in nearest rounding mode.

	* manual/probes.texi: Remove slowexp probes.
	* math/Makefile: Remove slowexp.
	* sysdeps/generic/math_private.h (__slowexp): Remove.
	* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and
	document error bounds.
	* sysdeps/i386/fpu/slowexp.c: Remove.
	* sysdeps/ia64/fpu/slowexp.c: Remove.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove.
	* sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Remove.
	* sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.
2018-02-12 11:33:33 +00:00
Wilco Dijkstra
c3d466cba1 Remove slow paths from pow
Remove the slow paths from pow.  Like several other double precision math
functions, pow is exactly rounded.  This is not required from math functions
and causes major overheads as it requires multiple fallbacks using higher
precision arithmetic if a result is close to 0.5ULP.  Ridiculous slowdowns
of up to 100000x have been reported when the highest precision path triggers.

All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1).
The worst case error is ~0.506ULP.  A simple test over a few hundred million
values shows pow is 10% faster on average.  This fixes BZ #13932.

	[BZ #13932]
	* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
	* benchtests/pow-inputs: Update comment for slow path cases.
	* manual/probes.texi (slowpow_p10): Delete removed probe.
	(slowpow_p10): Likewise.
	* math/Makefile: Remove halfulp.c and slowpow.c.
	* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/generic/math_private.h (__exp1): Remove error argument.
	(__halfulp): Remove.
	(__slowpow): Remove.
	* sysdeps/i386/fpu/halfulp.c: Delete file.
	* sysdeps/i386/fpu/slowpow.c: Likewise.
	* sysdeps/ia64/fpu/halfulp.c: Likewise.
	* sysdeps/ia64/fpu/slowpow.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
	improve comments and add error analysis.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
	(power1): Remove function:
	(log1): Remove error argument, add error analysis.
	(my_log2): Remove function.
	* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
	* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
	* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
	* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
	slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
2018-02-12 10:47:09 +00:00
H.J. Lu
c70e4e9c9e x86-64: Add sincosf with vector FMA
Since the x86-64 assembly version of sincosf is higly optimized with
vector instructions, there isn't much room for improvement.  However
s_sincosf.c written in C with vector math and intrinsics can be
optimized by GCC with FMA.

On Skylake, bench-sincosf reports performance improvement:

           Assembly       FMA         improvement
max        104.042       101.008         3%
min        9.426         8.586           10%
mean       20.6209       18.2238         13%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_sincosf-sse2 and s_sincosf-fma.
	(CFLAGS-s_sincosf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
	* sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
	__sincosf is defined.
2018-01-08 08:04:40 -08:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Joseph Myers
f1e005022e Revert exp reimplementation (causes test failures).
Revert:

	2017-12-19  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/x86_64/fpu/libm-test-ulps: Update.

	2017-12-19  Patrick McGehearty  <patrick.mcgehearty@oracle.com>

	* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and
	<errno.h>.  Include "eexp.tbl".
	(half): New constant.
	(one): Likewise.
	(__ieee754_exp): Rewrite.
	(__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
	* sysdeps/i386/fpu/slowexp.c: Likewise.
	* sysdeps/ia64/fpu/slowexp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
	* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in
	comment.
	* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math]
	(CPPFLAGS-slowexp.c): Remove variable.
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
	(CFLAGS-slowexp-fma.c): Remove variable.
	(CFLAGS-slowexp-fma4.c): Likewise.
	(CFLAGS-slowexp-avx.c): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not
	define as macro.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
	* math/Makefile (type-double-routines): Remove slowexp.
	* manual/probes.texi (slowexp_p6): Remove.
	(slowexp_p32): Likewise.
2017-12-19 18:11:37 +00:00
Joseph Myers
6f58c10ded Update x86_64 libm-test-ulps.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2017-12-19 17:38:41 +00:00
Patrick McGehearty
6fd0a3c6a8 Improve __ieee754_exp() performance by greater than 5x on sparc/x86.
These changes will be active for all platforms that don't provide
their own exp() routines. They will also be active for ieee754
versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and
erf.

Typical performance gains is typically around 5x when measured on
Sparc s7 for common values between exp(1) and exp(40).

Using the glibc perf tests on sparc,
      sparc (nsec)    x86 (nsec)
      old     new     old     new
max   17629   395    5173     144
min     399    54      15      13
mean   5317   200    1349      23

The extreme max times for the old (ieee754) exp are due to the
multiprecision computation in the old algorithm when the true value is
very near 0.5 ulp away from an value representable in double
precision. The new algorithm does not take special measures for those
cases. The current glibc exp perf tests overrepresent those values.
Informal testing suggests approximately one in 200 cases might
invoke the high cost computation. The performance advantage of the new
algorithm for other values is still large but not as large as indicated
by the chart above.

Glibc correctness tests for exp() and expf() were run. Within the
test suite 3 input values were found to cause 1 bit differences (ulp)
when "FE_TONEAREST" rounding mode is set. No differences in exp() were
seen for the tested values for the other rounding modes.
Typical example:
exp(-0x1.760cd2p+0)  (-1.46113312244415283203125)
 new code:    2.31973271630014299393707e-01   0x1.db14cd799387ap-3
 old code:    2.31973271630014271638132e-01   0x1.db14cd7993879p-3
    exp    =  2.31973271630014285508337 (high precision)
Old delta: off by 0.49 ulp
New delta: off by 0.51 ulp

In addition, because ieee754_exp() is used by other routines, cexp()
showed test results with very small imaginary input values where the
imaginary portion of the result was off by 3 ulp when in upward
rounding mode, but not in the other rounding modes.  For x86, tgamma
showed a few values where the ulp increased to 6 (max ulp for tgamma
is 5). Sparc tgamma did not show these failures.  I presume the tgamma
differences are due to compiler optimization differences within the
gamma function.The gamma function is known to be difficult to compute
accurately.

	* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and
	<errno.h>.  Include "eexp.tbl".
	(half): New constant.
	(one): Likewise.
	(__ieee754_exp): Rewrite.
	(__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
	* sysdeps/i386/fpu/slowexp.c: Likewise.
	* sysdeps/ia64/fpu/slowexp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
	* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
	* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in
	comment.
	* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math]
	(CPPFLAGS-slowexp.c): Remove variable.
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
	(CFLAGS-slowexp-fma.c): Remove variable.
	(CFLAGS-slowexp-fma4.c): Likewise.
	(CFLAGS-slowexp-avx.c): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not
	define as macro.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
	* math/Makefile (type-double-routines): Remove slowexp.
	* manual/probes.texi (slowexp_p6): Remove.
	(slowexp_p32): Likewise.
2017-12-19 17:27:31 +00:00
H.J. Lu
4ca945e9c5 x86-64: Remove sysdeps/x86_64/fpu/s_cosf.S
On Ivy Bridge, bench-cosf reports performance improvement:

           s_cosf.S      s_cosf.c       Improvement
max        114.136       82.401            39%
min        13.119        11.301            16%
mean       22.1882       21.8938           1%

	* sysdeps/x86_64/fpu/s_cosf.S: Removed.
2017-12-14 04:39:53 -08:00
H.J. Lu
ac817e083b x86-64: Add cosf with FMA
On Skylake, bench-cosf reports performance improvement:

            Before        After         Improvement
max        135.362       94.552            43%
min        8.532         7.688             11%
mean       17.1446       11.8128           45%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_cosf-sse2 and s_cosf-fma.
	(CFLAGS-s_cosf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/s_cosf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/s_cosf-sse2.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_cosf.c: Likewise.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2017-12-12 15:32:58 -08:00
H.J. Lu
9d0ffa60ad x86-64: Add sinf with FMA
On Skylake, bench-sinf reports performance improvement:

            Before        After         Improvement
max        153.996       100.094           54%
min        8.546         6.852             25%
mean       18.1223       11.802            54%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_sinf-sse2 and s_sinf-fma.
	(CFLAGS-s_sinf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/s_sinf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/s_sinf-sse2.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sinf.c: Likewise.
2017-12-07 10:11:16 -08:00
H.J. Lu
9574c7b68d x86-64: Remove sysdeps/x86_64/fpu/s_sinf.S
On Ivy Bridge, bench-sinf reports performance improvement:

          s_sinf.S      s_sinf.c       Improvement
max        91.521        86.148            6%
min        14.061        11.265            25%
mean       23.3758       23.3344           0.2%

	* sysdeps/x86_64/fpu/s_sinf.S: Removed.
2017-12-07 09:44:17 -08:00
Joseph Myers
34bb10aabf Use libm_alias_float for x86_64.
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes x86_64 libm function implementations use
libm_alias_float to define function aliases, or libm_alias_float_other
where the main name is defined with versioned_symbol.

Tested with the glibc testsuite for x86_64, and tested with
build-many-glibcs.py for all its x86_64 configurations that installed
stripped shared libraries are unchanged by the patch.

	* sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Include
	<libm-alias-float.h>.
	(exp2f): Define using libm_alias_float, or libm_alias_float_other
	if [SHARED].
	* sysdeps/x86_64/fpu/multiarch/e_expf.c: Include
	<libm-alias-float.h>.
	(exp2f): Define using libm_alias_float, or libm_alias_float_other
	if [SHARED].
	* sysdeps/x86_64/fpu/multiarch/e_log2f.c: Include
	<libm-alias-float.h>.
	(exp2f): Define using libm_alias_float, or libm_alias_float_other
	if [SHARED].
	* sysdeps/x86_64/fpu/multiarch/e_logf.c: Include
	<libm-alias-float.h>.
	(exp2f): Define using libm_alias_float, or libm_alias_float_other
	if [SHARED].
	* sysdeps/x86_64/fpu/multiarch/e_powf.c: Include
	<libm-alias-float.h>.
	(exp2f): Define using libm_alias_float, or libm_alias_float_other
	if [SHARED].
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Include
	<libm-alias-float.h>.
	(ceilf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.c: Include
	<libm-alias-float.h>.
	(floorf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Include
	<libm-alias-float.h>.
	(fmaf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Include
	<libm-alias-float.h>.
	(nearbyintf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Include
	<libm-alias-float.h>.
	(rintf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/multiarch/s_truncf.c: Include
	<libm-alias-float.h>.
	(truncf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_copysignf.S: Include <libm-alias-float.h>.
	(copysignf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_cosf.S: Include <libm-alias-float.h>.
	(cosf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_fabsf.c: Include <libm-alias-float.h>.
	(fabsf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_fmaxf.S: Include <libm-alias-float.h>.
	(fmaxf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_fminf.S: Include <libm-alias-float.h>.
	(fminf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_llrintf.S: Include <libm-alias-float.h>.
	(llrintf): Define using libm_alias_float.
	[!__ILP32__] (lrintf): Likewise.
	* sysdeps/x86_64/fpu/s_sincosf.S: Include <libm-alias-float.h>.
	(sincosf): Define using libm_alias_float.
	* sysdeps/x86_64/fpu/s_sinf.S: Include <libm-alias-float.h>.
	(sinf): Define using libm_alias_float.
	* sysdeps/x86_64/x32/fpu/s_lrintf.S: Include <libm-alias-float.h>.
	(lrintf): Define using libm_alias_float.
2017-11-29 21:25:41 +00:00
Joseph Myers
011fba7e48 Use libm_alias_double for x86_64.
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes x86_64 libm function implementations use
libm_alias_double to define function aliases.

Tested with the glibc testsuite for x86_64, and tested with
build-many-glibcs.py for all its x86_64 configurations that installed
stripped shared libraries are unchanged by the patch.

	* sysdeps/x86_64/fpu/multiarch/s_atan.c: Include
	<libm-alias-double.h>.
	(atan): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_ceil.c: Include
	<libm-alias-double.h>.
	(ceil): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_floor.c: Include
	<libm-alias-double.h>.
	(floor): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_fma.c: Include
	<libm-alias-double.h>.
	(fma): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Include
	<libm-alias-double.h>.
	(nearbyint): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_rint.c: Include
	<libm-alias-double.h>.
	(rint): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_sin.c: Include
	<libm-alias-double.h>.
	(sin): Define using libm_alias_double.
	(cos): Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c: Include
	<libm-alias-double.h>.
	(tan): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/multiarch/s_trunc.c: Include
	<libm-alias-double.h>.
	(trunc): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/s_copysign.S: Include <libm-alias-double.h>.
	(copysign): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/s_fabs.c: Include <libm-alias-double.h>.
	(fabs): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/s_fmax.S: Include <libm-alias-double.h>.
	(fmax): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/s_fmin.S: Include <libm-alias-double.h>.
	(fmin): Define using libm_alias_double.
	* sysdeps/x86_64/fpu/s_llrint.S: Include <libm-alias-double.h>.
	(llrint): Define using libm_alias_double.
	[!__ILP32__] (lrint): Likewise.
	* sysdeps/x86_64/x32/fpu/s_lrint.S: Include <libm-alias-double.h>.
	(lrint): Define using libm_alias_double.
2017-11-29 19:01:21 +00:00
Joseph Myers
f58e5f4809 Use libm_alias_ldouble in sysdeps/x86_64/fpu.
This patch continues the preparation for additional _FloatN / _FloatNx
function aliases by using libm_alias_ldouble for sysdeps/x86_64/fpu
long double functions, so that they can have _Float64x aliases added
in future.

Tested for x86_64, including build-many-glibcs.py tests that installed
stripped shared libraries are unchanged by the patch.

	* sysdeps/x86_64/fpu/e_expl.S: Include <libm-alias-ldouble.h>.
	[USE_AS_EXPM1L] (expm1l): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_ceill.S: Include <libm-alias-ldouble.h>.
	(ceill): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_copysignl.S: Include
	<libm-alias-ldouble.h>.
	(copysignl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_fabsl.S: Include <libm-alias-ldouble.h>.
	(fabsl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_floorl.S: Include <libm-alias-ldouble.h>.
	(floorl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_fmaxl.S: Include <libm-alias-ldouble.h>.
	(fmaxl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_fminl.S: Include <libm-alias-ldouble.h>.
	(fminl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_llrintl.S: Include <libm-alias-ldouble.h>.
	(llrintl): Define using libm_alias_ldouble.
	(lrintl): Likewise.
	* sysdeps/x86_64/fpu/s_nearbyintl.S: Include
	<libm-alias-ldouble.h>.
	(nearbyintl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/fpu/s_truncl.S: Include <libm-alias-ldouble.h>.
	(truncl): Define using libm_alias_ldouble.
	* sysdeps/x86_64/x32/fpu/s_lrintl.S: Include
	<libm-alias-ldouble.h>.
	(lrintl): Define using libm_alias_ldouble.
2017-11-17 23:39:11 +00:00
H.J. Lu
a122dbfb2e Replace "if if " with "if " in comments
* include/alloc_buffer.h: Replace "if if " with "if " in
	comments.
	* sysdeps/mips/memcpy.S: Likkewise.
	* sysdeps/mips/memset.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S:
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S:
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S:
	Likewise.
2017-10-25 08:05:51 -07:00
H.J. Lu
80bb593563 x86-64: Add powf with FMA
For workload-spec2017.wrf, on Skylake, it improves performance by:

                           Before            After     Improvement
reciprocal-throughput      35.4713          27.3842       29%
latency                    82.4537          66.3175       24%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add e_powf-fma.
	(CFLAGS-e_powf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_powf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_powf.c: Likewise.
2017-10-22 08:08:00 -07:00
H.J. Lu
5c7adbd8ed x86-64: Add log2f with FMA
For workload-spec2017.wrf, on Skylake, it improves performance by:

                           Before            After     Improvement
reciprocal-throughput      16.5937          14.0789       17%
latency                    41.7755          35.3586       18%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add e_log2f-fma.
	(CFLAGS-e_log2f-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_log2f-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_log2f.c: Likewise.
2017-10-22 08:06:58 -07:00
H.J. Lu
0ccc7153cc x86-64: Add logf with FMA
For workload-spec2017.wrf, on Skylake, it improves performance by:

                           Before            After     Improvement
reciprocal-throughput      16.1534          13.8874       16%
latency                    41.9642          34.3072       22%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add e_logf-fma.
	(CFLAGS-e_logf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_logf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_logf.c: Likewise.
2017-10-22 08:05:15 -07:00
H.J. Lu
5d15c96975 x86-64: Add exp2f with FMA
For workload-spec2017.wrf, on Skylake, it improves performance by:

                           Before            After     Improvement
reciprocal-throughput      13.0291          11.2225       16%
latency                    44.5154          37.5766       18%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add e_exp2f-fma.
	(CFLAGS-e_exp2f-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_exp2f-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Likewise.
2017-10-22 07:57:50 -07:00
H.J. Lu
e1f59bebd8 x86-64: Replace assembly versions of e_expf with generic e_expf.c
This patch replaces x86-64 assembly versions of e_expf with generic
e_expf.c.  For workload-spec2017.wrf, on Nehalem, it improves
performance by:

                           Before            After     Improvement
reciprocal-throughput      36.039           20.7749       73%
latency                    58.8096          40.8715       43%

On Skylake, it improves

                           Before            After     Improvement
reciprocal-throughput      18.4436          11.1693       65%
latency                    47.5162          37.5411       26%

	* sysdeps/x86_64/fpu/e_expf.S: Removed.
	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: Likewise.
	* sysdeps/x86_64/fpu/w_expf.c: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Updated for generic
	e_expf.c.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_expf-fma.c):
	New.
	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf):
	Renamed to ...
	(__redirect_expf): This.
	(SYMBOL_NAME): Changed to expf.
	(__ieee754_expf): Renamed to ...
	(__expf): This.
	(__GI___expf): This.
	(__ieee754_expf): Add strong_alias.
	(__expf_finite): Likewise.
	(__expf): New.
	Include <sysdeps/ieee754/flt-32/e_expf.c>.
2017-10-22 07:49:55 -07:00
Joseph Myers
527cd19c3d Make dbl-64 atan and tan into weak aliases.
This patch converts the dbl-64 implementations of atan and tan into
weak aliases of __atan and __tan, in preparation for making them use
libm_alias_double.  Consequent changes are made to the x86_64
multiarch versions wrapping round them (with the dbl-64 functions,
like other such functions, being made not to define their aliases at
all if __atan or __tan are defined as macros by an including file).

Tested for x86_64, and with build-many-glibcs.py.

	* sysdeps/ieee754/dbl-64/s_atan.c (atan): Rename to __atan and
	define as weak alias of __atan.  Do not define any aliases if
	[__atan].
	[NO_LONG_DOUBLE] (__atanl): Define as strong alias of __atan.
	[NO_LONG_DOUBLE] (atanl): Define as weak alias of __atanl.
	* sysdeps/ieee754/dbl-64/s_tan.c (tan): Rename to __tan and define
	as weak alias of __tan.  Do not define any aliases if [__tan].
	[NO_LONG_DOUBLE] (__tanl): Define as strong alias of __tan.
	[NO_LONG_DOUBLE] (tanl): Define as weak alias of __tanl.
	* sysdeps/x86_64/fpu/multiarch/s_atan-avx.c (atan): Rename to
	__atan.
	* sysdeps/x86_64/fpu/multiarch/s_atan-fma.c (atan): Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan-fma4.c (atan): Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan.c (atan): Rename to __atan
	and define as weak alias of __atan.
	* sysdeps/x86_64/fpu/multiarch/s_tan-avx.c (tan): Rename to
	__atan.
	* sysdeps/x86_64/fpu/multiarch/s_tan-fma.c (tan): Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan-fma4.c (tan): Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c (tan): Rename to __tan and
	define as weak alias of __tan.
2017-10-02 20:20:52 +00:00
Szabolcs Nagy
f7a0b063e7 Do not wrap expf and exp2f
The new generic expf and exp2f code don't need wrappers any more, they
set errno inline, so only use the wrappers on targets that need it.
(If the wrapper is needed, then the top level wrapper code is included,
otherwise empty w_exp*f.c is used to suppress the wrapper.)

A powerpc64 expf implementation includes the expf c code directly which
needed some changes.

	* sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper.
	* sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise
	* sysdeps/ieee754/flt-32/w_exp2f.c: New file.
	* sysdeps/ieee754/flt-32/w_expf.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for
	the new expf code.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file.
	* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_expf.c: New file.
	* sysdeps/i386/fpu/w_exp2f.c: New file.
	* sysdeps/i386/fpu/w_expf.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file.
	* sysdeps/x86_64/fpu/w_expf.c: New file.
2017-10-02 14:38:54 +01:00
Joseph Myers
2f92505d20 Update x86_64 libm-test-ulps.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2017-09-29 18:03:48 +00:00
Joseph Myers
ae8372d7e4 Add SSE4.1 trunc, truncf (bug 20142).
This patch adds SSE4.1 versions of trunc and truncf, using the roundsd
/ roundss instructions, similar to the versions of ceil, floor, rint
and nearbyint functions we already have.  In my testing with the glibc
benchtests these are about 30% faster than the C versions for double,
20% faster for float.

Tested for x86_64.

	[BZ #20142]
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1.
	* sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file.
	* sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.
2017-09-20 16:54:05 +00:00
H.J. Lu
ef8adeb041 x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967]
AVX512 functions in mathvec are used on machines with AVX512.  An AVX2
wrapper is also provided and it can be used when the AVX512 version
isn't profitable.  MathVec_Prefer_No_AVX512 is addded to cpu-features.
If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in GLIBC_TUNABLES
environment variable, the AVX2 wrapper will be used.

Tested on x86-64 machines with and without AVX512.  Also verified
glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine.

	[BZ #21967]
	* sysdeps/x86/cpu-features.h (bit_arch_MathVec_Prefer_No_AVX512):
	New.
	(index_arch_MathVec_Prefer_No_AVX512): Likewise.
	* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
	Handle MathVec_Prefer_No_AVX512.
	* sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
	(IFUNC_SELECTOR): Return AVX2 version if MathVec_Prefer_No_AVX512
	is set.
2017-09-12 07:54:47 -07:00
Markus Trippelsdorf
4c03a69680 Update x86_64 ulps for AMD Ryzen.
* sysdeps/x86_64/fpu/libm-test-ulps: Update for AMD Ryzen.
2017-09-08 19:57:12 +00:00
Joseph Myers
5a80d39d0d Obsolete pow10 functions.
This patch obsoletes the pow10, pow10f and pow10l functions (makes
them into compat symbols, not available for new ports or static
linking).  The exp10 names for these functions are standardized (in TS
18661-4) and were added in the same glibc version (2.1) as pow10 so
source code can change to use them without any loss of portability.
Since pow10 is deliberately not provided for _Float128, only exp10,
this slightly simplifies moving to the new wrapper templates in the
!LIBM_SVID_COMPAT case, by avoiding needing to arrange for pow10,
pow10f and pow10l to be defined by those templates.

Tested for x86_64, and with build-many-glibcs.py.

	* manual/math.texi (pow10): Do not document.
	(pow10f): Likewise.
	(pow10l): Likewise.
	* math/bits/mathcalls.h [__USE_GNU] (pow10): Do not declare.
	* math/bits/math-finite.h [__USE_GNU] (pow10): Likewise.
	* math/libm-test-exp10.inc (pow10_test): Remove.
	(do_test): Do not call pow10.
	* math/w_exp10_compat.c (pow10): Make into compat symbol.
	[NO_LONG_DOUBLE] (pow10l): Likewise.
	* math/w_exp10f_compat.c (pow10f): Likewise.
	* math/w_exp10l_compat.c (pow10l): Likewise.
	* sysdeps/ia64/fpu/e_exp10.S: Include <shlib-compat.h>.
	(pow10): Make into compat symbol.
	* sysdeps/ia64/fpu/e_exp10f.S: Include <shlib-compat.h>.
	(pow10f): Make into compat symbol.
	* sysdeps/ia64/fpu/e_exp10l.S: Include <shlib-compat.h>.
	(pow10l): Make into compat symbol.
	* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove
	pow10.
	(CFLAGS-nldbl-pow10.c): Remove variable..
	* sysdeps/ieee754/ldbl-opt/nldbl-pow10.c: Remove file.
	* sysdeps/ieee754/ldbl-opt/w_exp10_compat.c (pow10l): Condition on
	[SHLIB_COMPAT (libm, GLIBC_2_1, GLIBC_2_27)].
	* sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (compat_symbol):
	Undefine and redefine.
	(pow10l): Make into compat symbol.
	* sysdeps/aarch64/libm-test-ulps: Remove pow10 ulps.
	* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
	* sysdeps/arm/libm-test-ulps: Likewise.
	* sysdeps/hppa/fpu/libm-test-ulps: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/microblaze/libm-test-ulps: Likewise.
	* sysdeps/mips/mips32/libm-test-ulps: Likewise.
	* sysdeps/mips/mips64/libm-test-ulps: Likewise.
	* sysdeps/nios2/libm-test-ulps: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
	* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
	* sysdeps/s390/fpu/libm-test-ulps: Likewise.
	* sysdeps/sh/libm-test-ulps: Likewise.
	* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
	* sysdeps/tile/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2017-09-01 21:13:18 +00:00
H.J. Lu
5e2bc4ff33 x86_64 __redirect_ieee754_expf: Change double to float
__redirect_ieee754_expf has type float, not double.

	* sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf):
	Change double to float.
2017-08-28 08:45:53 -07:00
H.J. Lu
fcaaca412f x86-64: Regenerate libm-test-ulps for AVX512 mathvec tests
Update libm-test-ulps for AVX512 mathvec tests by running
“make regen-ulps” on Intel Xeon processor with AVX512.

	* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
2017-08-23 09:11:55 -07:00
H.J. Lu
b9eaca8fa0 x86_64: Replace AVX512F .byte sequences with instructions
Since binutils 2.25 or later is required to build glibc, we can replace
AVX512F .byte sequences with AVX512F instructions.

Tested on x86-64 and x32.  There are no code differences in libmvec.so
and libmvec.a.

	* sysdeps/x86_64/fpu/svml_d_sincos8_core.S: Replace AVX512F
	.byte sequences with AVX512F instructions.
	* sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Likewise.
	* sysdeps/x86_64/fpu/svml_s_sincosf16_core.S: Likewise.
	* sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S:
	Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S:
	Likewise.
2017-08-23 06:26:44 -07:00
H.J. Lu
098b9dd468 x86-64: Check FMA_Usable in ifunc-mathvec-avx2.h [BZ #21966]
Since the AVX2 version of mathvec functions uses FMA, it can only be
used when FMA is usable.

	[BZ #21966]
	* sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h
	(IFUNC_SELECTOR): Don't use the AVX2 version if FMA isn't
	usable.
2017-08-18 06:19:07 -07:00
H.J. Lu
24a2e6588d x86-64: Optimize e_expf with FMA [BZ #21912]
FMA optimized e_expf improves performance by more than 50% on Skylake.

	[BZ #21912]
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add e_expf-fma.
	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
	* sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.
2017-08-16 08:43:48 -07:00
H.J. Lu
f59f7adb4a x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]
sysdeps/x86_64/fpu/e_expf.S has

        lea     L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
        cmpl    (%rdx,%rax,4), %ecx     /* |x|<under/overflow bound ? */
...
        /* Here if |x| is Inf */
        lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
        movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
        ret
...
         .section .rodata.cst8,"aM",@progbits,8
...
        .p2align 2
L(SP_RANGE): /* single precision overflow/underflow bounds */
        .long   0x42b17217      /* if x>this bound, then result overflows */
        .long   0x42cff1b4      /* if x<this bound, then result underflows */
        .type L(SP_RANGE), @object
        ASM_SIZE_DIRECTIVE(L(SP_RANGE))

        .p2align 2
L(SP_INF_0):
        .long   0x7f800000      /* single precision Inf */
        .long   0               /* single precision zero */
        .type L(SP_INF_0), @object
        ASM_SIZE_DIRECTIVE(L(SP_INF_0))

Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they must
be aligned to 8 bytes.

	[BZ #21955]
	* sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes.
	(L(SP_INF_0)): Likewise.
2017-08-15 14:05:14 -07:00
H.J. Lu
57a72fa350 x86-64: Add FMA multiarch functions to libm
This patch adds multiarch functions optimized with -mfma -mavx2 to libm.
e_pow-fma.c is compiled with $(config-cflags-nofma) due to PR 19003.

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add e_exp-fma, e_log-fma, e_pow-fma, s_atan-fma, e_asin-fma,
	e_atan2-fma, s_sin-fma, s_tan-fma, mplog-fma, mpa-fma,
	slowexp-fma, slowpow-fma, sincos32-fma, doasin-fma, dosincos-fma,
	halfulp-fma, mpexp-fma, mpatan2-fma, mpatan-fma, mpsqrt-fma,
	and mptan-fma.
	(CFLAGS-doasin-fma.c): New.
	(CFLAGS-dosincos-fma.c): Likewise.
	(CFLAGS-e_asin-fma.c): Likewise.
	(CFLAGS-e_atan2-fma.c): Likewise.
	(CFLAGS-e_exp-fma.c): Likewise.
	(CFLAGS-e_log-fma.c): Likewise.
	(CFLAGS-e_pow-fma.c): Likewise.
	(CFLAGS-halfulp-fma.c): Likewise.
	(CFLAGS-mpa-fma.c): Likewise.
	(CFLAGS-mpatan-fma.c): Likewise.
	(CFLAGS-mpatan2-fma.c): Likewise.
	(CFLAGS-mpexp-fma.c): Likewise.
	(CFLAGS-mplog-fma.c): Likewise.
	(CFLAGS-mpsqrt-fma.c): Likewise.
	(CFLAGS-mptan-fma.c): Likewise.
	(CFLAGS-s_atan-fma.c): Likewise.
	(CFLAGS-sincos32-fma.c): Likewise.
	(CFLAGS-slowexp-fma.c): Likewise.
	(CFLAGS-slowpow-fma.c): Likewise.
	(CFLAGS-s_sin-fma.c): Likewise.
	(CFLAGS-s_tan-fma.c): Likewise.
	* sysdeps/x86_64/fpu/multiarch/doasin-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/dosincos-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_asin-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_atan2-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/ifunc-avx-fma4.h: Likewise.
	* sysdeps/x86_64/fpu/multiarch/ifunc-fma4.h: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpa-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpatan-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpatan2-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mpsqrt-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/mptan-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sin-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/sincos32-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_asin.c: Rewrite.
	* sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise.
2017-08-07 08:20:56 -07:00
H.J. Lu
8537e0f6cf x86-64: Implement libmathvec IFUNC selectors in C
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines)
	Add svml_d_cos2_core-sse2, svml_d_cos4_core-sse,
	svml_d_cos8_core-avx2, svml_d_exp2_core-sse2,
	svml_d_exp4_core-sse, svml_d_exp8_core-avx2,
	svml_d_log2_core-sse2, svml_d_log4_core-sse,
	svml_d_log8_core-avx2, svml_d_pow2_core-sse2,
	svml_d_pow4_core-sse, svml_d_pow8_core-avx2
	svml_d_sin2_core-sse2, svml_d_sin4_core-sse,
	svml_d_sin8_core-avx2, svml_d_sincos2_core-sse2,
	svml_d_sincos4_core-sse, svml_d_sincos8_core-avx2,
	svml_s_cosf16_core-avx2, svml_s_cosf4_core-sse2,
	svml_s_cosf8_core-sse, svml_s_expf16_core-avx2,
	svml_s_expf4_core-sse2, svml_s_expf8_core-sse,
	svml_s_logf16_core-avx2, svml_s_logf4_core-sse2,
	svml_s_logf8_core-sse, svml_s_powf16_core-avx2,
	svml_s_powf4_core-sse2, svml_s_powf8_core-sse,
	svml_s_sincosf16_core-avx2, svml_s_sincosf4_core-sse2,
	svml_s_sincosf8_core-sse, svml_s_sinf16_core-avx2,
	svml_s_sinf4_core-sse2 and svml_s_sinf8_core-sse.
	* sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx2.h: New file.
	* sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h: Likewise.
	* sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-sse4_1.h: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN2v_cos): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN4v_cos): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN8v_cos): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN2v_exp): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN4v_exp): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN8v_exp): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_log2_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN2v_log): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_log4_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN4v_log): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN8v_log): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN2vv_pow): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN4vv_pow): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN8vv_pow): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN2v_sin): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4v_sin): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN8v_sin): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN2vvv_sincos): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN4vvv_sincos): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN8vvv_sincos): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf16_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN16v_cosf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf4_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4v_cosf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_cosf8_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN8v_cosf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf16_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN16v_expf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf4_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4v_expf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_expf8_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN8v_expf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf16_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN16v_logf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf4_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4v_logf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_logf8_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN8v_logf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf16_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN16vv_powf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf4_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4vv_powf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_powf8_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN8vv_powf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf16_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN16vvv_sincosf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf4_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4vvv_sincosf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sincosf8_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN8vvv_sincosf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf16_core-avx2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVeN16v_sinf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf4_core-sse2.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVbN4v_sinf): Removed.
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core.S:  Renamed to
	...
	* sysdeps/x86_64/fpu/multiarch/svml_d_sinf8_core-sse.S: This.
	Don't include <sysdep.h> nor <init-arch.h>.
	(_ZGVdN8v_sinf): Removed.
2017-08-04 13:03:58 -07:00
H.J. Lu
10a87ca476 x86-64: Implement libm IFUNC selectors in C
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_ceil-sse4_1, s_ceilf-sse4_1, s_floor-sse4_1,
	s_floorf-sse4_1, s_nearbyint-sse4_1, s_nearbyintf-sse4_1,
	s_rint-sse4_1 and s_rintf-sse4_1.
	* sysdeps/x86_64/fpu/multiarch/ifunc-sse4_1.h: New file.
	* sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_ceil.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__ceil): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__ceilf): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_floor.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__floor): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_floorf.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__floorf): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__nearbyint): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__nearbyintf): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_rint.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__rint): Removed.
	* sysdeps/x86_64/fpu/multiarch/s_rintf.S: Renamed to ...
	* sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S: This.  Don't
	include <machine/asm.h> nor <init-arch.h>.  Include <sysdep.h>.
	(__rintf): Removed.
2017-08-04 13:02:13 -07:00
Joseph Myers
c86ed71d63 Add float128 support for x86_64, x86.
This patch enables float128 support for x86_64 and x86.  All GCC
versions that can build glibc provide the required support, but since
GCC 6 and before don't provide __builtin_nanq / __builtin_nansq, sNaN
tests and some tests of NaN payloads need to be disabled with such
compilers (this does not affect the generated glibc binaries at all,
just the tests).  bits/floatn.h declares float128 support to be
available for GCC versions that provide the required libgcc support
(4.3 for x86_64, 4.4 for i386 GNU/Linux, 4.5 for i386 GNU/Hurd);
compilation-only support was present some time before then, but not
really useful without the libgcc functions.

fenv_private.h needed updating to avoid trying to put _Float128 values
in registers.  I make no assertion of optimality of the
math_opt_barrier / math_force_eval definitions for this case; they are
simply intended to be sufficient to work correctly.

Tested for x86_64 and x86, with GCC 7 and GCC 6.  (Testing for x32 was
compilation tests only with build-many-glibcs.py to verify the ABI
baseline updates.  I have not done any testing for Hurd, although the
float128 support is enabled there as for GNU/Linux.)

	* sysdeps/i386/Implies: Add ieee754/float128.
	* sysdeps/x86_64/Implies: Likewise.
	* sysdeps/x86/bits/floatn.h: New file.
	* sysdeps/x86/float128-abi.h: Likewise.
	* manual/math.texi (Mathematics): Document support for _Float128
	on x86_64 and x86.
	* sysdeps/i386/fpu/fenv_private.h: Include <bits/floatn.h>.
	(math_opt_barrier): Do not put _Float128 values in floating-point
	registers.
	(math_force_eval): Likewise.
	[__x86_64__] (SET_RESTORE_ROUNDF128): New macro.
	* sysdeps/x86/fpu/Makefile [$(subdir) = math] (CPPFLAGS): Append
	to Makefile variable.
	* sysdeps/x86/fpu/e_sqrtf128.c: New file.
	* sysdeps/x86/fpu/sfp-machine.h: Likewise.  Based on libgcc.
	* sysdeps/x86/math-tests.h: New file.
	* math/libm-test-support.h (XFAIL_FLOAT128_PAYLOAD): New macro.
	* math/libm-test-getpayload.inc (getpayload_test_data): Use
	XFAIL_FLOAT128_PAYLOAD.
	* math/libm-test-setpayload.inc (setpayload_test_data): Likewise.
	* math/libm-test-totalorder.inc (totalorder_test_data): Likewise.
	* math/libm-test-totalordermag.inc (totalordermag_test_data):
	Likewise.
	* sysdeps/unix/sysv/linux/i386/libc.abilist: Update.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2017-06-26 22:02:24 +00:00
Zack Weinberg
fd860eaaa8 Remove __need macros from errno.h (__need_Emath, __need_error_t).
This is fairly complicated, not because the users of __need_Emath and
__need_error_t have complicated requirements, but because the core
changes had a lot of fallout.

__need_error_t exists for gnulib compatibility in argz.h and argp.h.
error_t itself is a Hurdism, an enum containing all the E-constants,
so you can do 'p (error_t) errno' in gdb and get a symbolic value.
argz.h and argp.h use it for function return values, and they want to
fall back to 'int' when that's not available.  There is no reason why
these nonstandard headers cannot just go ahead and include all of
errno.h; so we do that.

__need_Emath is defined only by .S files; what they _really_ need is
for errno.h to avoid declaring anything other than the E-constants
(e.g. 'extern int __errno_location(void);' is a syntax error in
assembly language). This is replaced with a check for __ASSEMBLER__ in
errno.h, plus a carefully documented requirement for bits/errno.h not
to define anything other than macros.  That in turn has the
consequence that bits/errno.h must not define errno - fortunately, all
live ports use the same definition of errno, so I've moved it to
errno.h.  The Hurd bits/errno.h must also take care not to define
error_t when __ASSEMBLER__ is defined, which involves repeating all of
the definitions twice, but it's a generated file so that's okay.

	* stdlib/errno.h: Remove __need_Emath and __need_error_t logic.
	Reorganize file.  Declare errno here.  When __ASSEMBLER__ is
	defined, don't declare anything other than the E-constants.

	* include/errno.h: Change conditional for exposing internal
	declarations to (not _ISOMAC and not __ASSEMBLER__).
	* bits/errno.h: Remove logic for __need_Emath.  Document
	requirements for a port-specific bits/errno.h.

	* sysdeps/unix/sysv/linux/bits/errno.h
	* sysdeps/unix/sysv/linux/alpha/bits/errno.h
	* sysdeps/unix/sysv/linux/hppa/bits/errno.h
	* sysdeps/unix/sysv/linux/mips/bits/errno.h
	* sysdeps/unix/sysv/linux/sparc/bits/errno.h:
	Add multiple-include guard and check against improper inclusion.
	Remove __need_Emath logic.  Don't declare errno here.  Ensure all
	constants are defined as simple integer literals.  Consistent
	formatting.
	* sysdeps/mach/hurd/errnos.awk: Likewise.  Only define error_t and
	enum __error_t_codes if __ASSEMBLER__ is not defined.
	* sysdeps/mach/hurd/bits/errno.h: Regenerate.

	* argp/argp.h, string/argz.h: Don't define __need_error_t before
	including errno.h.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S
	* sysdeps/i386/i686/fpu/multiarch/s_sincosf-sse2.S
	* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S
	* sysdeps/x86_64/fpu/s_cosf.S
	* sysdeps/x86_64/fpu/s_sincosf.S
	* sysdeps/x86_64/fpu/s_sinf.S:
	Just include errno.h; don't define __need_Emath or include
	bits/errno.h directly.
2017-06-14 08:14:34 -04:00
Zack Weinberg
7c3018f9e4 Suppress internal declarations for most of the testsuite.
This patch adds a new build module called 'testsuite'.
IS_IN (testsuite) implies _ISOMAC, as do IS_IN_build and __cplusplus
(which means several ad-hoc tests for __cplusplus can go away).
libc-symbols.h now suppresses almost all of *itself* when _ISOMAC is
defined; in particular, _ISOMAC mode does not get config.h
automatically anymore.

There are still quite a few tests that need to see internal gunk of
one variety or another.  For them, we now have 'tests-internal' and
'test-internal-extras'; files in this category will still be compiled
with MODULE_NAME=nonlib, and everything proceeds as it always has.
The bulk of this patch is moving tests from 'tests' to
'tests-internal'.  There is also 'tests-static-internal', which has
the same effect on files in 'tests-static', and 'modules-names-tests',
which has the *inverse* effect on files in 'modules-names' (it's
inverted because most of the things in modules-names are *not* tests).
For both of these, the file must appear in *both* the new variable and
the old one.

There is also now a special case for when libc-symbols.h is included
without MODULE_NAME being defined at all.  (This happens during the
creation of libc-modules.h, and also when preprocessing Versions
files.)  When this happens, IS_IN is set to be always false and
_ISOMAC is *not* defined, which was the status quo, but now it's
explicit.

The remaining changes to C source files in this patch seemed likely to
cause problems in the absence of the main change.  They should be
relatively self-explanatory.  In a few cases I duplicated a definition
from an internal header rather than move the test to tests-internal;
this was a judgement call each time and I'm happy to change those
however reviewers feel is more appropriate.

	* Makerules: New subdir configuration variables 'tests-internal'
	and 'test-internal-extras'.  Test files in these categories will
	still be compiled with MODULE_NAME=nonlib.  Test files in the
	existing categories (tests, xtests, test-srcs, test-extras) are
	now compiled with MODULE_NAME=testsuite.
	New subdir configuration variable 'modules-names-tests'.  Files
	which are in both 'modules-names' and 'modules-names-tests' will
	be compiled with MODULE_NAME=testsuite instead of
	MODULE_NAME=extramodules.
	(gen-as-const-headers): Move to tests-internal.
	(do-tests-clean, common-mostlyclean): Support tests-internal.
	* Makeconfig (built-modules): Add testsuite.
	* Makefile: Change libof-check-installed-headers-c and
	libof-check-installed-headers-cxx to 'testsuite'.
	* Rules: Likewise.  Support tests-internal.
	* benchtests/strcoll-inputs/filelist#en_US.UTF-8:
	Remove extra-modules.mk.

	* config.h.in: Don't check for __OPTIMIZE__ or __FAST_MATH__ here.
	* include/libc-symbols.h: Move definitions of _GNU_SOURCE,
	PASTE_NAME, PASTE_NAME1, IN_MODULE, IS_IN, and IS_IN_LIB to the
	very top of the file and rationalize their order.
	If MODULE_NAME is not defined at all, define IS_IN to always be
	false, and don't define _ISOMAC.
	If any of IS_IN (testsuite), IS_IN_build, or __cplusplus are
	true, define _ISOMAC and suppress everything else in this file,
	starting with the inclusion of config.h.
	Do check for inappropriate definitions of __OPTIMIZE__ and
	__FAST_MATH__ here, but only if _ISOMAC is not defined.
        Correct some out-of-date commentary.

	* include/math.h: If _ISOMAC is defined, undefine NO_LONG_DOUBLE
	and _Mlong_double_ before including math.h.
	* include/string.h: If _ISOMAC is defined, don't expose
	_STRING_ARCH_unaligned. Move a comment to a more appropriate
	location.

	* include/errno.h, include/stdio.h, include/stdlib.h, include/string.h
	* include/time.h, include/unistd.h, include/wchar.h: No need to
	check __cplusplus nor use __BEGIN_DECLS/__END_DECLS.

	* misc/sys/cdefs.h (__NTHNL): New macro.
	* sysdeps/m68k/m680x0/fpu/bits/mathinline.h
	(__m81_defun): Use __NTHNL to avoid errors with GCC 6.

	* elf/tst-env-setuid-tunables.c: Include config.h with _LIBC
	defined, for HAVE_TUNABLES.
	* inet/tst-checks-posix.c: No need to define _ISOMAC.
	* intl/tst-gettext2.c: Provide own definition of N_.
	* math/test-signgam-finite-c99.c: No need to define _ISOMAC.
	* math/test-signgam-main.c: No need to define _ISOMAC.
	* stdlib/tst-strtod.c: Convert to test-driver. Split locale_test to...
	* stdlib/tst-strtod1i.c: ...this new file.
	* stdlib/tst-strtod5.c: Convert to test-driver and add copyright notice.
        Split tests of __strtod_internal to...
	* stdlib/tst-strtod5i.c: ...this new file.
	* string/test-string.h: Include stdint.h. Duplicate definition of
	inhibit_loop_to_libcall here (from libc-symbols.h).
	* string/test-strstr.c: Provide dummy definition of
	libc_hidden_builtin_def when including strstr.c.
	* sysdeps/ia64/fpu/libm-symbols.h: Suppress entire file in _ISOMAC
	mode; no need to test __STRICT_ANSI__ nor __cplusplus as well.
	* sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h.
	Don't include init-arch.h.
	* sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h.
	Don't include init-arch.h.

	* elf/Makefile: Move tst-ptrguard1-static, tst-stackguard1-static,
	tst-tls1-static, tst-tls2-static, tst-tls3-static, loadtest,
	unload, unload2, circleload1, neededtest, neededtest2,
	neededtest3, neededtest4, tst-tls1, tst-tls2, tst-tls3,
	tst-tls6, tst-tls7, tst-tls8, tst-dlmopen2, tst-ptrguard1,
	tst-stackguard1, tst-_dl_addr_inside_object, and all of the
	ifunc tests to tests-internal.
	Don't add $(modules-names) to test-extras.
	* inet/Makefile: Move tst-inet6_scopeid_pton to tests-internal.
	Add tst-deadline to tests-static-internal.
	* malloc/Makefile: Move tst-mallocstate and tst-scratch_buffer to
	tests-internal.
	* misc/Makefile: Move tst-atomic and tst-atomic-long to tests-internal.
	* nptl/Makefile: Move tst-typesizes, tst-rwlock19, tst-sem11,
	tst-sem12, tst-sem13, tst-barrier5, tst-signal7, tst-tls3,
	tst-tls3-malloc, tst-tls5, tst-stackguard1, tst-sem11-static,
	tst-sem12-static, and tst-stackguard1-static to tests-internal.
        Link tests-internal with libpthread also.
	Don't add $(modules-names) to test-extras.
	* nss/Makefile: Move tst-field to tests-internal.
	* posix/Makefile: Move bug-regex5, bug-regex20, bug-regex33,
	tst-rfc3484, tst-rfc3484-2, and tst-rfc3484-3 to tests-internal.
	* stdlib/Makefile: Move tst-strtod1i, tst-strtod3, tst-strtod4,
	tst-strtod5i, tst-tls-atexit, and tst-tls-atexit-nodelete to
	tests-internal.
        * sunrpc/Makefile: Move tst-svc_register to tests-internal.
	* sysdeps/powerpc/Makefile: Move test-get_hwcap and
	test-get_hwcap-static to tests-internal.
	* sysdeps/unix/sysv/linux/Makefile: Move tst-setgetname to
	tests-internal.
	* sysdeps/x86_64/fpu/Makefile: Add all libmvec test modules to
	modules-names-tests.
2017-05-11 19:27:59 -04:00
Zack Weinberg
963394a22b Allow direct use of math_ldbl.h in testsuite.
A few 'long double'-related tests include math_private.h just for
their variety of math_ldbl.h, which contains macros for assembling and
disassembling the binary representation of 'long double'.  math_ldbl.h
insists on being included from math_private.h, but if we relax this
restriction (and fix some portability sloppiness) we can use it
directly and not have to expose all of math_private.h to the testsuite.

	* sysdeps/generic/math_private.h: Use __BIG_ENDIAN and
	__LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN.

	* sysdeps/generic/math_ldbl.h
	* sysdeps/ia64/fpu/math_ldbl.h
	* sysdeps/ieee754/ldbl-128/math_ldbl.h
	* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
	* sysdeps/ieee754/ldbl-96/math_ldbl.h
	* sysdeps/powerpc/fpu/math_ldbl.h
	* sysdeps/x86_64/fpu/math_ldbl.h:
	Allow direct inclusion.  Use uintNN_t instead of u_intNN_t.
	Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and
	LITTLE_ENDIAN.  Include endian.h and/or stdint.h if necessary.
	Add copyright notices.

	* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h (ldbl_canonicalize_int):
	Don't use EXTRACT_WORDS64.

	* sysdeps/ieee754/ldbl-96/test-canonical-ldbl-96.c
	* sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c
	* sysdeps/ieee754/ldbl-128ibm/test-canonical-ldbl-128ibm.c
	* sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c:
	Include math_ldbl.h, not math_private.h.
2017-02-25 10:40:48 -05:00
Joseph Myers
92061bb033 Run libm tests separately for each function.
At present, libm tests for each function get built into a single
executable (for each floating point type, for each of normal / inline
/ finite-math-only functions, plus vector variants) and run together,
resulting in a single PASS or FAIL (for each of those nine variants
plus vector variants).  Building this executable involves reading
over 50 MB of libm-test-*.c sources.

This patch arranges for tests of each function to be run separately
from the makefiles instead.  There are 121 functions being tested for
each (type, variant pair) (actually 126, but run as 121 from the
Makefile because each of the pairs (exp10, pow10), (isfinite, finite),
(lgamma, gamma), (remainder, drem), (scalbn, ldexp), shares a table of
test results and so is run together), so 1089 separate tests run from
the Makefile, plus 48 vector tests on x86_64 (six functions for eight
vector variants).  Each test only involves a libm-test-<func>.c file
of no more than about 4 MB, rather than all such files taking about 50
MB.  With tests run separately, test summaries will indicate which
functions actually have problems (of course, those problems may just
be out-of-date libm-test-ulps files if the file hasn't been updated
for the architecture in question recently).

All the .c files for the 1089+48 tests are generated automatically
from the Makefiles.  Various checked-in boilerplate .c files are
removed as no longer needed.  CFLAGS definitions for the different
kinds of tests are generated using makefile iterators to apply
target-specific variable settings.  libm-have-vector-test.h is no
longer needed; the list of functions to test for each vector type is
now in the sysdeps Makefile.

This should reduce the amount of boilerplate needed for float128
testing support; test-float128.h will still be needed, but not various
.c files or Makefile CFLAGS definitions.  The logic for creating
dependencies on libm-test-support-*.o files should also render
<https://sourceware.org/ml/libc-alpha/2017-02/msg00279.html>
unnecessary.

Tested for x86_64 and x86.

	* math/Makefile (libm-tests-generated): Remove variable.
	(libm-tests-base-normal): New variable.
	(libm-tests-base-finite): Likewise.
	(libm-tests-base-inline): Likewise.
	(libm-tests-base): Likewise.
	(libm-tests-normal): Likewise.
	(libm-tests-finite): Likewise.
	(libm-tests-inline): Likewise.
	(libm-tests-vector): Likewise.
	(libm-tests): Define in terms of these new variables.
	(libm-tests-for-type): New variable.
	(libm-tests.o): Move definition.
	(tests): Move addition of $(libm-tests).
	(generated): Update for new and removed libm test files.
	($(objpfx)libm-test.c): Remove target.
	($(objpfx)libm-have-vector-test.h): Likewise.
	(CFLAGS-test-double-vlen2.c): Remove variable.
	(CFLAGS-test-double-vlen4.c): Likewise.
	(CFLAGS-test-double-vlen8.c): Likewise.
	(CFLAGS-test-float-vlen4.c): Likewise.
	(CFLAGS-test-float-vlen8.c): Likewise.
	(CFLAGS-test-float-vlen16.c): Likewise.
	(CFLAGS-test-float.c): Likewise.
	(CFLAGS-test-float-finite.c): Likewise.
	(CFLAGS-libm-test-support-float.c): Likewise.
	(CFLAGS-test-double.c): Likewise.
	(CFLAGS-test-double-finite.c): Likewise.
	(CFLAGS-libm-test-support-double.c): Likewise.
	(CFLAGS-test-ldouble.c): Likewise.
	(CFLAGS-test-ldouble-finite.c): Likewise.
	(CFLAGS-libm-test-support-ldouble.c): Likewise.
	(libm-test-inline-cflags): New variable.
	(CFLAGS-test-ifloat.c): Remove variable.
	(CFLAGS-test-idouble.c): Likewise.
	(CFLAGS-test-ildouble.c): Likewise.
	($(addprefix $(objpfx), $(libm-tests.o))): Move target and update
	dependencies.
	($(foreach t,$(libm-tests-normal),$(objpfx)$(t).c)): New rule.
	($(foreach t,$(libm-tests-finite),$(objpfx)$(t).c)): Likewise.
	($(foreach t,$(libm-tests-inline),$(objpfx)$(t).c)): Likewise.
	($(foreach t,$(libm-tests-vector),$(objpfx)$(t).c)): Likewise.
	($(foreach t,$(types),$(objpfx)libm-test-support-$(t).c)):
	Likewise.
	(dependencies on libm-test-support-*.o): Remove.
	($(foreach f,$(libm-test-funcs-all),$(objpfx)$(o)-$(f).o)): New
	rules using iterators.
	($(addprefix $(objpfx),$(call libm-tests-for-type,$(o)))):
	Likewise.
	($(objpfx)libm-test-support-$(o).o): Likewise.
	($(addprefix $(objpfx),$(filter-out $(tests-static)
	$(libm-vec-tests),$(tests)))): Filter out $(libm-tests-vector)
	instead.
	($(addprefix $(objpfx), $(libm-vec-tests))): Use iterator to
	define rule instead.
	* math/README.libm-test: Update.
	* math/libm-test-acos.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-acosh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-asin.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-asinh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-atan.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-atan2.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-atanh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cabs.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cacos.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cacosh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-canonicalize.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-carg.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-casin.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-casinh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-catan.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-catanh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cbrt.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ccos.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ccosh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ceil.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cexp.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cimag.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-clog.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-clog10.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-conj.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-copysign.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cos.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cosh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cpow.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-cproj.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-creal.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-csin.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-csinh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-csqrt.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ctan.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ctanh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-erf.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-erfc.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-exp.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-exp10.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-exp2.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-expm1.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fabs.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fdim.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-floor.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fma.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fmax.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fmaxmag.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fmin.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fminmag.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fmod.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fpclassify.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-frexp.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fromfp.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-fromfpx.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-getpayload.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-hypot.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ilogb.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-iscanonical.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-iseqsig.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isfinite.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isgreater.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isgreaterequal.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isinf.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isless.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-islessequal.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-islessgreater.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isnan.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isnormal.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-issignaling.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-issubnormal.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-isunordered.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-iszero.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-j0.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-j1.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-jn.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-lgamma.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-llogb.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-llrint.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-llround.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-log.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-log10.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-log1p.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-log2.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-logb.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-lrint.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-lround.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-modf.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-nearbyint.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-nextafter.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-nextdown.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-nexttoward.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-nextup.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-pow.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-remainder.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-remquo.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-rint.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-round.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-roundeven.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-scalb.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-scalbln.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-scalbn.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-setpayload.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-setpayloadsig.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-signbit.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-significand.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-sin.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-sincos.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-sinh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-sqrt.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-tan.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-tanh.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-tgamma.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-totalorder.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-totalordermag.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-trunc.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ufromfp.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-ufromfpx.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-y0.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-y1.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-yn.inc: Include libm-test-driver.c.
	(do_test): New function.
	* math/libm-test-driver.c: Do not include libm-have-vector-test.h.
	(HAVE_VECTOR): Remove macro.
	(START): Do not call HAVE_VECTOR.
	* math/test-double-vlen2.h (FUNC_TEST): Remove macro.
	* math/test-double-vlen4.h (FUNC_TEST): Remove macro.
	* math/test-double-vlen8.h (FUNC_TEST): Remove macro.
	* math/test-float-vlen16.h (FUNC_TEST): Remove macro.
	* math/test-float-vlen4.h (FUNC_TEST): Remove macro.
	* math/test-float-vlen8.h (FUNC_TEST): Remove macro.
	* math/test-math-vector.h (FUNC_TEST): New macro.
	(WRAPPER_DECL): Rename to WRAPPER_DECL_f.
	* sysdeps/x86_64/fpu/Makefile (double-vlen2-funcs): New variable.
	(double-vlen4-funcs): Likewise.
	(double-vlen4-avx2-funcs): Likewise.
	(double-vlen8-funcs): Likewise.
	(float-vlen4-funcs): Likewise.
	(float-vlen8-funcs): Likewise.
	(float-vlen8-avx2-funcs): Likewise.
	(float-vlen16-funcs): Likewise.
	(CFLAGS-test-double-vlen4-avx2.c): Remove variable.
	(CFLAGS-test-float-vlen8-avx2.c): Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen4.h (TEST_VECTOR_cos): Remove
	macro.
	(TEST_VECTOR_sin): Likewise.
	(TEST_VECTOR_sincos): Likewise.
	(TEST_VECTOR_log): Likewise.
	(TEST_VECTOR_exp): Likewise.
	(TEST_VECTOR_pow): Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen8.h (TEST_VECTOR_cos):
	Likewise.
	(TEST_VECTOR_sin): Likewise.
	(TEST_VECTOR_sincos): Likewise.
	(TEST_VECTOR_log): Likewise.
	(TEST_VECTOR_exp): Likewise.
	(TEST_VECTOR_pow): Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen16.h (TEST_VECTOR_cosf):
	Likewise.
	(TEST_VECTOR_sinf): Likewise.
	(TEST_VECTOR_sincosf): Likewise.
	(TEST_VECTOR_logf): Likewise.
	(TEST_VECTOR_expf): Likewise.
	(TEST_VECTOR_powf): Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8.h (TEST_VECTOR_cosf):
	Likewise.
	(TEST_VECTOR_sinf): Likewise.
	(TEST_VECTOR_sincosf): Likewise.
	(TEST_VECTOR_logf): Likewise.
	(TEST_VECTOR_expf): Likewise.
	(TEST_VECTOR_powf): Likewise.
	* math/gen-libm-have-vector-test.sh: Remove file.
	* math/libm-test.inc: Likewise.
	* math/libm-test-support-double.c: Likewise.
	* math/libm-test-support-float.c: Likewise.
	* math/libm-test-support-ldouble.c: Likewise.
	* math/test-double-finite.c: Likewise.: Likewise.
	* math/test-double.c: Likewise.
	* math/test-float-finite.c: Likewise.
	* math/test-float.c: Likewise.
	* math/test-idouble.c: Likewise.
	* math/test-ifloat.c: Likewise.
	* math/test-ildouble.c: Likewise.
	* math/test-ldouble-finite.c: Likewise.
	* math/test-ldouble.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen2.h: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
	* sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen4.h: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
	* sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
2017-02-24 00:52:49 +00:00
Joseph Myers
2c51dfd05d Move tests of catan, catanh to auto-libm-test-*.
This patch moves tests of catan and catanh with finite inputs (other
than the divide-by-zero cases producing an exact infinity) to using
the auto-libm-test machinery.  Each of auto-libm-test-out-catan and
auto-libm-test-out-catanh takes about three seconds to generate on my
system (so in fact it wasn't necessary after all to defer the move to
auto-libm-test-* until the output files were split up by function).

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add tests of catan and catanh.
	* math/auto-libm-test-out-catan: New generated file.
	* math/auto-libm-test-out-catanh: Likewise.
	* math/libm-test-catan.inc (catan_test_data): Use AUTO_TESTS_c_c.
	Move tests with finite inputs, except divide-by-zero cases, to
	auto-libm-test-in.
	* math/libm-test-catanh.inc (catanh_test_data): Likewise.
	* math/Makefile (libm-test-funcs-auto): Add catan and catanh.
	(libm-test-funcs-noauto): Remove catan and catanh.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2017-02-17 18:42:37 +00:00
Joseph Myers
fa2a3dd7a3 Move tests of casin, casinh to auto-libm-test-*.
This patch moves tests of casin and casinh with finite inputs to using
the auto-libm-test machinery.  Each of auto-libm-test-out-casin and
auto-libm-test-out-casinh takes about 38 minutes to generate on my
system because of MPC slowness on special cases that appear in the
tests (with MPC 1.0.3; I don't know to what extent current MPC master
might speed it up).

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add tests of casin and casinh.
	* math/auto-libm-test-out-casin: New generated file.
	* math/auto-libm-test-out-casinh: Likewise.
	* math/libm-test-casin.inc (casin_test_data): Use AUTO_TESTS_c_c.
	Move tests with finite inputs to auto-libm-test-in.
	* math/libm-test-casinh.inc (casinh_test_data): Likewise.
	* math/Makefile (libm-test-funcs-auto): Add casin and casinh.
	(libm-test-funcs-noauto): Remove casin and casinh.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2017-02-17 18:14:02 +00:00
Joseph Myers
6b8303a383 Move tests of cacos, cacosh to auto-libm-test-*.
This patch moves tests of cacos and cacosh with finite inputs to using
the auto-libm-test machinery.  Each of auto-libm-test-out-cacos and
auto-libm-test-out-cacosh takes about 80 minutes to generate on my
system because of MPC slowness on special cases that appear in the
tests (with MPC 1.0.3; I don't know to what extent current MPC master
might speed it up).

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add tests of cacos and cacosh.
	* math/auto-libm-test-out-cacos: New generated file.
	* math/auto-libm-test-out-cacosh: Likewise.
	* math/libm-test-cacos.inc (cacos_test_data): Use AUTO_TESTS_c_c.
	Move tests with finite inputs to auto-libm-test-in.
	* math/libm-test-cacosh.inc (cacosh_test_data): Likewise.
	* math/Makefile (libm-test-funcs-auto): Add cacos and cacosh.
	(libm-test-funcs-noauto): Remove cacos and cacosh.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2017-02-17 17:44:23 +00:00
Joseph Myers
f7a51347a4 Revert header inclusion changes that break math/ testing on x86_64.
Revert:
	2017-02-16  Zack Weinberg  <zackw@panix.com>

	* sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h.
	Don't include init-arch.h.
	* sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h.
	Don't include init-arch.h.
2017-02-17 17:08:17 +00:00
Zack Weinberg
ceaa98897c Add missing header files throughout the testsuite.
* crypt/md5.h: Test _LIBC with #if defined, not #if.
	* dirent/opendir-tst1.c: Include sys/stat.h.
	* dirent/tst-fdopendir.c: Include sys/stat.h.
	* dirent/tst-fdopendir2.c: Include stdlib.h.
	* dirent/tst-scandir.c: Include stdbool.h.
	* elf/tst-auditmod1.c: Include link.h and stddef.h.
	* elf/tst-tls15.c: Include stdlib.h.
	* elf/tst-tls16.c: Include stdlib.h.
	* elf/tst-tls17.c: Include stdlib.h.
	* elf/tst-tls18.c: Include stdlib.h.
	* iconv/tst-iconv6.c: Include endian.h.
	* iconvdata/bug-iconv11.c: Include limits.h.
	* io/test-utime.c: Include stdint.h.
	* io/tst-faccessat.c: Include sys/stat.h.
	* io/tst-fchmodat.c: Include sys/stat.h.
	* io/tst-fchownat.c: Include sys/stat.h.
	* io/tst-fstatat.c: Include sys/stat.h.
	* io/tst-futimesat.c: Include sys/stat.h.
	* io/tst-linkat.c: Include sys/stat.h.
	* io/tst-mkdirat.c: Include sys/stat.h and stdbool.h.
	* io/tst-mkfifoat.c: Include sys/stat.h and stdbool.h.
	* io/tst-mknodat.c: Include sys/stat.h and stdbool.h.
	* io/tst-openat.c: Include stdbool.h.
	* io/tst-readlinkat.c: Include sys/stat.h.
	* io/tst-renameat.c: Include sys/stat.h.
	* io/tst-symlinkat.c: Include sys/stat.h.
	* io/tst-unlinkat.c: Include stdbool.h.
	* libio/bug-memstream1.c: Include stdlib.h.
	* libio/bug-wmemstream1.c: Include stdlib.h.
	* libio/tst-fwrite-error.c: Include stdlib.h.
	* libio/tst-memstream1.c: Include stdlib.h.
	* libio/tst-memstream2.c: Include stdlib.h.
	* libio/tst-memstream3.c: Include stdlib.h.
	* malloc/tst-interpose-aux.c: Include stdint.h.
	* misc/tst-preadvwritev-common.c: Include sys/stat.h.
	* nptl/tst-basic7.c: Include limits.h.
	* nptl/tst-cancel25.c: Include pthread.h, not pthreadP.h.
	* nptl/tst-cancel4.c: Include stddef.h, limits.h, and sys/stat.h.
	* nptl/tst-cancel4_1.c: Include stddef.h.
	* nptl/tst-cancel4_2.c: Include stddef.h.
	* nptl/tst-cond16.c: Include limits.h.
	Use sysconf(_SC_PAGESIZE) instead of __getpagesize.
	* nptl/tst-cond18.c: Include limits.h.
	Use sysconf(_SC_PAGESIZE) instead of __getpagesize.
	* nptl/tst-cond4.c: Include stdint.h.
	* nptl/tst-cond6.c: Include stdint.h.
	* nptl/tst-stack2.c: Include limits.h.
	* nptl/tst-stackguard1.c: Include stddef.h.
	* nptl/tst-tls4.c: Include stdint.h. Don't include tls.h.
	* nptl/tst-tls4moda.c: Include stddef.h.
	Don't include stdio.h, unistd.h, or tls.h.
	* nptl/tst-tls4modb.c: Include stddef.h.
	Don't include stdio.h, unistd.h, or tls.h.
	* nptl/tst-tls5.h: Include stddef.h. Don't include stdlib.h or tls.h.
	* posix/tst-getaddrinfo2.c: Include stdio.h.
	* posix/tst-getaddrinfo5.c: Include stdio.h.
	* posix/tst-pathconf.c: Include sys/stat.h.
	* posix/tst-posix_fadvise-common.c: Include stdint.h.
	* posix/tst-preadwrite-common.c: Include sys/stat.h.
	* posix/tst-regex.c: Include stdint.h.
	Don't include spawn.h or spawn_int.h.
	* posix/tst-regexloc.c: Don't include spawn.h or spawn_int.h.
	* posix/tst-vfork3.c: Include sys/stat.h.
	* resolv/tst-bug18665-tcp.c: Include stdlib.h.
	* resolv/tst-res_hconf_reorder.c: Include stdlib.h.
	* resolv/tst-resolv-search.c: Include stdlib.h.
	* stdio-common/tst-fmemopen2.c: Include stdint.h.
	* stdio-common/tst-vfprintf-width-prec.c: Include stdlib.h.
	* stdlib/test-canon.c: Include sys/stat.h.
	* stdlib/tst-tls-atexit.c: Include stdbool.h.
	* string/test-memchr.c: Include stdint.h.
	* string/tst-cmp.c: Include stdint.h.
	* sysdeps/pthread/tst-timer.c: Include stdint.h.
	* sysdeps/unix/sysv/linux/tst-sync_file_range.c: Include stdint.h.
	* sysdeps/wordsize-64/tst-writev.c: Include limits.h and stdint.h.
	* sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h.
	Don't include init-arch.h.
	* sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h.
	Don't include init-arch.h.
	* sysdeps/x86_64/tst-auditmod10b.c: Include link.h and stddef.h.
	* sysdeps/x86_64/tst-auditmod3b.c: Include link.h and stddef.h.
	* sysdeps/x86_64/tst-auditmod4b.c: Include link.h and stddef.h.
	* sysdeps/x86_64/tst-auditmod5b.c: Include link.h and stddef.h.
	* sysdeps/x86_64/tst-auditmod6b.c: Include link.h and stddef.h.
	* sysdeps/x86_64/tst-auditmod6c.c: Include link.h and stddef.h.
	* sysdeps/x86_64/tst-auditmod7b.c: Include link.h and stddef.h.
	* time/clocktest.c: Include stdint.h.
	* time/tst-posixtz.c: Include stdint.h.
	* timezone/tst-timezone.c: Include stdint.h.
2017-02-16 17:33:18 -05:00