758 Commits

Author SHA1 Message Date
Andreas Schwab
5bb43a4319 Make __ffs hidden 2013-09-20 21:25:31 +02:00
Ondřej Bílka
5905e7b3e2 Faster strchr implementation. 2013-09-11 17:07:38 +02:00
Joseph Myers
ffa3cd7f1a Fix lgammaf spurious underflow (bug 15427). 2013-09-03 15:32:54 +00:00
Ondřej Bílka
8f02859f17 Add unaligned strcmp. 2013-09-03 16:27:10 +02:00
Joseph Myers
b7835e3223 Fix spurious jnf underflows (bug 14155). 2013-09-02 14:51:24 +00:00
Ondřej Bílka
382466e04e Fix typos. 2013-08-30 18:08:59 +02:00
Ondřej Bílka
0186c6e97e Fix rawmemchr regression on bulldozer. 2013-08-30 10:14:37 +02:00
Ondřej Bílka
c0c3f78afb Fix typos. 2013-08-21 19:48:48 +02:00
Jeroen Albers
72c90ed01f Update x86 and x86_64 ulps on AMD FX-8350 with GCC 4.8.1. 2013-07-05 12:58:20 +00:00
Markus Trippelsdorf
5314ed1afd Update x86_64 ULPs. 2013-07-02 22:01:13 +00:00
Joseph Myers
67338156ea Regenerate x86 and x86_64 ulps. 2013-07-02 20:01:15 +00:00
Liubov Dmitrieva
6308fd9a46 Skip SSE4.2 versions on Intel Silvermont
SSE2/SSSE3 versions are faster than SSE4.2 versions on Intel Silvermont.
2013-06-28 15:31:40 -07:00
Liubov Dmitrieva
11b8a0e1d7 Fix buffer overrun in x86_64 memcmp-ssse3.S 2013-06-26 12:31:51 -07:00
Liubov Dmitrieva
d086fc7ba0 Set fast unaligned load flag for new Intel microarchitecture
I have a small patch for the new Intel Silvermont machines.

http://newsroom.intel.com/community/intel_newsroom/blog/2013/05/06/intel-launches-low-power-high-performance-silvermont-microarchitecture

I checked this on my machine and see that the strcpy, ... unaligned
versions are faster than the ssse3 versions.
2013-06-14 20:46:15 +02:00
Siddhesh Poyarekar
747ef469ff Add rtld-memset.S for x86_64
Resolves: BZ #15627

Add an assembler version of rtld-memset to avoid using SSE registers.
2013-06-15 00:09:26 +05:30
Joseph Myers
9c84384cc1 Remove trailing whitespace. 2013-06-05 20:44:03 +00:00
Siddhesh Poyarekar
b937534868 Avoid crashing in LD_DEBUG when program name is unavailable
Resolves: #15465

The program name may be unavailable if the user application tampers
with argc and argv[].  Some parts of the dynamic linker cater for
this while others don't, so this patch consolidates the check and
fallback into a single macro and updates all users.
2013-05-29 21:34:12 +05:30
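
The consolidation described in this commit can be sketched in C as follows. This is a minimal illustration, not the code from the patch: the macro name `PROGNAME_OR_UNKNOWN`, the helper `debug_progname`, and the fallback string are all hypothetical and chosen only for the example.

```c
#include <stddef.h>

/* Hypothetical macro (name and fallback text are assumptions).  It keeps
   the "is argv[0] usable?" check and the fallback in one place, so every
   caller behaves the same way.  */
#define PROGNAME_OR_UNKNOWN(argv) \
  ((argv) != NULL && (argv)[0] != NULL ? (argv)[0] : "<program name unknown>")

/* Hypothetical caller: any code that prints the program name goes
   through the macro instead of dereferencing argv[0] directly.  */
static const char *
debug_progname (char **argv)
{
  return PROGNAME_OR_UNKNOWN (argv);
}
```

With a single macro, a later change to the fallback behaviour touches one definition rather than every place that reads the program name.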
Joseph Myers
dd4259b9f7 Test drem and pow10 in libm-test.inc. 2013-05-24 20:33:14 +00:00
Joseph Myers
4f8dfe270b Use same tests for isfinite/finite, lgamma/gamma. 2013-05-24 19:21:22 +00:00
Joseph Myers
b50a71810b Don't include expected results in libm-test test names. 2013-05-22 11:49:36 +00:00
Ondrej Bilka
b2b671b677 Faster memset on x64
This implementation speeds up memset in several ways. The first is avoiding
an expensive computed jump. The second is using the fact that the arguments
of memset are aligned to 8 bytes most of the time.

Benchmark results on:
kam.mff.cuni.cz/~ondra/benchmark_string/memset_profile_result27_04_13.tar.bz2
2013-05-20 08:32:45 +02:00
Ondrej Bilka
2d48b41c8f Faster memcpy on x64.
We add a new memcpy version that uses unaligned loads, which are fast
on modern processors. This allows a second improvement: avoiding the
computed jump, which is a relatively expensive operation.

Tests available here:
http://kam.mff.cuni.cz/~ondra/memcpy_profile_result27_04_13.tar.bz2
2013-05-20 08:24:41 +02:00
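
One common way unaligned loads remove the need for a computed jump is to cover a whole range of sizes with two overlapping moves. The C sketch below illustrates that general idea only; it is an assumption about the technique and is not taken from the assembly in this commit. The function name `copy_8_to_16` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes, where 8 <= n <= 16, using two 8-byte moves that may
   overlap in the middle.  No per-size branch or jump table is needed.
   The fixed-size memcpy calls are the portable C spelling of an
   unaligned 8-byte load or store.  */
static void
copy_8_to_16 (void *dst, const void *src, size_t n)
{
  uint64_t head, tail;

  assert (n >= 8 && n <= 16);
  memcpy (&head, src, 8);                          /* first 8 bytes */
  memcpy (&tail, (const char *) src + n - 8, 8);   /* last 8 bytes */
  memcpy (dst, &head, 8);
  memcpy ((char *) dst + n - 8, &tail, 8);
}
```

Both loads happen before either store, so the sketch also behaves correctly when the source and destination ranges overlap.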
Joseph Myers
db62a90753 Handle sincos with generic libm-test logic. 2013-05-19 14:45:41 +00:00
Ryan S. Arnold
e054f49430 Add #include <stdint.h> for uint[32|64]_t usage (except installed headers). 2013-05-16 11:32:54 -05:00
Peter Collingbourne
1deff3dca1 Use movq for 64-bit operations
The EXTRACT_WORDS64 and INSERT_WORDS64 macros use movd for a 64-bit
operation.  Somehow gcc manages to turn this into movq, but LLVM won't.

2013-05-15  Peter Collingbourne  <pcc@google.com>

	* sysdeps/x86_64/fpu/math_private.h (MOVQ): New macro.
	(EXTRACT_WORDS64): Use where appropriate.
	(INSERT_WORDS64) Likewise.
2013-05-15 20:33:45 +02:00
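
A minimal sketch of the idea behind this change, assuming a GNU-compatible compiler: spell the 64-bit move between an SSE register and a general-purpose register as `movq`. The function names `extract_words64` and `insert_words64` are hypothetical stand-ins for the macros named in the ChangeLog, and the `memcpy` fallback for other targets is an addition for portability, not part of the commit.

```c
#include <stdint.h>
#include <string.h>

/* Return the 64-bit representation of a double.  */
static inline uint64_t
extract_words64 (double d)
{
  uint64_t i;
#if defined __x86_64__ && defined __GNUC__
  /* "x" places d in an SSE register, "=r" puts the result in a
     general-purpose register; movq is the 64-bit form of the move.  */
  __asm__ ("movq %1, %0" : "=r" (i) : "x" (d));
#else
  memcpy (&i, &d, sizeof i);   /* portable fallback (assumption) */
#endif
  return i;
}

/* Build a double from its 64-bit representation.  */
static inline double
insert_words64 (uint64_t i)
{
  double d;
#if defined __x86_64__ && defined __GNUC__
  __asm__ ("movq %1, %0" : "=x" (d) : "r" (i));
#else
  memcpy (&d, &i, sizeof d);   /* portable fallback (assumption) */
#endif
  return d;
}
```

For example, `extract_words64 (1.0)` yields `0x3ff0000000000000`, the IEEE 754 binary64 encoding of 1.0.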
Peter Collingbourne
791f3ba0db Use x constraints for operands to vfmaddss and vfmaddsd
While these instructions accept memory operands, only one operand
may be a memory operand.  Giving two operands xm constraints gives
the compiler the option of using memory for both operands, which
would result in invalid assembly code.  Using x for all operands is
more appropriate, as most x86_64 calling conventions will pass the
arguments in registers anyway.

2013-05-15  Peter Collingbourne  <pcc@google.com>

	* sysdeps/x86_64/fpu/multiarch/s_fma.c (__fma_fma4): Replace xm
	constraints with x constraints.
	* sysdeps/x86_64/fpu/multiarch/s_fmaf.c (__fmaf_fma4): Likewise.
2013-05-15 20:31:53 +02:00
Joseph Myers
d8cd06db62 Improve tgamma accuracy (bugs 2546, 2560, 5159, 15426). 2013-05-08 11:58:18 +00:00
Joseph Myers
10de07f5fd Fix catan, catanh spurious underflows (bug 15423). 2013-05-01 10:07:00 +00:00
Joseph Myers
caf84319c1 Fix catan, catanh inaccuracy from atan2 denominators near 0 (bug 15416). 2013-04-30 11:27:35 +00:00
Joseph Myers
5b4217d71f Fix catan, catanh spurious overflows (bug 15409). 2013-04-27 14:57:41 +00:00
Markus Trippelsdorf
1b8359836d Update x86_64 ULPs
2013-04-26  Markus Trippelsdorf  <markus@trippelsdorf.de>

	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2013-04-26 09:30:46 +02:00
Joseph Myers
73709b2611 Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
Joseph Myers
2f38fbfe09 Fix catan, catanh inaccuracy through use of log (bug 15394). 2013-04-24 18:49:13 +00:00
Carlos O'Donell
aba5e333d4 libm-test.inc: Fix tests where cos(PI/2) != 0.
The value of PI is never exactly PI in any floating point representation,
and the value of PI/2 is never exactly PI/2. It is wrong to expect
cos(M_PI_2l) to return 0; instead it will return a non-zero answer because
M_PI_2l doesn't round to exactly PI/2 in the type used.

That is to say that the correct answer is to do the following:
* Take PI or PI/2.
* Round to the floating point representation.
* Take the rounded value and compute an infinite precision cos or sin.
* Use the rounded result of the infinite precision cos or sin as the
  answer to the test.

I used printf to do the type rounding, and Wolfram Alpha to do the
infinite precision cos calculations.

The following changes bring x86-64 and x86 to 1/2 ulp for two tests.
It shows that the x86 cos implementation is quite good, and that
our tests are flawed.

Unfortunately, given that the rounding errors are type dependent, we
need to fix this for each type. No regressions on x86-64 or x86.

---

2013-04-11  Carlos O'Donell  <carlos@redhat.com>

	* math/libm-test.inc (cos_test): Fix PI/2 test.
	(sincos_test): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Regenerate.
	* sysdeps/i386/fpu/libm-test-ulps: Regenerate.
2013-04-11 08:52:18 -04:00
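
The rounding argument in this commit message can be checked with a short C sketch. It is an illustration, not the test code from libm-test.inc: the macro and function names are hypothetical, and the hex literals are the double and float values nearest to PI/2.

```c
#include <math.h>

/* Nearest double and nearest float to PI/2, written as hex literals so
   the rounding step is explicit.  Neither equals the real PI/2.  */
#define HALF_PI_DOUBLE 0x1.921fb54442d18p+0
#define HALF_PI_FLOAT  0x1.921fb6p+0f

/* cos of the rounded double: about 6.1e-17, positive because the
   rounded value lies just below the real PI/2.  */
static double
cos_at_rounded_half_pi (void)
{
  return cos (HALF_PI_DOUBLE);
}

/* cosf of the rounded float: about -4.4e-8, negative because the
   rounded value lies just above the real PI/2.  */
static float
cosf_at_rounded_half_pi (void)
{
  return cosf (HALF_PI_FLOAT);
}
```

The two results differ in both sign and magnitude, which is why the expected value has to be worked out separately for each floating point type.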
Joseph Myers
52ce486045 Fix cacosh inaccuracy and spurious exceptions (bug 15327). 2013-04-02 22:54:00 +00:00
Joseph Myers
ccc8cadf75 Fix casinh inaccuracy for imaginary part < 1.0, real part small (bug 10357). 2013-03-30 13:31:53 +00:00
Joseph Myers
3a7182a14b Fix casinh inaccuracy near i, imaginary part > 1 (bug 15307). 2013-03-27 14:38:44 +00:00
Dmitry V. Levin
2e0fb52187 BZ#11120: fix x86_64/strcmp.S NOT_IN_libc safeguards
Due to a typo repeated several times, this bug hasn't been fixed yet,
despite being marked as resolved in glibc 2.12.

* sysdeps/x86_64/strcmp.S: Replace all occurrences of NOT_IN_lib
with NOT_IN_libc.
2013-03-22 03:16:00 +00:00
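
A small C preprocessor sketch shows why a typo like this can go unnoticed: testing an undefined macro is not an error, it simply evaluates as "not defined". The `TYPO_GUARD_ACTIVE` and `FIXED_GUARD_ACTIVE` names are hypothetical, and the exact form of the guards in strcmp.S is not shown in this log, so the `#ifdef` shape below is an assumption.

```c
/* Stand-in for the definition the build would normally supply
   (assumption for this example).  */
#define NOT_IN_libc 1

/* Misspelled guard: NOT_IN_lib is never defined, so this branch is
   silently skipped and the compiler reports nothing.  */
#ifdef NOT_IN_lib
# define TYPO_GUARD_ACTIVE 1
#else
# define TYPO_GUARD_ACTIVE 0
#endif

/* Correctly spelled guard: takes effect as intended.  */
#ifdef NOT_IN_libc
# define FIXED_GUARD_ACTIVE 1
#else
# define FIXED_GUARD_ACTIVE 0
#endif
```

Here `TYPO_GUARD_ACTIVE` expands to 0 and `FIXED_GUARD_ACTIVE` to 1, even though both guards look almost identical in the source.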
Joseph Myers
0a1b2ae6f6 Fix casinh inaccuracy for argument with imaginary part 1 (bug 15287). 2013-03-21 10:27:10 +00:00
Joseph Myers
bef0b50749 Move system-specific settings out of toplevel configure.in and config.make.in. 2013-03-20 22:37:06 +00:00
Ondrej Bilka
37bb363f03 Faster strlen on x64. 2013-03-18 07:39:12 +01:00
Joseph Myers
d2f9799e7c Fix y1l spurious overflows for ldbl-96 (bug 15283). 2013-03-16 17:51:48 +00:00
Joseph Myers
06d5adfbda Regenerate sysdeps/x86_64/preconfigure. 2013-03-15 01:18:32 +00:00
Ondrej Bilka
80f844c9d8 Remove Prefer_SSE_for_memop on x64 2013-03-11 15:39:08 +01:00
Ondrej Bilka
87bd9bc4bd Revert " * sysdeps/x86_64/strlen.S: Replace with new SSE2 based implementation"
This reverts commit b79188d717.
2013-03-06 22:27:18 +01:00
Ondrej Bilka
b79188d717 * sysdeps/x86_64/strlen.S: Replace with new SSE2 based implementation
which is faster on all x86_64 architectures.
	Tested on AMD, Intel Nehalem, SNB, IVB.
2013-03-06 21:54:01 +01:00
Joseph Myers
2969121014 Remove bounded-pointers handling from x86_64 assembly sources. 2013-02-17 21:57:26 +00:00
Siddhesh Poyarekar
d6752ccd69 New __sqr function as a faster special case of __mul 2013-02-14 10:31:09 +05:30
Roland McGrath
f1d70dad53 Remove lots of inline keywords. 2013-02-07 14:44:18 -08:00
Joseph Myers
8cf28c5ebe Fix casinh spurious underflows away from [-i,i] (bug 15062). 2013-01-31 22:55:29 +00:00
Joseph Myers
728d7b43fc Fix cacos real-part inaccuracy for result real part near 0 (bug 15023). 2013-01-17 20:25:51 +00:00
H.J. Lu
22676eafed Implement x86 SIZE32/SIZE64 relocations 2013-01-16 20:31:03 -08:00
Joseph Myers
a9708fed77 Fix casinh, casin overflow (bug 14996). 2013-01-07 14:59:53 +00:00
H.J. Lu
afec409af9 Change __x86_64 prefix in cache size to __x86 2013-01-05 16:00:38 -08:00
Joseph Myers
cdc1c96fba Fix casinh, casin inaccuracy from cancellation (bug 14994). 2013-01-04 13:25:17 +00:00
H.J. Lu
5d7dd1ca84 Add HAS_RTM 2013-01-03 09:38:20 -08:00
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Siddhesh Poyarekar
b76eb5f076 Move mpone out to a global const
Code cleanup.
2012-12-27 20:43:24 +05:30
Joseph Myers
1bead169c3 Fix powl inaccuracy for x86_64 and x86 (bug 13881). 2012-11-28 13:40:54 +00:00
H.J. Lu
c515fb5148 Cast to __intptr_t before casting pointer to int64 2012-11-26 16:45:36 -08:00
Pino Toscano
94558d30b1 test-multiarch: terminate printf output with newline 2012-11-22 11:34:03 +01:00
David S. Miller
6d33cc9d9b Fix spurious underflows in ldbl-128 atan implementation.
With help from Joseph Myers.
	* sysdeps/ieee754/ldbl-128/s_atanl.c (__atanl): Handle tiny and
	very large arguments properly.
	* math/libm-test.inc (atan_test): New tests.
	(atan2_test): New tests.
	* sysdeps/sparc/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2012-11-19 15:31:24 -08:00
David S. Miller
05b227bdae Correct tinyness handling in long-double and float y0/y1.
With help from Joseph Myers.
	* sysdeps/ieee754/flt-32/e_j0f.c (__ieee754_y0f): Adjust tinyness
	cutoff to 2**-13.
	* sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_y1f): Adjust tinyness
	cutoff to 2**-25.
	* sysdeps/ieee754/ldbl-128/e_j0l.c (U0): New constant.
	(__ieee754_y0l): Avoid arithmetic underflow when 'x' is very
	small.
	* sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_y1l): Likewise.
	* math/libm-test.inc (y0_test): New tests.
	(y1_test): New tests.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.
	* sysdeps/sparc/fpu/libm-test-ulps: Update.
2012-11-18 12:33:53 -08:00
H.J. Lu
eb48db7e89 Also run tst-xmmymm.sh on i386 ld.so 2012-11-07 13:50:08 -08:00
Joseph Myers
60e235ee2a Fix spurious underflows from pow with results close to 1 (bug 14811). 2012-11-07 13:03:31 +00:00
Joseph Myers
5b5b04d628 Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796). 2012-11-03 19:48:53 +00:00
H.J. Lu
f62c8abcfb Compile x86 rtld with -mno-sse -mno-mmx 2012-11-02 18:43:27 -07:00
H.J. Lu
954ef0d98d Use sysdeps/x86/tininess.h for i386 and x86_64 2012-10-30 20:38:31 -07:00
Joseph Myers
2a27fd6dae Fix strtod handling of underflow (bug 14047). 2012-10-30 13:51:27 +00:00
H.J. Lu
ac49ecaf9d Add x86-64 __libc_ifunc_impl_list 2012-10-11 16:41:12 -07:00
H.J. Lu
9a387d1f78 Use IFUNC memmove/memset in x86-64 bcopy/bzero
Also add separate tests for bcopy and bzero.
2012-10-11 13:58:16 -07:00
Roland McGrath
b8493de0ec Add missing magic to GLIBC_PROVIDES. 2012-10-09 15:41:30 -07:00
H.J. Lu
0569936773 Define HAS_FMA with bit_FMA_Usable 2012-10-02 05:05:17 -07:00
H.J. Lu
9bac1d8624 Define VERSYMIDX/VALIDX/ADDRIDX in ldsodefs.h 2012-09-28 11:30:57 -07:00
H.J. Lu
31ed415328 Don't define x86-64 __strncmp_ssse3 in libc.a 2012-09-27 07:43:03 -07:00
Markus Trippelsdorf
43c4edba7e Update x86-64 ULPs 2012-09-26 12:46:51 +02:00
Joseph Myers
d032e0d29b Fix inaccuracy of clog, clog10 near |z| = 1 (bug 13629). 2012-09-25 19:43:49 +00:00
Liubov Dmitrieva
22bf5c1793 Add optimized sincosf for SSE2 for x86 and x86-64 2012-09-25 20:47:20 +02:00
Dmitry V. Levin
57c69bef13 Set "fail on error" mode directly in testsuite shell scripts 2012-09-25 02:48:31 +00:00
Dmitry V. Levin
9a9028b1fe Add copyright notices to testsuite shell scripts 2012-09-25 02:48:13 +00:00
Liubov Dmitrieva
80ccd52c95 Fix x86 SSE cosf, sinf issues
* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: Fix
	unwind info if defined PIC. Fix special cases description.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: Likewise.

	* sysdeps/x86_64/fpu/s_sinf.S: Fix special cases description, fix
	DP_HI_MASK entry.
	* sysdeps/x86_64/fpu/s_cosf.S: Likewise.
2012-09-10 11:44:49 +02:00
Andreas Jaeger
bcd6c8dc64 Update libm-test-ulps 2012-09-03 15:43:56 +02:00
Liubov Dmitrieva
4ffffbd272 Add optimized sinf and cosf routines for x86 and x86-64
* sysdeps/i386/i686/fpu/multiarch/Makefile (sysdep_routines):
	Add s_sinf-sse2, s_conf-sse2.

	* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: New file.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: New file.
	* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: New file.

	* sysdeps/ieee754/flt-32/s_sinf.c (SINF, SINF_FUNC): Add macros
	for using routine as __sinf_ia32.
	Use macro for function declaration and weak_alias.
	* sysdeps/ieee754/flt-32/s_cosf.c (COSF, COSF_FUNC): Add macros
	for using routine as __cosf_ia32.
	Use macro for function declaration and weak_alias.

	* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Fix Copyright.
	* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Fix Copyright.

	* sysdeps/x86_64/fpu/s_sinf.S: New file.
	* sysdeps/x86_64/fpu/s_cosf.S: New file.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update.

	* math/libm-test.inc (cos_test): Add more test cases.
	(sin_test): Likewise.
	(sincos_test): Likewise.
2012-09-03 15:32:13 +02:00
H.J. Lu
5f30cfec00 Use the first element of GOT for ld.so addresses
[BZ #14538]
	* sysdeps/x86_64/dl-machine.h (elf_machine_dynamic): Use the
	first element of the GOT.
	(elf_machine_load_address): Return the difference between
	the runtime address of _DYNAMIC and elf_machine_dynamic ().
2012-09-02 05:22:24 -07:00
Roland McGrath
7312ca90dc Clean up x86_64/multiarch/strstr-c.c include order. 2012-08-15 11:38:57 -07:00
Roland McGrath
9a0a54864b Clean up x86_64/multiarch/memmove.c include order. 2012-08-15 11:26:02 -07:00
Mike Frysinger
ca98e1710e i386/x86_64: punt HAVE_CPP_ASM_DEBUGINFO
Pretty sure we require recent enough versions of gcc/binutils to make this
check pointless.  I can't find any logs in the last few years where this check
didn't return "yes".

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2012-08-14 21:37:00 -04:00
Markus Trippelsdorf
ba6cba9eec Update x86-64 ULPs
The recent clog and clog10 fixes are causing some failing tests on my
AMD64 CPU.
2012-08-13 21:25:17 +02:00
H.J. Lu
f85fa27058 Avoid DWARF definition DIE on ifunc symbols 2012-08-09 16:04:37 -07:00
Marek Polacek
b67e9372b2 Get rid of ASM_TYPE_DIRECTIVE{,_PREFIX}. 2012-08-02 21:04:29 +02:00
Joseph Myers
d0419dbfbd Improve clog, clog10 handling of values with real or imaginary part slightly above 1 (bug 13629). 2012-07-31 14:21:19 +00:00
Joseph Myers
da865e95bc Improve clog, clog10 handling of values with real or imaginary part 1 (bug 13629). 2012-07-26 11:31:35 +00:00
Joseph Myers
3129cfc6ec Move testsuite audit definitions to sysdeps tst-audit.h files. 2012-07-26 11:29:07 +00:00
Joseph Myers
56e49b714e Move ldsodefs.h audit definitions to sysdeps directories. 2012-07-25 16:03:02 +00:00
Marek Polacek
3b05db33f6 Remove TLS configure checks. 2012-07-17 23:57:43 +02:00
Joseph Myers
cfc82fd8ac Split tls-macros.h into sysdeps directories. 2012-07-17 11:30:58 +00:00
Marek Polacek
7b8e0d49cb Get rid of ASM_GLOBAL_DIRECTIVE. 2012-07-10 14:30:24 +02:00
Joseph Myers
638a572eb0 Fix clog, clog10 spurious underflow exceptions (bug 14337). 2012-07-09 11:06:34 +00:00
Joseph Myers
f17ac40d7c Fix expm1 spurious underflow exceptions (bug 6778). 2012-07-06 11:17:41 +00:00
Joseph Myers
cdfe2c5eb3 Fix csqrt underflow (bugs 14157, 14331). 2012-07-05 11:02:13 +00:00