Commit Graph

152 Commits

Author SHA1 Message Date
Carlos O'Donell
1a0994f535 BZ#14059: Fix AVX and FMA4 detection.
Fix AVX and FMA4 detection by following the guidelines
set out by Intel and AMD for detecting these features.
2012-05-17 06:59:28 -07:00
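The detection order described in the Intel and AMD manuals can be sketched as follows. This is a minimal illustration, not the code from this commit: `avx_usable` and `fma4_usable` are hypothetical names, and both functions take raw register values (ECX from CPUID leaf 1, ECX from CPUID leaf 0x80000001, and XCR0 as read by XGETBV) as arguments so the bit logic can be read in isolation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as documented in the Intel and AMD manuals.  */
#define CPUID1_ECX_OSXSAVE (1u << 27) /* OS has enabled XSAVE/XGETBV.  */
#define CPUID1_ECX_AVX     (1u << 28) /* CPU implements AVX.  */
#define CPUID_EXT_ECX_FMA4 (1u << 16) /* CPUID 0x80000001: FMA4.  */
#define XCR0_SSE_STATE     (1u << 1)  /* OS saves XMM registers.  */
#define XCR0_AVX_STATE     (1u << 2)  /* OS saves YMM registers.  */

/* AVX is usable only if the CPU reports it AND the OS preserves the
   XMM and YMM state across context switches.  XCR0 may only be read
   (and is only meaningful) when OSXSAVE is set.  */
bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;
    if (!(cpuid1_ecx & CPUID1_ECX_AVX))
        return false;
    uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE;
    return (xcr0 & need) == need;
}

/* FMA4 operates on the YMM state, so it is treated as usable only
   when AVX is usable and the FMA4 feature bit is set.  */
bool fma4_usable(uint32_t cpuid_ext_ecx, bool avx_ok)
{
    return avx_ok && (cpuid_ext_ecx & CPUID_EXT_ECX_FMA4) != 0;
}
```

Checking the CPUID feature bit alone is not enough: a CPU may report AVX while the operating system has not enabled saving of the YMM registers, in which case executing AVX instructions faults.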
Liubov Dmitrieva
d7bb4c428a Add optimized expf for x86
2012-05-14  Liubov Dmitrieva  <liubov.dmitrieva@gmail.com>

	* sysdeps/i386/i686/fpu/multiarch/Makefile: New file.
	* sysdeps/i386/i686/fpu/multiarch/e_expf.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/e_expf-ia32.S: New file.
	* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: New file.
2012-05-14 11:23:56 +02:00
Mike Frysinger
3884932b78 memset: also update copyright years
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2012-04-08 14:20:37 -04:00
Mike Frysinger
1e4920e080 memset: fix define usage for shared libs
The proper define to check "am I in a shared lib" is "SHARED", not "PIC".
The two new memset_chk functions incorrectly depend on "PIC".

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2012-04-07 16:33:50 -04:00
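The distinction matters because position-independent code can also end up in a static archive (for example when a toolchain compiles everything with -fPIC), so "PIC" being defined does not imply a shared-library build. A hedged sketch of the two checks follows; `built_for_shared_lib` and `built_position_independent` are hypothetical helpers, not glibc functions, and "SHARED" is assumed to be defined by the build system only for shared-library objects.

```c
#include <stdbool.h>

/* True only when the build system marked this object as part of a
   shared library (assumption: it passes -DSHARED for those objects).  */
bool built_for_shared_lib(void)
{
#ifdef SHARED
    return true;
#else
    return false;
#endif
}

/* True whenever the object is position independent.  PIC is the
   build-system define; __PIC__ is set by GCC under -fpic/-fPIC.
   This can be true for objects that go into a static archive.  */
bool built_position_independent(void)
{
#if defined PIC || defined __PIC__
    return true;
#else
    return false;
#endif
}
```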
Liubov Dmitrieva
4b43400f6a optimize the following memcpy: sysdeps/i386/i686/multiarch/memcpy-ssse3.S
I've improved the following implementation of memcpy:
"sysdeps/i386/i686/multiarch/memcpy-ssse3.S".

The patch includes some minor style fixes, but the important part is
just using prefetch loops for the case:

DATA_CACHE_SIZE_HALF <= len <  SHARED_CACHE_SIZE_HALF and
src and dst pointers have unequal 16 byte alignments.

This gives a 6% to 50% performance boost on the Atom machine, about
24.73% in geometric mean.
2012-03-30 16:45:27 -04:00
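The case selected for the prefetch loops can be written as a small predicate. The sketch below is illustrative C, not the assembly from the patch: `use_prefetch_loop` is a hypothetical name, and the two cache-size thresholds are passed in as parameters rather than read from the library's internal variables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when the copy falls into the case described above:
   DATA_CACHE_SIZE_HALF <= len < SHARED_CACHE_SIZE_HALF, and src and
   dst sit at different offsets within a 16-byte block.  */
bool use_prefetch_loop(const void *dst, const void *src, size_t len,
                       size_t data_cache_size_half,
                       size_t shared_cache_size_half)
{
    bool size_in_range = len >= data_cache_size_half
                         && len < shared_cache_size_half;
    /* The low four address bits give the offset inside a 16-byte
       block; XOR is nonzero exactly when the two offsets differ.  */
    bool alignments_differ =
        (((uintptr_t)dst ^ (uintptr_t)src) & 15) != 0;
    return size_in_range && alignments_differ;
}
```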
H.J. Lu
eb96ffb07d Move stdio-common/_itoa.h to sysdeps/generic 2012-03-20 16:00:23 -07:00
Joseph Myers
0bab47b6b2 Fix x86 strcasecmp_l (bug 13786). 2012-02-29 22:37:38 +00:00
Paul Eggert
59ba27a63a Replace FSF snail mail address with URLs. 2012-02-09 23:18:22 +00:00
Marek Polacek
622c86f480 Remove __ELF__ conditionals 2012-02-07 00:41:11 +01:00
Joseph Myers
9a1d92541f Consistently use macros for x86 PIC thunks. 2012-02-03 23:22:53 +00:00
Liubov Dmitrieva
c044cf14b0 Fix wrong copying processing for last bytes in x86-32 wcscpy
The copy algorithm for the last bytes is wrong and not thread-safe.
In some particular cases it does a 16-byte load from destination
memory beyond the string end, changes only the part that belongs
to the destination string, and writes the whole 16-byte chunk back
into memory.
I have a test case where the memory beyond the string end contains
malloc/free data, which appears corrupted if free() updates
it between the 16-byte read and the 16-byte write.
2011-12-23 08:50:39 -05:00
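The property the fix restores is that the copy never stores to destination memory past the terminator, so data owned by other code is neither read back nor rewritten. A plain C sketch of that invariant follows; it is not the SSE code from the patch, and `copy_wide_string` is a hypothetical name.

```c
#include <stddef.h>
#include <wchar.h>

/* Copies src, including its terminator, into dst.  Each store
   targets exactly one element of the string, so nothing beyond the
   terminator is written, unlike a 16-byte read-modify-write of the
   final chunk.  */
wchar_t *copy_wide_string(wchar_t *dst, const wchar_t *src)
{
    size_t i = 0;
    do {
        dst[i] = src[i];
    } while (src[i++] != L'\0');
    return dst;
}
```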
Liubov Dmitrieva
2bd779ae3f Fix overrun in strcpy destination buffer in x86-32/SSSE3 version 2011-12-22 14:22:00 -05:00
Ulrich Drepper
1d3e4b618a Optimized wcschr and wcscpy for x86-64 and x86-32 2011-12-17 14:39:23 -05:00
Andreas Schwab
5583a0862c Fix SSSE3/SSE4.2 strcasecmp[_l]/strncasecmp[_l] for non-PIC and -mno-tls-direct-seg-refs 2011-11-16 11:48:10 +01:00
Ulrich Drepper
6abf346582 Add SSE4.2 support for strcasecmp and strncasecmp on x86-32 2011-11-14 18:24:35 -05:00
Ulrich Drepper
76e3966e9e SSSE3 optimized strcasecmp and strncasecmp for x86-32 2011-11-13 09:50:13 -05:00
Ulrich Drepper
e7f4b08ee9 Fix warnings in fallback C code of x86-32 wide memory functions 2011-11-12 00:50:26 -05:00
Ulrich Drepper
fe72eebd67 Remove unnecessary code from x86-32 SSSE3 strncmp 2011-11-08 07:50:20 -05:00
Andreas Schwab
0c92d8a87a Fix some warning nits 2011-10-28 12:02:08 +02:00
Andreas Schwab
b43433460b Move wide char related routines to wcsmbs subdir 2011-10-28 12:01:29 +02:00
Ulrich Drepper
2fa2ae85ca Fix strnlen change 2011-10-23 16:30:40 -04:00
Liubov Dmitrieva
fc2ee42abe Add optimized wcslen and strnlen for x86-32 2011-10-23 15:17:23 -04:00
Michael Zolotukhin
979c70a3b1 Improve x86-32 SSSE3 memcpy 2011-10-23 14:28:26 -04:00
Ulrich Drepper
f17424ed53 Fix WS 2011-10-23 13:35:24 -04:00
Liubov Dmitrieva
95584d3b33 Fix signedness in wcscmp comparison 2011-10-23 13:34:15 -04:00
Ulrich Drepper
79b195b55a No need for boundary case handling in x86-32 __ieee_log 2011-10-15 22:21:53 -04:00
Ulrich Drepper
ba1a0d5938 No need for boundary case handling in x86-32 __ieee_logf 2011-10-15 18:09:12 -04:00
Liubov Dmitrieva
be13f7bff6 Optimized memcmp and wmemcmp for x86-64 and x86-32 2011-10-15 11:10:08 -04:00
Ulrich Drepper
38ad40ceca Optimize x86-32 log 2011-10-14 23:41:47 -04:00
Ulrich Drepper
f9e123204e Fix whitespaces 2011-10-12 11:42:57 -04:00
Liubov Dmitrieva
951fbcec70 Optimized memchr, memrchr, rawmemchr for x86-32 2011-10-12 11:42:04 -04:00
Liubov Dmitrieva
48882a1abe Fix up x86-32 section names for Atom code 2011-09-07 22:28:44 -04:00
Ulrich Drepper
ceaa0c5dc3 Move Atom-optimized code out of the way and together 2011-09-06 21:53:03 -04:00
Ulrich Drepper
b0fc1ff04e Fix whitespaces 2011-09-05 17:12:27 -04:00
Liubov Dmitrieva
693fb94884 Optimized strchr and strrchr with SSE2 on x86-32 2011-09-05 17:11:11 -04:00
Ulrich Drepper
5fc11f0d64 Fix whitespaces 2011-09-05 13:54:51 -04:00
Ulrich Drepper
1b48c53782 Add x86-32 optimized wcscmp 2011-09-05 13:53:27 -04:00
Andreas Schwab
2cae499541 Fix spurious nop at start of __strspn_ia32 2011-08-23 15:53:51 +02:00
Ulrich Drepper
b969a69b2e Fix whitespaces 2011-08-04 15:38:35 -04:00
Liubov Dmitrieva
5fa16e9b01 Improve x86-32 strcat functions with SSE2/SSSE3 2011-08-04 15:33:38 -04:00
Roland McGrath
661607b3dd Quash a warning in strstr-c.c built for static. 2011-07-14 20:47:54 -07:00
H.J. Lu
acb0d739c5 Fix unwind info in 32bit SSE2/SSSE3 strncpy 2011-06-25 01:32:27 -04:00
H.J. Lu
0b1cbaaef5 Optimized st{r,p}{,n}cpy for SSE2/SSSE3 on x86-32 2011-06-24 14:15:32 -04:00
Mike Frysinger
4c559bcdf3 Fix static linking with checking x86/x86-64 memcpy. 2011-04-17 22:20:47 -04:00
Ulrich Drepper
283007197c Undo accidental checkin. 2010-12-14 13:09:28 -05:00
Jakub Jelinek
42acbb92c8 Fix -D_FORTIFY_SOURCE memmove and bcop 2010-12-09 10:38:18 -05:00
H.J. Lu
3a4a2499ec Remove dead code from x86-32 SSSE3 strncmp. 2010-12-01 22:18:31 -05:00
Ulrich Drepper
c0dde15b5d 32bit memset-sse2.S fails with uneven cache size
32bit memset-sse2.S assumes cache size is a multiple of 128 bytes.  If
that isn't true, memset-sse2.S will fail.  For example, a processor can
have 24576 KB L3 cache and 20 cores. That is 2516582 bytes per core. Half
of it is 1258291, which isn't helpful for vector instructions.  This
patch rounds cache sizes to a multiple of 256 bytes and adds "raw" cache
sizes.
2010-11-05 07:57:46 -04:00
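The rounding step can be sketched with the figures from the message. This is a minimal illustration that assumes rounding down by masking the low eight bits; the message does not say whether the patch rounds up or down, and `round_cache_size` is a hypothetical name.

```c
#include <stddef.h>

/* Rounds a raw cache size down to a multiple of 256 bytes by
   clearing the low eight bits (assumption: round down).  */
size_t round_cache_size(size_t raw_bytes)
{
    return raw_bytes & ~(size_t)255;
}
```

With the per-core value from the message, `round_cache_size(2516582)` gives 2516480, and half of that is 1258240, which is a multiple of 128 and therefore safe for code that assumes 128-byte granularity. The unrounded value stays available separately as the "raw" size.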
Jakub Jelinek
5e908464b9 Implement accurate fma. 2010-10-13 22:27:03 -04:00
Jakub Jelinek
9ff8d36f27 Correct implementation of fmaf. 2010-10-11 09:27:05 -04:00