Ulrich Drepper
8ffcee4a04
Fix limit detection in x86-64 SSE2 strncasecmp.
2010-09-20 14:02:23 -07:00
Ulrich Drepper
9ea3de11f1
Move slow Atom code to separate section.
2010-08-26 22:17:03 -07:00
H.J. Lu
623aac7f84
Unroll x86-64 strlen
2010-08-26 22:09:34 -07:00
H.J. Lu
b416a90085
Missing comma in last commit.
2010-08-26 13:18:46 -07:00
Roland McGrath
8b2b771538
Clean up warnings in new x86_64/multiarch code.
2010-08-25 12:13:08 -07:00
H.J. Lu
e73015f2d6
Unroll 32bit SSE strlen and handle slow bsf
2010-08-25 10:07:37 -07:00
Ulrich Drepper
1cdfe7242f
Add missing copyright year updated and pretty printing.
2010-08-24 11:42:19 -07:00
Richard Henderson
73f27d5e72
Clean up SSE variable shifts
2010-08-24 11:35:01 -07:00
Ulrich Drepper
9da4bb316f
Fix two typos in x86-64 SSE4.2 strncasecmp implementation.
2010-08-19 09:20:44 -07:00
Ulrich Drepper
1feccb6caf
Fix fourth parameter of SSE4.2 strcmp for x86-64.
2010-08-15 20:46:09 -07:00
Ulrich Drepper
e9f82e0d1d
Add optimized strncasecmp versions for x86-64.
2010-08-14 22:04:01 -07:00
Ulrich Drepper
ca6bb004eb
Fix x86-64 build without multiarch.
2010-08-14 14:56:32 -07:00
Ulrich Drepper
73507d3ae0
Add support for SSSE3 and SSE4.2 versions of strcasecmp on x86-64.
2010-07-31 21:41:09 -07:00
Ulrich Drepper
66f6765a47
Pretty printing x86-64 SSE4.3 strcmp.
2010-07-30 12:54:37 -07:00
Ulrich Drepper
fe36dd025e
Fix tolower operation in strcasestr.
2010-07-30 00:09:07 -07:00
Ulrich Drepper
880113d91e
Avoid compiling unneeded file in ld.so.
2010-07-27 21:12:59 -07:00
Ulrich Drepper
8e96b93aa7
Speed up x86-64 strcasestr a bit moew.
...
Using the new SSE4.2 instructions is cool but not really the fastest.
Some older SSE instructions can do the trick faster.
2010-07-24 08:34:44 -07:00
Andreas Schwab
f6a31e0eb6
Add strcasestr-nonascii to i386 build
2010-07-21 07:26:18 -07:00
Ulrich Drepper
d02dc4ba08
Fix non-ASCII case of SSE4.2 strcasstr.
2010-07-16 16:00:22 -07:00
Ulrich Drepper
cc9f2e47a0
Speed up SSE4.2 strcasestr by avoiding indirect function call.
2010-07-16 15:37:38 -07:00
H.J. Lu
6fb8cbcb58
Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7
...
This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and
Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and
up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to
4X on Core 2 and up to 2X on Core i7.
2010-06-30 08:26:11 -07:00
H.J. Lu
3c88fe1e3a
Incorrect x86 CPU family and model check.
2010-05-27 11:14:18 -07:00
H.J. Lu
df87f54923
Check DATA_CACHE_SIZE_HALF
2010-04-14 22:18:27 -07:00
H.J. Lu
dd37cd1a12
Optimie x86-64 SSE4 memcmp for unaligned data.
2010-04-14 17:53:44 -07:00
H.J. Lu
404a6e3201
x86-64 SSE4 optimized memcmp
...
This is 64bit SSE4 optimized memcmp. It improves memcmp by upto 3X
on Intel Core i7.
2010-04-14 00:12:53 -07:00
Ulrich Drepper
bbbdd77809
Update x86-64 cpu multiarch selection header.
2010-04-13 19:17:10 -07:00
Ulrich Drepper
22f4f44b67
Fix concurrent handling of __cpu_features.
2010-04-04 00:25:46 -07:00
H.J. Lu
7d9335ecd7
Don't define __strpbrk_sse42 in static library
2010-03-24 12:16:24 -07:00
H.J. Lu
5a7af22fbb
Unroll the loop x86-64 SSE4.2 strlen.
2010-01-13 07:51:48 -08:00
H.J. Lu
3af48cbdfa
Optimize 32bit memset/memcpy with SSE2/SSSE3.
2010-01-12 11:22:03 -08:00
H.J. Lu
2510d01ddb
Define bit_SSE2 and index_SSE2.
2009-12-13 15:23:02 -08:00
H.J. Lu
51ddd2c01e
Define bit_XXX and index_XXX.
...
This patch defines bit_XXX and index_XXX and use them to check processor
feature in assembly code. It can prevent typos in processor feature
check.
2009-12-13 09:47:02 -08:00
Ulrich Drepper
823bc6da65
Fix whitespaces.
2009-10-22 22:50:00 -07:00
H.J. Lu
001659f4d5
Implement SSE4.2 optimized strchr and strrchr.
2009-10-22 22:47:12 -07:00
Roland McGrath
b0f3a2e43f
Clean up unnecessary libc_hidden_builtin_def fiddling in x86 multiarch definitions.
2009-10-06 20:01:23 -07:00
Roland McGrath
9d6982d5d2
Clean up x86 multiarch HAS_FOO macros.
2009-10-06 19:59:03 -07:00
Jakub Jelinek
22bb992d51
Fix strstr/strcasestr/fma/fmaf on x86_64.
2009-09-02 19:43:04 -07:00
H.J. Lu
5a4eb7282e
Remove ENABLE_SSSE3_ON_ATOM.
...
It turns that SSSE3 isn't slow on Atom. The problem is bsf. This patch
removes ENABLE_SSSE3_ON_ATOM.
2009-08-28 14:54:46 -07:00
Ulrich Drepper
8e436522e1
Move SSE4.2 functions together.
2009-08-08 09:38:32 -07:00
Ulrich Drepper
0fda545d5f
Add SSSE3-optimized implementation of str{,n}cmp for x86-64.
2009-08-07 22:51:02 -07:00
Ulrich Drepper
57b378ac89
Avoid warning through fake initialization.
2009-08-07 16:19:54 -07:00
H.J. Lu
02cea47161
Add x86 32-bit SSE4.2 string functions.
...
This patch adds 32bit SSE4.2 string functions. It uses -16L instead of
0xfffffffffffffff0L, which works for both 32bit and 64bit long. Tested
on 32bit Core i7 and Core 2.
2009-08-04 12:13:43 -07:00
H.J. Lu
6f6f1215f6
Support multiarch for i686.
...
This patch adds multiarch support when configured for i686. I modified
some x86-64 functions to support 32bit. I will contribute 32bit SSE string
and memory functions later.
2009-07-31 11:53:35 -07:00
Ulrich Drepper
78c4ef475d
Add support for x86-64 fma instruction.
...
Use it to implement fma and fmaf, if possible.
2009-07-29 15:26:06 -07:00
Ulrich Drepper
9a1d2d4555
Prepare use if IFUNC functions outside libc.so.
...
We use a callback function into libc.so to get access to the data
structure with the information and have special versions of the test
macros which automatically use this function.
2009-07-29 15:22:28 -07:00
Ulrich Drepper
e83c1a8a72
Refine testing for xmm/ymm register use in x86-64 ld.so.
...
The test now takes the callgraph into account. Only code called
during runtime relocation is affected by the limitation. We now
determine the affected object files as closely as possible from
the outside. This allowed to remove some the specializations
for some of the string functions as they are only used in other
code paths.
2009-07-27 13:40:27 -07:00
Ulrich Drepper
16d2ea4c82
Make sure no code in ld.so uses xmm/ymm registers on x86-64.
...
This patch introduces a test to make sure no function modifies the
xmm/ymm registers. With the exception of the auditing functions.
The test is probably too pessimistic. All code linked into ld.so
is checked. Perhaps at some point the callgraph starting from
_dl_fixup and _dl_profile_fixup is checked and we can start using
faster SSE-using functions in parts of ld.so.
2009-07-26 16:10:00 -07:00
H.J. Lu
7956a3d27c
Add SSE2 support to str{,n}cmp for x86-64.
2009-07-26 13:32:28 -07:00
H.J. Lu
4e5b5821bf
Some some optimizations for x86-64 strcmp.
2009-07-25 19:15:14 -07:00
Ulrich Drepper
29e92fa5cd
Optimize x86-64 SSE4.2 strcmp.
...
The file contained some code which was never used. Don't compile it
in.
2009-07-25 12:02:47 -07:00