Ondrej Bilka
37bb363f03
Faster strlen on x64.
2013-03-18 07:39:12 +01:00
Ondrej Bilka
80f844c9d8
Remove Prefer_SSE_for_memop on x64
2013-03-11 15:39:08 +01:00
Ondrej Bilka
87bd9bc4bd
Revert " * sysdeps/x86_64/strlen.S: Replace with new SSE2 based implementation"
...
This reverts commit b79188d717.
2013-03-06 22:27:18 +01:00
Ondrej Bilka
b79188d717
* sysdeps/x86_64/strlen.S: Replace with new SSE2 based implementation
...
which is faster on all x86_64 architectures.
Tested on AMD, Intel Nehalem, SNB, IVB.
2013-03-06 21:54:01 +01:00
Carlos O'Donell
1a0994f535
BZ#14059: Fix AVX and FMA4 detection.
...
Fix AVX and FMA4 detection by following the guidelines
set out by Intel and AMD for detecting these features.
2012-05-17 06:59:28 -07:00
Ulrich Drepper
1d3e4b618a
Optimized wcschr and wcscpy for x86-64 and x86-32
2011-12-17 14:39:23 -05:00
Liubov Dmitrieva
ce7dd29f28
Optimized strnlen and wcscmp for x86-64
2011-10-23 14:56:04 -04:00
Liubov Dmitrieva
be13f7bff6
Optimized memcmp and wmemcmp for x86-64 and x86-32
2011-10-15 11:10:08 -04:00
Liubov Dmitrieva
a5f524e479
Add Atom-optimized strchr and strrchr for x86-64
2011-09-05 21:34:03 -04:00
Liubov Dmitrieva
99710781cc
Improve 64 bit strcat functions with SSE2/SSSE3
2011-07-19 17:11:54 -04:00
H.J. Lu
8912479f9e
Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64
2011-06-24 15:14:22 -04:00
H.J. Lu
ff02d5280b
Use IFUNC on x86-64 memset
2010-11-08 03:41:34 -05:00
H.J. Lu
623aac7f84
Unroll x86-64 strlen
2010-08-26 22:09:34 -07:00
Roland McGrath
8b2b771538
Clean up warnings in new x86_64/multiarch code.
2010-08-25 12:13:08 -07:00
Richard Henderson
73f27d5e72
Clean up SSE variable shifts
2010-08-24 11:35:01 -07:00
Ulrich Drepper
e9f82e0d1d
Add optimized strncasecmp versions for x86-64.
2010-08-14 22:04:01 -07:00
Ulrich Drepper
73507d3ae0
Add support for SSSE3 and SSE4.2 versions of strcasecmp on x86-64.
2010-07-31 21:41:09 -07:00
Ulrich Drepper
cc9f2e47a0
Speed up SSE4.2 strcasestr by avoiding indirect function call.
2010-07-16 15:37:38 -07:00
H.J. Lu
6fb8cbcb58
Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7
...
This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and
Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and
up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to
4X on Core 2 and up to 2X on Core i7.
2010-06-30 08:26:11 -07:00
H.J. Lu
404a6e3201
x86-64 SSE4 optimized memcmp
...
This is 64bit SSE4 optimized memcmp. It improves memcmp by up to 3X
on Intel Core i7.
2010-04-14 00:12:53 -07:00
H.J. Lu
001659f4d5
Implement SSE4.2 optimized strchr and strrchr.
2009-10-22 22:47:12 -07:00
Ulrich Drepper
0fda545d5f
Add SSSE3-optimized implementation of str{,n}cmp for x86-64.
2009-08-07 22:51:02 -07:00
H.J. Lu
7956a3d27c
Add SSE2 support to str{,n}cmp for x86-64.
2009-07-26 13:32:28 -07:00
H.J. Lu
2b7a8664fa
SSE4.2 strstr/strcasestr for x86-64.
...
This patch implements SSE4.2 strstr/strcasestr, using the Knuth-Morris-Pratt
string searching algorithm.
2009-07-20 21:06:50 -07:00
H.J. Lu
06e51c8f3d
Add SSE4.2 support for strcspn, strpbrk, and strspn on x86-64.
2009-07-03 02:48:56 -07:00
H.J. Lu
ab6a873fe0
SSSE3 strcpy/stpcpy for x86-64
...
This patch adds SSSE3 strcpy/stpcpy. I got up to 4X speed up on Core 2
and Core i7. I disabled it on Atom since SSSE3 version is slower for
shorter (<64byte) data.
2009-07-02 03:39:03 -07:00
H.J. Lu
772f4e6a1b
Add SSE4.2 support for strcmp and strncmp on x86-64.
2009-06-22 20:38:41 -07:00
Ulrich Drepper
3ab2d57a4d
Optimize x86-64 strlen for SSE4.2.
...
The SSE4.2 implementation is used in the DSO only. The patch also adds
some infrastructure to be used in similar code later on.
2009-06-05 11:32:00 -07:00
Ulrich Drepper
425ce2edb9
* config.h.in (USE_MULTIARCH): Define.
...
* configure.in: Handle --enable-multi-arch.
* elf/dl-runtime.c (_dl_fixup): Handle STT_GNU_IFUNC.
(_dl_fixup_profile): Likewise.
* elf/do-lookup.c (dl_lookup_x): Likewise.
* sysdeps/x86_64/dl-machine.h: Handle STT_GNU_IFUNC.
* elf/elf.h (STT_GNU_IFUNC): Define.
* include/libc-symbols.h (libc_ifunc): Define.
* sysdeps/x86_64/cacheinfo.c: If USE_MULTIARCH is defined, use the
framework in init-arch.h to get CPUID values.
* sysdeps/x86_64/multiarch/Makefile: New file.
* sysdeps/x86_64/multiarch/init-arch.c: New file.
* sysdeps/x86_64/multiarch/init-arch.h: New file.
* sysdeps/x86_64/multiarch/sched_cpucount.c: New file.
* config.make.in (experimental-malloc): Define.
* configure.in: Handle --enable-experimental-malloc.
* malloc/Makefile: Handle experimental-malloc flag.
* malloc/malloc.c: Implement PER_THREAD and ATOMIC_FASTBINS features.
* malloc/arena.c: Likewise.
* malloc/hooks.c: Likewise.
* malloc/malloc.h: Define M_ARENA_TEST and M_ARENA_MAX.
2009-03-13 23:53:18 +00:00