H.J. Lu
ede0236c86
Treat model numbers 0x4a/0x4d as Silvermont
...
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Treat model numbers 0x4a/0x4d as Intel Silvermont architecture.
2015-01-23 18:08:10 -08:00
Joseph Myers
b168057aaa
Update copyright dates with scripts/update-copyrights.
2015-01-02 16:29:47 +00:00
Sihai Yao
f9281df995
Detect if AVX2 is usable
...
This patch checks and sets bit_AVX2_Usable in __cpu_features.feature.
* sysdeps/x86_64/multiarch/ifunc-defines.sym (COMMON_CPUID_INDEX_7):
New.
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Check and set bit_AVX2_Usable.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX2_Usable): New
macro.
(bit_AVX2): Likewise.
(index_AVX2_Usable): Likewise.
(CPUID_AVX2): Likewise.
(HAS_AVX2): Likewise.
2014-04-17 08:00:21 -07:00
Allan McRae
d4697bc93d
Update copyright notices with scripts/update-copyrights
2014-01-01 22:00:23 +10:00
Liubov Dmitrieva
6308fd9a46
Skip SSE4.2 versions on Intel Silvermont
...
SSE2/SSSE3 versions are faster than SSE4.2 versions on Intel Silvermont.
2013-06-28 15:31:40 -07:00
Liubov Dmitrieva
d086fc7ba0
Set fast unaligned load flag for new Intel microarchitecture
...
I have small patch for new Intel Silvermont machines.
http://newsroom.intel.com/community/intel_newsroom/blog/2013/05/06/intel-launches-low-power-high-performance-silvermont-microarchitecture
I checked this on my machine and see that strcpy, ... unaligned
versions are faster than ssse3 versions.
2013-06-14 20:46:15 +02:00
Ondrej Bilka
80f844c9d8
Remove Prefer_SSE_for_memop on x64
2013-03-11 15:39:08 +01:00
H.J. Lu
5d7dd1ca84
Add HAS_RTM
2013-01-03 09:38:20 -08:00
Joseph Myers
568035b787
Update copyright notices with scripts/update-copyrights.
2013-01-02 19:05:09 +00:00
H.J. Lu
0569936773
Define HAS_FMA with bit_FMA_Usable
2012-10-02 05:05:17 -07:00
Carlos O'Donell
1a0994f535
BZ#14059: Fix AVX and FMA4 detection.
...
Fix AVX and FMA4 detection by following the guidelines
set out by Intel and AMD for detecting these features.
2012-05-17 06:59:28 -07:00
Paul Eggert
59ba27a63a
Replace FSF snail mail address with URLs.
2012-02-09 23:18:22 +00:00
Ulrich Drepper
08cf777f9e
Really fix AVX tests
...
There is no problem with strcmp, it doesn't use the YMM registers.
The math routines might since gcc perhaps generates such code.
Introduce bit_YMM_USBALE and use it in the math routines.
2012-01-26 09:45:54 -05:00
Ulrich Drepper
afc5ed09cb
Reset bit_AVX in __cpu_features is OS support is missing
2012-01-26 07:45:14 -05:00
Ulrich Drepper
c196fed8f0
Fix compilation problems in x86-64 init-arch
2011-10-21 20:47:20 -04:00
Ulrich Drepper
ed72b6545f
Check for FMA4 support and generate appropriate fma functions
2011-10-20 22:43:15 -04:00
Liubov Dmitrieva
99710781cc
Improve 64 bit strcat functions with SSE2/SSSE3
2011-07-19 17:11:54 -04:00
H.J. Lu
0b1cbaaef5
Optimized st{r,p}{,n}cpy for SSE2/SSSE3 on x86-32
2011-06-24 14:15:32 -04:00
H.J. Lu
3d29045b5e
Assume Intel Core i3/i5/i7 processor if AVX is available
2011-06-03 07:01:25 -04:00
Harsha Jagasia
7e4ba49cd3
Enable SSE2 memset for AMD'supcoming Orochi processor.
...
This patch enables SSE2 memset for AMD's upcoming Orochi processor.
This patch also fixes the following bug:
For misaligned blocks larger than > 144 Bytes, memset branches into
the integer code path depending on the value of misalignment even if
the startup code chooses the SSE2 code path upfront, when multiarch
is enabled.
2011-03-04 23:30:08 -05:00
H.J. Lu
13b695749a
Support Intel processor model 6 and model 0x2.
2010-11-12 03:48:52 -05:00
H.J. Lu
ff02d5280b
Use IFUNC on x86-64 memset
2010-11-08 03:41:34 -05:00
H.J. Lu
e73015f2d6
Unroll 32bit SSE strlen and handle slow bsf
2010-08-25 10:07:37 -07:00
H.J. Lu
6fb8cbcb58
Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7
...
This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and
Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and
up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to
4X on Core 2 and up to 2X on Core i7.
2010-06-30 08:26:11 -07:00
H.J. Lu
3c88fe1e3a
Incorrect x86 CPU family and model check.
2010-05-27 11:14:18 -07:00
Ulrich Drepper
22f4f44b67
Fix concurrent handling of __cpu_features.
2010-04-04 00:25:46 -07:00
H.J. Lu
3af48cbdfa
Optimize 32bit memset/memcpy with SSE2/SSSE3.
2010-01-12 11:22:03 -08:00
Roland McGrath
9d6982d5d2
Clean up x86 multiarch HAS_FOO macros.
2009-10-06 19:59:03 -07:00
H.J. Lu
5a4eb7282e
Remove ENABLE_SSSE3_ON_ATOM.
...
It turns that SSSE3 isn't slow on Atom. The problem is bsf. This patch
removes ENABLE_SSSE3_ON_ATOM.
2009-08-28 14:54:46 -07:00
H.J. Lu
6f6f1215f6
Support multiarch for i686.
...
This patch adds multiarch support when configured for i686. I modified
some x86-64 functions to support 32bit. I will contribute 32bit SSE string
and memory functions later.
2009-07-31 11:53:35 -07:00
Ulrich Drepper
9a1d2d4555
Prepare use if IFUNC functions outside libc.so.
...
We use a callback function into libc.so to get access to the data
structure with the information and have special versions of the test
macros which automatically use this function.
2009-07-29 15:22:28 -07:00
Ulrich Drepper
d28797e426
Perform test for Arom x86-64 in central place and handle it.
...
There will be more than one function which, in multiarch mode, wants
to use SSSE3. We should not test in each of them for Atoms with
slow SSSE3. Instead, disable the SSSE3 bit in the startup code for
such machines.
2009-07-23 13:15:17 -07:00
Ulrich Drepper
b38a2e2e64
Fix little checkin problem in last patch.
2009-06-30 04:41:38 -07:00
H.J. Lu
0181291385
Determine and store processor family and model on x86-64.
2009-06-30 04:39:09 -07:00
Ulrich Drepper
963cb6fcb4
Simplify CPUID value handling.
...
SO far Intel and AMD use exactly the same bits meaning the same
things in CPUID index 1. Simplify the code. Should an architecture
come along which doesn't use the same semantics then it must use a
different index value than COMMON_CPUID_INDEX_1.
2009-05-31 17:52:05 -07:00
Ulrich Drepper
425ce2edb9
* config.h.in (USE_MULTIARCH): Define.
...
* configure.in: Handle --enable-multi-arch.
* elf/dl-runtime.c (_dl_fixup): Handle STT_GNU_IFUNC.
(_dl_fixup_profile): Likewise.
* elf/do-lookup.c (dl_lookup_x): Likewise.
* sysdeps/x86_64/dl-machine.h: Handle STT_GNU_IFUNC.
* elf/elf.h (STT_GNU_IFUNC): Define.
* include/libc-symbols.h (libc_ifunc): Define.
* sysdeps/x86_64/cacheinfo.c: If USE_MULTIARCH is defined, use the
framework in init-arch.h to get CPUID values.
* sysdeps/x86_64/multiarch/Makefile: New file.
* sysdeps/x86_64/multiarch/init-arch.c: New file.
* sysdeps/x86_64/multiarch/init-arch.h: New file.
* sysdeps/x86_64/multiarch/sched_cpucount.c: New file.
* config.make.in (experimental-malloc): Define.
* configure.in: Handle --enable-experimental-malloc.
* malloc/Makefile: Handle experimental-malloc flag.
* malloc/malloc.c: Implement PER_THREAD and ATOMIC_FASTBINS features.
* malloc/arena.c: Likewise.
* malloc/hooks.c: Likewise.
* malloc/malloc.h: Define M_ARENA_TEST and M_ARENA_MAX.
2009-03-13 23:53:18 +00:00