glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-28 05:21:13 +00:00

Author	SHA1	Message	Date
Ulrich Drepper	73507d3ae0	Add support for SSSE3 and SSE4.2 versions of strcasecmp on x86-64.	2010-07-31 21:41:09 -07:00
Ulrich Drepper	66f6765a47	Pretty printing x86-64 SSE4.2 strcmp.	2010-07-30 12:54:37 -07:00
Ulrich Drepper	42e08a5438	Implement optimized strcasecmp for x86-64.	2010-07-30 00:14:04 -07:00
Ulrich Drepper	fe36dd025e	Fix tolower operation in strcasestr.	2010-07-30 00:09:07 -07:00
Ulrich Drepper	880113d91e	Avoid compiling unneeded file in ld.so.	2010-07-27 21:12:59 -07:00
Ulrich Drepper	24fb0f88ed	Add optimized x86-64 implementation of strnlen. While at it, beef up the test suite for strnlen and add performance tests for it, too.	2010-07-26 08:37:08 -07:00
Ulrich Drepper	8e96b93aa7	Speed up x86-64 strcasestr a bit more. Using the new SSE4.2 instructions is cool but not really the fastest. Some older SSE instructions can do the trick faster.	2010-07-24 08:34:44 -07:00
Andreas Schwab	f6a31e0eb6	Add strcasestr-nonascii to i386 build	2010-07-21 07:26:18 -07:00
Ulrich Drepper	d02dc4ba08	Fix non-ASCII case of SSE4.2 strcasestr.	2010-07-16 16:00:22 -07:00
Ulrich Drepper	cc9f2e47a0	Speed up SSE4.2 strcasestr by avoiding indirect function call.	2010-07-16 15:37:38 -07:00
H.J. Lu	6fb8cbcb58	Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7. This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to 4X on Core 2 and up to 2X on Core i7.	2010-06-30 08:26:11 -07:00
H.J. Lu	3c88fe1e3a	Incorrect x86 CPU family and model check.	2010-05-27 11:14:18 -07:00
Ulrich Drepper	94a27fabeb	Whitespace fix.	2010-04-14 22:29:51 -07:00
H.J. Lu	a11ec63713	Add x86-32 FMA support	2010-04-14 22:27:59 -07:00
H.J. Lu	df87f54923	Check DATA_CACHE_SIZE_HALF	2010-04-14 22:18:27 -07:00
H.J. Lu	dd37cd1a12	Optimize x86-64 SSE4 memcmp for unaligned data.	2010-04-14 17:53:44 -07:00
H.J. Lu	404a6e3201	x86-64 SSE4 optimized memcmp. This is 64bit SSE4 optimized memcmp. It improves memcmp by up to 3X on Intel Core i7.	2010-04-14 00:12:53 -07:00
Ulrich Drepper	bbbdd77809	Update x86-64 cpu multiarch selection header.	2010-04-13 19:17:10 -07:00
Ulrich Drepper	22f4f44b67	Fix concurrent handling of __cpu_features.	2010-04-04 00:25:46 -07:00
H.J. Lu	7d9335ecd7	Don't define __strpbrk_sse42 in static library	2010-03-24 12:16:24 -07:00
Richard Guenther	e39acb1f16	Fix R_X86_64_PC32 overflow detection	2010-03-04 19:33:41 -08:00
Ulrich Drepper	4a1297d761	We can use the 64-bit register versions of the double functions.	2010-02-24 20:00:30 -08:00
Andreas Schwab	7eb22e757e	Avoid PLT call to fegetenv on s390	2010-02-09 22:34:17 -08:00
Ulrich Drepper	f69190e74a	Prevent silent errors should x86-64 strncmp be needed outside libc.	2010-01-14 08:09:32 -08:00
H.J. Lu	5a7af22fbb	Unroll the loop x86-64 SSE4.2 strlen.	2010-01-13 07:51:48 -08:00
H.J. Lu	3af48cbdfa	Optimize 32bit memset/memcpy with SSE2/SSSE3.	2010-01-12 11:22:03 -08:00
H.J. Lu	2510d01ddb	Define bit_SSE2 and index_SSE2.	2009-12-13 15:23:02 -08:00
H.J. Lu	51ddd2c01e	Define bit_XXX and index_XXX. This patch defines bit_XXX and index_XXX and uses them to check processor features in assembly code. It can prevent typos in processor feature checks.	2009-12-13 09:47:02 -08:00
Ulrich Drepper	823bc6da65	Fix whitespaces.	2009-10-22 22:50:00 -07:00
H.J. Lu	001659f4d5	Implement SSE4.2 optimized strchr and strrchr.	2009-10-22 22:47:12 -07:00
Roland McGrath	b0f3a2e43f	Clean up unnecessary libc_hidden_builtin_def fiddling in x86 multiarch definitions.	2009-10-06 20:01:23 -07:00
Roland McGrath	9d6982d5d2	Clean up x86 multiarch HAS_FOO macros.	2009-10-06 19:59:03 -07:00
Roland McGrath	7967983fd4	configure tweaks, support $libc_add_on_config_subdirs	2009-09-15 14:14:42 -07:00
Jakub Jelinek	22bb992d51	Fix strstr/strcasestr/fma/fmaf on x86_64.	2009-09-02 19:43:04 -07:00
Jakub Jelinek	240441038f	Fix x86_64 bits/mathinline.h for -m32 compilation.	2009-09-01 15:30:12 -07:00
Andreas Schwab	c2735e958a	Fix parse error in bits/mathinline.h with --std=c99	2009-08-31 17:26:14 +02:00
H.J. Lu	5a4eb7282e	Remove ENABLE_SSSE3_ON_ATOM. It turns out that SSSE3 isn't slow on Atom. The problem is bsf. This patch removes ENABLE_SSSE3_ON_ATOM.	2009-08-28 14:54:46 -07:00
Ulrich Drepper	65b14bcee2	Optimize out duplicated scalbln code for x86-64.	2009-08-25 16:46:34 -07:00
Ulrich Drepper	7423a3456a	Optimized signbit{,f} for x86-64.	2009-08-25 14:54:12 -07:00
Ulrich Drepper	84088310ce	Handle AVX saving on x86-64 in interrupted symbol lookups. If a signal arrived during a symbol lookup and the signal handler also required a symbol lookup, the end of the lookup in the signal handler reset the flag indicating whether restoring AVX/SSE registers is needed. Resetting means in this case that the tail part of the outer lookup code will try to restore the registers and this can fail miserably. We now restore to the previous value which makes nesting calls possible.	2009-08-25 10:42:30 -07:00
Ulrich Drepper	cf00cc00bc	Add ceil implementation for 64-bit machines. On 64-bit machines we should not split doubles into two 32-bit integers and handle the words separately. We have wide registers. This patch implements a 64-bit ceil version. Ideally all other functions will be converted over time.	2009-08-24 18:05:48 -07:00
Ulrich Drepper	9a1ea1525e	Optimize float construction/extraction on x86-64.	2009-08-24 14:52:49 -07:00
Ulrich Drepper	ef72d5f1b9	Optimize x86-64 signbit{,f} a bit.	2009-08-24 10:20:58 -07:00
H.J. Lu	4e1e2f4247	Support mixed SSE/AVX audit and check AVX only once. This patch fixes mixed SSE/AVX audit and checks AVX only once in _dl_runtime_profile. When an AVX or SSE register value in pltenter is modified, we have to make sure that the SSE part value is the same in both lr_xmm and lr_vector fields so that pltexit will get the correct value from either lr_xmm or lr_vector fields. AVX-enabled pltenter should update both lr_xmm and lr_vector fields to support stacked AVX/SSE pltenter functions.	2009-08-08 10:54:42 -07:00
Ulrich Drepper	8e436522e1	Move SSE4.2 functions together.	2009-08-08 09:38:32 -07:00
Ulrich Drepper	0fda545d5f	Add SSSE3-optimized implementation of str{,n}cmp for x86-64.	2009-08-07 22:51:02 -07:00
Ulrich Drepper	57b378ac89	Avoid warning through fake initialization.	2009-08-07 16:19:54 -07:00
Ulrich Drepper	3aa2588d4a	Fix whitespaces in last checkin.	2009-08-07 09:47:12 -07:00
H.J. Lu	a546baa9cd	Properly count number of logical processors on Intel CPUs. The meaning of the 25-14 bits in EAX returned from cpuid with EAX = 4 has been changed from "the maximum number of threads sharing the cache" to "the maximum number of addressable IDs for logical processors sharing the cache" if cpuid takes EAX = 11. We need to use results from both EAX = 4 and EAX = 11 to get the number of threads sharing the cache. The 25-14 bits in EAX on Core i7 is 15 although the number of logical processors is 8. Here is a white paper on this: http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/ This patch correctly counts number of logical processors on Intel CPUs with EAX = 11 support on cpuid. Tested on Dunnington, Core i7 and Nehalem EX/EP. It also fixed Pentium Ds workaround since EBX may not have the right value returned from cpuid with EAX = 1.	2009-08-07 09:39:36 -07:00
H.J. Lu	02cea47161	Add x86 32-bit SSE4.2 string functions. This patch adds 32bit SSE4.2 string functions. It uses -16L instead of 0xfffffffffffffff0L, which works for both 32bit and 64bit long. Tested on 32bit Core i7 and Core 2.	2009-08-04 12:13:43 -07:00
H.J. Lu	6f6f1215f6	Support multiarch for i686. This patch adds multiarch support when configured for i686. I modified some x86-64 functions to support 32bit. I will contribute 32bit SSE string and memory functions later.	2009-07-31 11:53:35 -07:00
Ulrich Drepper	98b1e6c866	____longjmp_chk is now OS-specific. We use sigaltstack internally which on some systems is a syscall and should be used as such. Move the x86-64 version to the Linux specific directory and create in its place a file which always causes compile errors.	2009-07-30 21:42:27 -07:00
Ulrich Drepper	8e80581787	Change code a bit to correct CFI.	2009-07-30 21:29:27 -07:00
Ulrich Drepper	07df809969	Optimize ____longjmp_chk for x86-64 a bit.	2009-07-30 20:09:30 -07:00
Ulrich Drepper	5ead9ce5c7	Fix x86-64 ____longjmp_chk to handle signal stacks. The simple test previously used might trigger if the longjmp jumps from the signal stack to the normal stack. We now explicitly test for this case.	2009-07-30 17:31:48 -07:00
Ulrich Drepper	78c4ef475d	Add support for x86-64 fma instruction. Use it to implement fma and fmaf, if possible.	2009-07-29 15:26:06 -07:00
Ulrich Drepper	9a1d2d4555	Prepare use of IFUNC functions outside libc.so. We use a callback function into libc.so to get access to the data structure with the information and have special versions of the test macros which automatically use this function.	2009-07-29 15:22:28 -07:00
Ulrich Drepper	649bf13320	Improve CFI in x86-64 ld.so trampoline code.	2009-07-29 08:50:03 -07:00
H.J. Lu	09e0389eb1	Properly restore AVX registers on x86-64. tst-audit4 and tst-audit5 fail under AVX emulator due to je instead of jne. This patch fixes them.	2009-07-29 08:40:54 -07:00
Ulrich Drepper	b48a267b8f	Preserve SSE registers in runtime relocations on x86-64. SSE registers are used for passing parameters and must be preserved in runtime relocations. This is inside ld.so enforced through the tests in tst-xmmymm.sh. But the malloc routines used after startup come from libc.so and can be arbitrarily complex. It's overkill to save the SSE registers all the time because of that. These calls are rare. Instead we save them on demand. The new infrastructure put in place in this patch makes this possible and efficient.	2009-07-29 08:33:03 -07:00
Ulrich Drepper	e83c1a8a72	Refine testing for xmm/ymm register use in x86-64 ld.so. The test now takes the callgraph into account. Only code called during runtime relocation is affected by the limitation. We now determine the affected object files as closely as possible from the outside. This allowed us to remove some of the specializations for some of the string functions as they are only used in other code paths.	2009-07-27 13:40:27 -07:00
Ulrich Drepper	009a69f0bc	No need for special strcmp for rtld.	2009-07-27 06:55:04 -07:00
Ulrich Drepper	16d2ea4c82	Make sure no code in ld.so uses xmm/ymm registers on x86-64. This patch introduces a test to make sure no function modifies the xmm/ymm registers. With the exception of the auditing functions. The test is probably too pessimistic. All code linked into ld.so is checked. Perhaps at some point the callgraph starting from _dl_fixup and _dl_profile_fixup is checked and we can start using faster SSE-using functions in parts of ld.so.	2009-07-26 16:10:00 -07:00
H.J. Lu	7956a3d27c	Add SSE2 support to str{,n}cmp for x86-64.	2009-07-26 13:32:28 -07:00
H.J. Lu	4e5b5821bf	Some optimizations for x86-64 strcmp.	2009-07-25 19:15:14 -07:00
Ulrich Drepper	29e92fa5cd	Optimize x86-64 SSE4.2 strcmp. The file contained some code which was never used. Don't compile it in.	2009-07-25 12:02:47 -07:00
Ulrich Drepper	b2509a1e38	Avoid cpuid instructions in cache info discovery. When multiarch is enabled we have this information stored. Use it.	2009-07-23 14:03:53 -07:00
Ulrich Drepper	3e9099b4f6	Add more cache descriptors for L3 caches on x86 and x86-64. The most recent AP 485 describes a few more cache descriptors for L3 caches with 24-way associativity.	2009-07-23 13:42:46 -07:00
Ulrich Drepper	d28797e426	Perform test for Atom x86-64 in central place and handle it. There will be more than one function which, in multiarch mode, wants to use SSSE3. We should not test in each of them for Atoms with slow SSSE3. Instead, disable the SSSE3 bit in the startup code for such machines.	2009-07-23 13:15:17 -07:00
Ulrich Drepper	ae612b04cc	Minor cleanups in x86-64 strstr.	2009-07-21 07:52:12 -07:00
Ulrich Drepper	a8f895ebe1	Better check for optimization in new x86-64 strstr/strcasestr.	2009-07-20 21:18:28 -07:00
H.J. Lu	2b7a8664fa	SSE4.2 strstr/strcasestr for x86-64. This patch implements SSE4.2 strstr/strcasestr, using Knuth-Morris-Pratt string searching algorithm.	2009-07-20 21:06:50 -07:00
Ulrich Drepper	c8027cced1	Optimize restoring of ymm registers on x86-64. The patch mainly reduces the code size but also avoids some jumps.	2009-07-16 07:15:15 -07:00
Ulrich Drepper	24a12a5a5f	Fix up whitespaces in new memcmp for x86-64.	2009-07-16 07:02:27 -07:00
H.J. Lu	e26c9b8415	memcmp implementation for x86-64 using SSE2.	2009-07-16 07:00:34 -07:00
Ulrich Drepper	ca419225a3	Fix thinko in AVX audit patch. Don't use AVX instructions too often.	2009-07-15 17:59:14 -07:00
Ulrich Drepper	47fc9b710b	Fix typo in last change.	2009-07-15 17:51:11 -07:00
Ulrich Drepper	d7bd7a8ae8	Secure AVX changes for auditing code. The original AVX patch used a function pointer to handle the difference between machines with and without AVX support. This is insecure. A well-placed memory exploit could lead to redirection of the execution. Using a variable and several tests is a bit slower but cannot be exploited in this way.	2009-07-15 17:41:36 -07:00
H.J. Lu	b0ecde3a63	Add AVX support to ld.so auditing for x86-64.	2009-07-10 12:04:14 -07:00
Ulrich Drepper	cea4329592	Minor cleanups in recently added files.	2009-07-03 03:23:01 -07:00
Ulrich Drepper	d6485c981b	Align functions to 16-byte boundary. Some of the new multi-arch string functions for x86-64 were not aligned to 16-byte boundaries, possibly creating unnecessary cache line misses and delays.	2009-07-03 03:01:57 -07:00
H.J. Lu	06e51c8f3d	Add SSE4.2 support for strcspn, strpbrk, and strspn on x86-64.	2009-07-03 02:48:56 -07:00
H.J. Lu	167d5ed5de	Fix handling of xmm6 in ld.so audit hooks on x86-64.	2009-07-02 04:33:12 -07:00
Ulrich Drepper	af263b8154	Whitespace fixes in last patch.	2009-07-02 03:43:05 -07:00
H.J. Lu	ab6a873fe0	SSSE3 strcpy/stpcpy for x86-64. This patch adds SSSE3 strcpy/stpcpy. I got up to 4X speed up on Core 2 and Core i7. I disabled it on Atom since SSSE3 version is slower for shorter (<64byte) data.	2009-07-02 03:39:03 -07:00
Ulrich Drepper	e6bd12ddf7	Regenerated.	2009-06-30 05:33:52 -07:00
Ulrich Drepper	b38a2e2e64	Fix little checkin problem in last patch.	2009-06-30 04:41:38 -07:00
H.J. Lu	0181291385	Determine and store processor family and model on x86-64.	2009-06-30 04:39:09 -07:00
Ulrich Drepper	059215ae21	Clean up whitespaces in last patch.	2009-06-22 20:39:37 -07:00
H.J. Lu	772f4e6a1b	Add SSE4.2 support for strcmp and strncmp on x86-64.	2009-06-22 20:38:41 -07:00
Jakub Jelinek	fab8238de6	Fix x86-64 memchr for large lengths.	2009-06-16 10:23:31 -07:00
Ulrich Drepper	eb0b6cb6e1	Fix warnings when using <sys/select.h>. gcc 4.4 is more picky. And the x86-64 version of <bits/select.h> contained a now unnecessary asm optimization. Remove it.	2009-06-14 16:09:42 -07:00
Ulrich Drepper	b77c932329	Add SSE4.2 optimized rawmemchr implementation for x86-64.	2009-06-05 16:54:50 -07:00
Ulrich Drepper	6f9eea15bf	Forgot some more cleanups for the SSE4.2 strlen on x86-64.	2009-06-05 11:51:59 -07:00
Ulrich Drepper	f85a9e72e2	Add missing cleanups from SSE4.2 x86-64 strlen.	2009-06-05 11:39:45 -07:00
Ulrich Drepper	3ab2d57a4d	Optimize x86-64 strlen for SSE4.2. The SSE4.2 implementation is used in the DSO only. The patch also adds some infrastructure to be used in similar code later on.	2009-06-05 11:32:00 -07:00
Ulrich Drepper	2f3f7b9da2	More small optimizations for x86-64 strlen.	2009-06-04 16:45:35 -07:00
Ulrich Drepper	747785f2b3	Tiny strlen for x86-64 optimization. I didn't remove an instruction from a previous version in the final version.	2009-06-04 10:54:29 -07:00
Ulrich Drepper	fd96f06208	Small optimization of STT_GNU_IFUNC handling. The test to call the indirect function now includes a subtest to check whether the symbol is defined. When coming to that point this is almost always the case. The test for STT_GNU_IFUNC on the other hand rarely is true. Moving it to the front means we don't have to perform the second test unless really necessary.	2009-06-01 11:49:05 -07:00
Ulrich Drepper	b7629ee33f	Better error message for invalid relocation in static binary.	2009-06-01 11:39:24 -07:00

377 Commits
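
Note on commit 02cea47161: its message says the mask is written as -16L rather than 0xfffffffffffffff0L so that it works for both 32bit and 64bit long. The commit itself concerns assembly sources; the C fragment below is only a rough analogue of the same idea, not code from glibc. The helper name `align_down_16` is hypothetical and introduced purely for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, not part of glibc: round an address down to the
   previous 16-byte boundary.  Converting -16L to an unsigned type yields
   all ones except the low four bits at whatever width that type has, so
   the same source works on 32-bit and 64-bit targets.  A literal such as
   0xfffffffffffffff0 only matches the 64-bit case. */
static uintptr_t align_down_16(uintptr_t addr)
{
    return addr & (uintptr_t) -16L;
}
```

For example, `align_down_16(0x1237)` yields `0x1230`, and an address that is already 16-byte aligned is returned unchanged.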