glibc/sysdeps/x86_64
Joseph Myers ae8372d7e4 Add SSE4.1 trunc, truncf (bug 20142).
This patch adds SSE4.1 versions of trunc and truncf, using the roundsd
/ roundss instructions, similar to the versions of ceil, floor, rint
and nearbyint functions we already have.  In my testing with the glibc
benchtests these are about 30% faster than the C versions for double,
20% faster for float.

Tested for x86_64.

	[BZ #20142]
	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1.
	* sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file.
	* sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.
2017-09-20 16:54:05 +00:00
..
64 Move architecture shlib-versions files to Linux-specific directories. 2014-07-17 14:31:12 +00:00
fpu Add SSE4.1 trunc, truncf (bug 20142). 2017-09-20 16:54:05 +00:00
multiarch Remove remaining _HAVE_STRING_ARCH_* definitions (BZ #18858) 2017-09-06 14:35:23 -03:00
nptl Remove CALL_THREAD_FCT macro 2017-04-04 18:03:35 -03:00
x32 Remove CALL_THREAD_FCT macro 2017-04-04 18:03:35 -03:00
____longjmp_chk.S ____longjmp_chk is now OS-specific. 2009-07-30 21:42:27 -07:00
__longjmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
_mcount.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
abort-instr.h Update. 2001-09-19 10:37:31 +00:00
add_n.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
addmul_1.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
atomic-machine.h Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
backtrace.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
bzero.S Make an empty file. 2007-10-16 05:59:15 +00:00
configure Require binutils 2.25 or later to build glibc. 2017-06-28 11:31:50 +00:00
configure.ac Require binutils 2.25 or later to build glibc. 2017-06-28 11:31:50 +00:00
crti.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
crtn.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
dl-irel.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
dl-lookupcfg.h elf: Remove internal_function attribute 2017-08-31 16:59:37 +02:00
dl-machine.h PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY 2017-06-14 10:47:25 +09:30
dl-procinfo.c Add sysdeps/x86/dl-procinfo.c 2017-04-10 12:01:45 -07:00
dl-runtime.c * elf/dl-runtime.c (reloc_offset): Define. 2009-03-15 00:26:14 +00:00
dl-tls.c x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
dl-tls.h x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
dl-tlsdesc.h elf: Remove internal_function attribute 2017-08-31 16:59:37 +02:00
dl-tlsdesc.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
dl-trampoline.h x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258] 2017-03-21 11:00:12 -07:00
dl-trampoline.S x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258] 2017-03-21 11:00:12 -07:00
ffs.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
ffsll.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
hp-timing.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
htonl.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Implies Add float128 support for x86_64, x86. 2017-06-26 22:02:24 +00:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
l10nflist.c Minor optimization of popcount in l10nflist 2011-08-11 14:07:04 -04:00
ldbl2mpn.c [BZ #4586] 2007-06-08 02:50:59 +00:00
ldsodefs.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
link-defines.sym Replace __int128 with __int128_t 2014-05-30 10:50:21 -07:00
locale-defines.sym Implement optimized strcaecmp for x86-64. 2010-07-30 00:14:04 -07:00
localplt.data ld.so: Introduce struct dl_exception 2017-08-10 16:54:57 +02:00
lshift.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
machine-gmon.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Makefile x86: Add x86_64 to x86-64 HWCAP [BZ #22093] 2017-09-11 08:18:32 -07:00
memchr.S x86-64: Optimize memchr/rawmemchr/wmemchr with SSE2/AVX2 2017-06-09 05:13:31 -07:00
memcmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcopy.h X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memmove_chk.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memmove.S x86-64: Use IFUNC memcpy and mempcpy in libc.a 2017-08-04 12:27:18 -07:00
mempcpy_chk.S x86_64: fix static build of __mempcpy_chk for compilers defaulting to PIC/PIE 2017-03-15 16:10:05 -07:00
mempcpy.S X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove 2016-06-08 13:58:08 -07:00
memrchr.S x86_64: Remove redundant REX bytes from memrchr.S 2017-06-05 07:41:26 -07:00
memset_chk.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memset.S x86: Remove __memset_zero_constant_len_parameter [BZ #21790] 2017-08-04 10:56:51 -07:00
memusage.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
mp_clz_tab.c * sysdeps/x86_64/mp_clz_tab.c: New file. 2009-04-15 04:30:41 +00:00
mul_1.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
preconfigure rename configure.in to configure.ac 2013-10-30 17:32:08 +10:00
preconfigure.ac rename configure.in to configure.ac 2013-10-30 17:32:08 +10:00
rawmemchr.S x86_64: Remove L(return_null) from rawmemchr.S 2017-05-20 06:13:38 -07:00
rshift.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
rtld-offsets.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
sched_cpucount.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
setjmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
stack-aliasing.h Clean up stack-coloring macros. 2014-06-20 19:50:16 -07:00
stackguard-macros.h BZ #15754: CVE-2013-4788 2013-09-23 00:52:09 -04:00
stackinfo.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
start.S x86-64: Check PIC instead of SHARED in start.S 2017-08-02 10:27:34 -07:00
stpcpy.S Update. 2004-05-28 06:56:51 +00:00
strcasecmp_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasecmp_l.S Implement optimized strcaecmp for x86-64. 2010-07-30 00:14:04 -07:00
strcasecmp.S Implement optimized strcaecmp for x86-64. 2010-07-30 00:14:04 -07:00
strcat.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strchr.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strchrnul.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcpy.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcspn.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strlen.S x86-64: Update strlen.S to support wcslen/wcsnlen 2017-06-05 07:58:23 -07:00
strncase_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncase_l.S Add optimized strncasecmp versions for x86-64. 2010-08-14 22:04:01 -07:00
strncase.S Add optimized strncasecmp versions for x86-64. 2010-08-14 22:04:01 -07:00
strncmp.S Add SSE2 support to str{,n}cmp for x86-64. 2009-07-26 13:32:28 -07:00
strnlen.S Faster strlen on x64. 2013-03-18 07:39:12 +01:00
strpbrk.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strrchr.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strspn.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
sub_n.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
submul_1.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
sysdep.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tls_get_addr.S x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tls-macros.h Split tls-macros.h into sysdeps directories. 2012-07-17 11:30:58 +00:00
tlsdesc.c elf: Remove internal_function attribute 2017-08-31 16:59:37 +02:00
tlsdesc.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tst-audit3.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-audit4.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-audit5.c Modify several tests to use test-skeleton.c 2014-11-05 15:24:08 +05:30
tst-audit6.c Modify several tests to use test-skeleton.c 2015-07-15 15:10:23 +05:30
tst-audit7.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-audit10.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-audit.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-auditmod3a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod3b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod4a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod4b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod5a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod5b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod6b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6c.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod7a.c Move x86_64-specific audit tests to sysdeps/x86_64/. 2013-04-25 19:23:11 +00:00
tst-auditmod7b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-auditmod10b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-avx512-aux.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx512.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx512mod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx-aux.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avxmod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-mallocalign1.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-quad1.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-quad1pie.c Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quad2.c Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quad2pie.c Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quadmod1.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-quadmod1pie.S Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-quadmod2.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-quadmod2pie.S Handle R_X86_64_RELATIVE64 and R_X86_64_64 for x32 2012-05-10 17:05:06 -07:00
tst-split-dynreloc.c Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-split-dynreloc.lds Fix dynamic linker issue with bind-now 2015-08-19 05:37:01 -07:00
tst-sse.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-ssemod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tst-x86_64-1.c x86: Add x86_64 to x86-64 HWCAP [BZ #22093] 2017-09-11 08:18:32 -07:00
tst-x86_64mod-1.c x86: Add x86_64 to x86-64 HWCAP [BZ #22093] 2017-09-11 08:18:32 -07:00
Versions Work around old buggy program which cannot cope with memcpy semantics. 2011-04-01 19:38:21 -04:00
wcschr.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wcscmp.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wcslen.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wcsrchr.S Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wmemset_chk.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wmemset.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wordcopy.c X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00