glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-26 23:10:06 +00:00

Author	SHA1	Message	Date
H.J. Lu	c15f8eb50c	x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258 ] On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve the first 8 vector registers. The code layout is if only %xmm0 - %xmm7 registers are used preserve %xmm0 - %xmm7 registers if only %ymm0 - %ymm7 registers are used preserve %ymm0 - %ymm7 registers preserve %zmm0 - %zmm7 registers Branch predication always executes the fallthrough code path to preserve %zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7 registers are used. This leads to lower CPU frequency on Skylake server. This patch changes the fallthrough code path to preserve %xmm0 - %xmm7 registers instead: if whole %zmm0 - %zmm7 registers are used preserve %zmm0 - %zmm7 registers if only %ymm0 - %ymm7 registers are used preserve %ymm0 - %ymm7 registers preserve %xmm0 - %xmm7 registers Tested on Skylake server. [BZ #21258] * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt): Define only if _dl_runtime_resolve is defined to _dl_runtime_resolve_sse_vex. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt): Fallthrough to _dl_runtime_resolve_sse_vex.	2017-03-21 11:00:12 -07:00
Joseph Myers	bfff8b1bec	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
H.J. Lu	fb0f7a6755	X86-64: Add _dl_runtime_resolve_avx[512]_{opt\|slow} [BZ #20508 ] There is transition penalty when SSE instructions are mixed with 256-bit AVX or 512-bit AVX512 load instructions. Since _dl_runtime_resolve_avx and _dl_runtime_profile_avx512 save/restore 256-bit YMM/512-bit ZMM registers, there is transition penalty when SSE instructions are used with lazy binding on AVX and AVX512 processors. To avoid SSE transition penalty, if only the lower 128 bits of the first 8 vector registers are non-zero, we can preserve %xmm0 - %xmm7 registers with the zero upper bits. For AVX and AVX512 processors which support XGETBV with ECX == 1, we can use XGETBV with ECX == 1 to check if the upper 128 bits of YMM registers or the upper 256 bits of ZMM registers are zero. We can restore only the non-zero portion of vector registers with AVX/AVX512 load instructions which will zero-extend upper bits of vector registers. This patch adds _dl_runtime_resolve_sse_vex which saves and restores XMM registers with 128-bit AVX store/load instructions. It is used to preserve YMM/ZMM registers when only the lower 128 bits are non-zero. _dl_runtime_resolve_avx_opt and _dl_runtime_resolve_avx512_opt are added and used on AVX/AVX512 processors supporting XGETBV with ECX == 1 so that we store and load only the non-zero portion of vector registers. This avoids SSE transition penalty caused by _dl_runtime_resolve_avx and _dl_runtime_profile_avx512 when only the lower 128 bits of vector registers are used. _dl_runtime_resolve_avx_slow is added and used for AVX processors which don't support XGETBV with ECX == 1. Since there is no SSE transition penalty on AVX512 processors which don't support XGETBV with ECX == 1, _dl_runtime_resolve_avx512_slow isn't provided. [BZ #20495] [BZ #20508] * sysdeps/x86/cpu-features.c (init_cpu_features): For Intel processors, set Use_dl_runtime_resolve_slow and set Use_dl_runtime_resolve_opt if XGETBV suports ECX == 1. * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt): New. (bit_arch_Use_dl_runtime_resolve_slow): Likewise. (index_arch_Use_dl_runtime_resolve_opt): Likewise. (index_arch_Use_dl_runtime_resolve_slow): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Use _dl_runtime_resolve_avx512_opt and _dl_runtime_resolve_avx_opt if Use_dl_runtime_resolve_opt is set. Use _dl_runtime_resolve_slow if Use_dl_runtime_resolve_slow is set. * sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>. (_dl_runtime_resolve_opt): New. Defined for AVX and AVX512. (_dl_runtime_resolve): Add one for _dl_runtime_resolve_sse_vex. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx_slow): New. (_dl_runtime_resolve_opt): Likewise. (_dl_runtime_profile): Define only if _dl_runtime_profile is defined.	2016-09-06 08:51:07 -07:00
H.J. Lu	f43cb35c9b	Require binutils 2.24 to build x86-64 glibc [BZ #20139 ] If assembler doesn't support AVX512DQ, _dl_runtime_resolve_avx is used to save the first 8 vector registers, which only saves the lower 256 bits of vector register, for lazy binding. When it is called on AVX512 platform, the upper 256 bits of ZMM registers are clobbered. Parameters passed in ZMM registers will be wrong when the function is called the first time. This patch requires binutils 2.24, whose assembler can store and load ZMM registers, to build x86-64 glibc. Since mathvec library needs assembler support for AVX512DQ, we disable mathvec if assembler doesn't support AVX512DQ. [BZ #20139] * config.h.in (HAVE_AVX512_ASM_SUPPORT): Renamed to ... (HAVE_AVX512DQ_ASM_SUPPORT): This. * sysdeps/x86_64/configure.ac: Require assembler from binutils 2.24 or above. (HAVE_AVX512_ASM_SUPPORT): Removed. (HAVE_AVX512DQ_ASM_SUPPORT): New. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S: Make HAVE_AVX512_ASM_SUPPORT check unconditional. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/x86_64/multiarch/memcpy.S: Likewise. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memmove.S: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Check HAVE_AVX512DQ_ASM_SUPPORT instead of HAVE_AVX512_ASM_SUPPORT. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx51: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise.	2016-07-01 06:03:05 -07:00
H.J. Lu	8d9c92017d	[x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8 Due to GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066 __tls_get_addr may be called with 8-byte stack alignment. Although this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume that stack will be always aligned at 16 bytes. Since SSE optimized memory/string functions with aligned SSE register load and store are used in the dynamic linker, we must set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8 so that _dl_runtime_resolve_sse will align the stack before calling _dl_fixup: Dump of assembler code for function _dl_runtime_resolve_sse: 0x00007ffff7deea90 <+0>: push %rbx 0x00007ffff7deea91 <+1>: mov %rsp,%rbx 0x00007ffff7deea94 <+4>: and $0xfffffffffffffff0,%rsp ^^^^^^^^^^^ Align stack to 16 bytes 0x00007ffff7deea98 <+8>: sub $0x100,%rsp 0x00007ffff7deea9f <+15>: mov %rax,0xc0(%rsp) 0x00007ffff7deeaa7 <+23>: mov %rcx,0xc8(%rsp) 0x00007ffff7deeaaf <+31>: mov %rdx,0xd0(%rsp) 0x00007ffff7deeab7 <+39>: mov %rsi,0xd8(%rsp) 0x00007ffff7deeabf <+47>: mov %rdi,0xe0(%rsp) 0x00007ffff7deeac7 <+55>: mov %r8,0xe8(%rsp) 0x00007ffff7deeacf <+63>: mov %r9,0xf0(%rsp) 0x00007ffff7deead7 <+71>: movaps %xmm0,(%rsp) 0x00007ffff7deeadb <+75>: movaps %xmm1,0x10(%rsp) 0x00007ffff7deeae0 <+80>: movaps %xmm2,0x20(%rsp) 0x00007ffff7deeae5 <+85>: movaps %xmm3,0x30(%rsp) 0x00007ffff7deeaea <+90>: movaps %xmm4,0x40(%rsp) 0x00007ffff7deeaef <+95>: movaps %xmm5,0x50(%rsp) 0x00007ffff7deeaf4 <+100>: movaps %xmm6,0x60(%rsp) 0x00007ffff7deeaf9 <+105>: movaps %xmm7,0x70(%rsp) [BZ #19679] * sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE): Renamed to ... (DL_RUNTIME_UNALIGNED_VEC_SIZE): This. Set to 8. (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ... (DL_RUNTIME_RESOLVE_REALIGN_STACK): This. Updated. (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ... (DL_RUNTIME_RESOLVE_REALIGN_STACK): This. * sysdeps/x86_64/dl-trampoline.h (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ... (DL_RUNTIME_RESOLVE_REALIGN_STACK): This.	2016-02-19 15:45:09 -08:00
Joseph Myers	f7a9f785e5	Update copyright dates with scripts/update-copyrights.	2016-01-04 16:05:18 +00:00
H.J. Lu	0a5768fe9c	Support x86-64 assmebler without AVX512 When x86-64 assmebler doesn't support AVX512, we should make _dl_runtime_resolve_avx512/_dl_runtime_profile_avx512 as aliases of _dl_runtime_resolve_avx/_dl_runtime_profile_avx. Tested on x86-64 using GCC 5.2 with binutils 20151008 and GCC 4.8 with binutils 20130219. There are no differences in ld.so with binutils 20151008. There are no unexpected failures with binutils 20130219 and 20151008. [BZ #19124] * sysdeps/x86_64/dl-trampoline.S [!HAVE_AVX512_ASM_SUPPORT] (_dl_runtime_resolve_avx512): Make it a hidden alias of _dl_runtime_resolve_avx. (_dl_runtime_profile_avx512): Make it a hidden alias of _dl_runtime_profile_avx.	2015-10-13 10:36:27 -07:00
H.J. Lu	f3dcae82d5	Save and restore vector registers in x86-64 ld.so This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve and _dl_runtime_profile, which save and restore the first 8 vector registers used for parameter passing. elf_machine_runtime_setup selects the proper _dl_runtime_resolve or _dl_runtime_profile based on _dl_x86_cpu_features. It avoids race condition caused by FOREIGN_CALL macros, which are only used for x86-64. Performance impact of saving and restoring 8 vector registers are negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when ld.so is optimized with SSE2. [BZ #15128] * sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add ifuncmain8. (modules-names): Add ifuncmod8. ($(objpfx)ifuncmain8): New rule. * sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and <cpuid.h>. (elf_machine_runtime_setup): Use _dl_runtime_resolve_sse, _dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512, _dl_runtime_profile_sse, _dl_runtime_profile_avx, or _dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE. * sysdeps/x86_64/dl-trampoline.S: Rewrite. * sysdeps/x86_64/dl-trampoline.h: Likewise. * sysdeps/x86_64/ifuncmain8.c: New file. * sysdeps/x86_64/ifuncmod8.c: Likewise. * sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE): Removed. * sysdeps/x86_64/nptl/tls.h (__128bits): Removed. (tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1. Change rtld_savespace_sse to __glibc_unused2. (RTLD_CHECK_FOREIGN_CALL): Removed. (RTLD_ENABLE_FOREIGN_CALL): Likewise. (RTLD_PREPARE_FOREIGN_CALL): Likewise. (RTLD_FINALIZE_FOREIGN_CALL): Likewise.	2015-08-25 04:34:13 -07:00
H.J. Lu	2eb9ef29b6	Improve bndmov encoding with zero displacement If x86-64 assembler doesn't support MPX, we encode bndmov instruction by hand. When displacement is zero, assembler generates shorter encoding. This patch improves bndmov encoding with zero displacement so that ld.so is identical when using assemblers with and without MPX support. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve bndmov encoding with zero displacement.	2015-07-09 09:30:20 -07:00
Igor Zamyatin	14c5cbabc2	Preserve bound registers for pointer pass/return We need to save/restore bound registers and add a BND prefix before branches in _dl_runtime_profile so that bound registers for pointer pass and return are preserved when LD_AUDIT is used. [BZ #18134] * sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT. * sysdeps/i386/configure: Regenerated. * sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New. (_dl_runtime_profile): Save and restore Intel MPX return bound registers when calling _dl_call_pltexit. Add PRESERVE_BND_REGS_PREFIX before return. * sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New. (LRV_BND1_OFFSET): Likewise. * sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and lrv_bnd1. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix typo in bndmov encoding. * sysdeps/x86_64/dl-trampoline.h: Properly save and restore Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before branch instructions to preserve bounds.	2015-07-09 06:50:12 -07:00
H.J. Lu	b97eb2bdb1	Preserve bound registers in _dl_runtime_resolve We need to add a BND prefix before indirect branch at the end of _dl_runtime_resolve to preserve bound registers. [BZ #18134] * sysdeps/x86_64/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New. (_dl_runtime_resolve): Add a BND prefix before indirect branch.	2015-03-16 14:59:14 -07:00
Joseph Myers	b168057aaa	Update copyright dates with scripts/update-copyrights.	2015-01-02 16:29:47 +00:00
Igor Zamyatin	ea8ba7cd14	Save/restore bound registers for _dl_runtime_profile This patch saves and restores bound registers in x86-64 PLT for ld.so profile and LD_AUDIT: * sysdeps/x86_64/bits/link.h (La_x86_64_regs): Add lr_bnd. (La_x86_64_retval): Add lrv_bnd0 and lrv_bnd1. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Save Intel MPX bound registers before _dl_profile_fixup. * sysdeps/x86_64/dl-trampoline.h: Restore Intel MPX bound registers after _dl_profile_fixup. Save and restore bound registers bnd0/bnd1 when calling _dl_call_pltexit. * sysdeps/x86_64/link-defines.sym (BND_SIZE): New. (LR_BND_OFFSET): Likewise. (LRV_BND0_OFFSET): Likewise. (LRV_BND1_OFFSET): Likewise.	2014-04-16 14:46:49 -07:00
Igor Zamyatin	a4c75cfd56	Save/restore bound registers in _dl_runtime_resolve This patch saves and restores bound registers in symbol lookup for x86-64: 1. Branches without BND prefix clear bound registers. 2. x86-64 pass bounds in bound registers as specified in MPX psABI extension on hjl/mpx/master branch at https://github.com/hjl-tools/x86-64-psABI https://groups.google.com/forum/#!topic/x86-64-abi/KFsB0XTgWYc Binutils has been updated to create an alternate PLT to add BND prefix when branching to ld.so. * config.h.in (HAVE_MPX_SUPPORT): New #undef. * sysdeps/x86_64/configure.ac: Set HAVE_MPX_SUPPORT. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S (REGISTER_SAVE_AREA): New macro. (REGISTER_SAVE_RAX): Likewise. (REGISTER_SAVE_RCX): Likewise. (REGISTER_SAVE_RDX): Likewise. (REGISTER_SAVE_RSI): Likewise. (REGISTER_SAVE_RDI): Likewise. (REGISTER_SAVE_R8): Likewise. (REGISTER_SAVE_R9): Likewise. (REGISTER_SAVE_BND0): Likewise. (REGISTER_SAVE_BND1): Likewise. (REGISTER_SAVE_BND2): Likewise. (_dl_runtime_resolve): Use them. Save and restore Intel MPX bound registers when calling _dl_fixup.	2014-04-09 15:38:09 -07:00
Igor Zamyatin	2d63a517e4	Save and restore AVX-512 zmm registers to x86-64 ld.so AVX-512 ISA adds 512-bit zmm registers. This patch updates _dl_runtime_profile to pass zmm registers to run-time audit. It also changes _dl_x86_64_save_sse and _dl_x86_64_restore_sse to upport zmm registers, which are called when only when RTLD_PREPARE_FOREIGN_CALL is used. Its performance impact is minimum. * config.h.in (HAVE_AVX512_SUPPORT): New #undef. (HAVE_AVX512_ASM_SUPPORT): Likewise. * sysdeps/x86_64/bits/link.h (La_x86_64_zmm): New. (La_x86_64_vector): Add zmm. * sysdeps/x86_64/Makefile (tests): Add tst-audit10. (modules-names): Add tst-auditmod10a and tst-auditmod10b. ($(objpfx)tst-audit10): New target. ($(objpfx)tst-audit10.out): Likewise. (tst-audit10-ENV): New. (AVX512-CFLAGS): Likewise. (CFLAGS-tst-audit10.c): Likewise. (CFLAGS-tst-auditmod10a.c): Likewise. (CFLAGS-tst-auditmod10b.c): Likewise. * sysdeps/x86_64/configure.ac: Set config-cflags-avx512, HAVE_AVX512_SUPPORT and HAVE_AVX512_ASM_SUPPORT. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Add AVX-512 zmm register support. (_dl_x86_64_save_sse): Likewise. (_dl_x86_64_restore_sse): Likewise. * sysdeps/x86_64/dl-trampoline.h: Updated to support different size vector registers. * sysdeps/x86_64/link-defines.sym (YMM_SIZE): New. (ZMM_SIZE): Likewise. * sysdeps/x86_64/tst-audit10.c: New file. * sysdeps/x86_64/tst-auditmod10a.c: Likewise. * sysdeps/x86_64/tst-auditmod10b.c: Likewise.	2014-03-13 11:19:08 -07:00
Allan McRae	d4697bc93d	Update copyright notices with scripts/update-copyrights	2014-01-01 22:00:23 +10:00
Ondřej Bílka	382466e04e	Fix typos.	2013-08-30 18:08:59 +02:00
Joseph Myers	568035b787	Update copyright notices with scripts/update-copyrights.	2013-01-02 19:05:09 +00:00
H.J. Lu	1cf463cd4e	Check if RTLD_SAVESPACE_SSE is aligned to 32 bytes	2012-05-11 11:50:11 -07:00
Paul Eggert	59ba27a63a	Replace FSF snail mail address with URLs.	2012-02-09 23:18:22 +00:00
H.J. Lu	08a300c956	Simplify AVX check	2011-09-07 21:38:23 -04:00
Ulrich Drepper	0276a718c0	Fix minor CFI problem in regular x86-64 trampoline	2011-08-20 08:58:44 -04:00
Ulrich Drepper	c88f17668b	Fix CFI info in x86-64 trampolines for non-AVX code	2011-08-20 08:56:30 -04:00
Ulrich Drepper	bba33c289b	One more typo in AVX test	2011-07-23 15:18:13 -04:00
Ulrich Drepper	1aae088a8a	One more change to XSAVE patch	2011-07-22 23:33:22 -04:00
Andreas Schwab	1d002f2539	Fix AVX check	2011-07-22 14:33:47 -04:00
Ulrich Drepper	5644ef5461	Fix check for AVX enablement The AVX bit is set if the CPU supports AVX. But this doesn't mean the kernel does. Add checks according to Intel's documentation.	2011-07-20 21:21:03 -04:00
Ulrich Drepper	84088310ce	Handle AVX saving on x86-64 in interrupted smbol lookups. If a signal arrived during a symbol lookup and the signal handler also required a symbol lookup, the end of the lookup in the signal handler reset the flag whether restoring AVX/SSE registers is needed. Resetting means in this case that the tail part of the outer lookup code will try to restore the registers and this can fail miserably. We now restore to the previous value which makes nesting calls possible.	2009-08-25 10:42:30 -07:00
H.J. Lu	4e1e2f4247	Support mixed SSE/AVX audit and check AVX only once. This patch fixes mixed SSE/AVX audit and checks AVX only once in _dl_runtime_profile. When an AVX or SSE register value in pltenter is modified, we have to make sure that the SSE part value is the same in both lr_xmm and lr_vector fields so that pltexit will get the correct value from either lr_xmm or lr_vector fields. AVX-enabled pltenter should update both lr_xmm and lr_vector fields to support stacked AVX/SSE pltenter functions.	2009-08-08 10:54:42 -07:00
Ulrich Drepper	649bf13320	Improve CFI in x86-64 ld.so trampoline code.	2009-07-29 08:50:03 -07:00
H.J. Lu	09e0389eb1	Properly restore AVX registers on x86-64. tst-audit4 and tst-audit5 fail under AVX emulator due to je instead of jne. This patch fixes them.	2009-07-29 08:40:54 -07:00
Ulrich Drepper	b48a267b8f	Preserve SSE registers in runtime relocations on x86-64. SSE registers are used for passing parameters and must be preserved in runtime relocations. This is inside ld.so enforced through the tests in tst-xmmymm.sh. But the malloc routines used after startup come from libc.so and can be arbitrarily complex. It's overkill to save the SSE registers all the time because of that. These calls are rare. Instead we save them on demand. The new infrastructure put in place in this patch makes this possible and efficient.	2009-07-29 08:33:03 -07:00
Ulrich Drepper	c8027cced1	Optimize restoring of ymm registers on x86-64. The patch mainly reduces the code size but also avoids some jumps.	2009-07-16 07:15:15 -07:00
Ulrich Drepper	ca419225a3	Fix thinko in AVX audit patch. Don't use AVX instructions too often.	2009-07-15 17:59:14 -07:00
Ulrich Drepper	47fc9b710b	Fix typo in last change.	2009-07-15 17:51:11 -07:00
Ulrich Drepper	d7bd7a8ae8	Secure AVX changes for auditing code. The original AVX patch used a function pointer to handle the difference between machines with and without AVX support. This is insecure. A well-placed memory exploit could lead to redirection of the execution. Using a variable and several tests is a bit slower but cannot be exploited in this way.	2009-07-15 17:41:36 -07:00
H.J. Lu	b0ecde3a63	Add AVX support to ld.so auditing for x86-64.	2009-07-10 12:04:14 -07:00
H.J. Lu	167d5ed5de	Fix handling of xmm6 in ld.so audit hooks on x86-64.	2009-07-02 04:33:12 -07:00
Ulrich Drepper	906dd40db3	[BZ #9881 ] * inet/inet6_rth.c (inet6_rth_add): Add some error checking. Patch mostly by Yang Hongyang <yanghy@cn.fujitsu.com>. * inet/Makefile (tests): Add tst-inet6_rth. * inet/tst-inet6_rth.c: New file. alignment of La_x86_64_regs. Store xmm parameters.	2009-03-15 19:16:16 +00:00
Ulrich Drepper	a42ad61bae	* elf/dl-runtime.c (reloc_offset): Define. (reloc_index): Define. (_dl_fixup): Rename reloc_offset parameter to reloc_arg. (_dl_fixup_profile): Likewise. Use reloc_index instead of computing index from reloc_offset. (_dl_call_pltexit): Likewise. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Just pass the relocation index to _dl_fixup. (_dl_runtime_profile): Likewise for _dl_fixup_profile and _dl_call_pltexit. * sysdeps/x86_64/dl-runtime.c: New file.	2009-03-15 00:26:14 +00:00
Ulrich Drepper	1f7c90a722	[BZ #9893 ] * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix alignement of La_x86_64_regs. Store xmm parameters. Patch mostly by Jiri Olsa <olsajiri@gmail.com>.	2009-03-14 23:57:33 +00:00
Ulrich Drepper	41ff2a4999	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Make sure stack is properly aligned for the target function. Correct unwind info.	2007-10-31 19:25:15 +00:00
Ulrich Drepper	44b2e5815b	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Correctly align stack for call if pltexit is to be used.	2007-08-24 03:57:56 +00:00
Ulrich Drepper	9f0d7b6df9	* elf/dl-reloc.c [PROF] (_dl_relocate_object): Define consider_profiling always to zero. Don't count of compiler to remove unreached if block. * sysdeps/x86_64/dl-trampoline.S [PROF] (_dl_runtime_profile): Don't compile. * sysdeps/i386/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise. * sysdeps/ia64/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise. * sysdeps/s390/s390-64/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise. * sysdeps/s390/s390-32/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise. * sysdeps/powerpc/powerpc64/dl-trampoline.S [PROF] (_dl_profile_resolve): Likewise. * sysdeps/powerpc/powerpc32/dl-trampoline.S [PROF] (_dl_profile_resolve): Likewise. * gmon/Makefile: Add rules to build and run tst-profile-static. * gmon/tst-profile-static.c: New file. * Makeconfig (+link-static): Allow passing program-specific flags.	2005-07-07 02:39:45 +00:00
Ulrich Drepper	9dcafc5597	* csu/elf-init.c (__libc_csu_fini): Don't do anything here. * sysdeps/generic/libc-start.c: Don't register program destructor here. * dlfcn/Makefile: Add rules to build dlfcn.c. (LDFLAGS-dl.so): Removed. * dlfcn/dlclose.c: _dl_close is now in ld.so, use function pointer table. * dlfcn/dlmopen.c: Likewise for _dl_open. * dlfcn/dlopen.c: Likewise. * dlfcn/dlopenold.c: Likewise. * elf/dl-libc.c: Likewise for _dl_open and _dl_close. * elf/Makefile (routines): Remove dl-open and dl-close. (dl-routines): Add dl-open, dl-close, and dl-trampoline. Add rules to build and run tst-audit1. * elf/tst-audit1.c: New file. * elf/tst-auditmod1.c: New file. * elf/Versions [libc]: Remove _dl_open and _dl_close. * elf/dl-close.c: Change for use inside ld.so instead of libc.so. * elf/dl-open.c: Likewise. * elf/dl-debug.c (_dl_debug_initialize): Allow reinitialization, signaled by nonzero parameter. * elf/dl-init.c: Fix use of r_state. * elf/dl-load.c: Likewise. * elf/dl-close.c: Add auditing checkpoints. * elf/dl-open.c: Likewise. * elf/dl-fini.c: Likewise. * elf/dl-load.c: Likewise. * elf/dl-sym.c: Likewise. * sysdeps/generic/libc-start.c: Likewise. * elf/dl-object.c: Allocate memory for auditing information. * elf/dl-reloc.c: Remove RESOLV. We now always need the map. Correctly initialize slotinfo. * elf/dynamic-link.h: Adjust after removal of RESOLV. * sysdeps/hppa/dl-lookupcfg.h: Likewise. * sysdeps/ia64/dl-lookupcfg.h: Likewise. * sysdeps/powerpc/powerpc64/dl-lookupcfg.h: Removed. * elf/dl-runtime.c (_dl_fixup): Little cleanup. (_dl_profile_fixup): New parameters to point to register struct and variable for frame size. Add auditing checkpoints. (_dl_call_pltexit): New function. Don't define trampoline code here. * elf/rtld.c: Recognize LD_AUDIT. Load modules on startup. Remove all the functions from _rtld_global_ro which only _dl_open and _dl_close needed. Add auditing checkpoints. * elf/link.h: Define symbols for auditing interfaces. * include/link.h: Likewise. * include/dlfcn.h: Define __RTLD_AUDIT. Remove prototypes for _dl_open and _dl_close. Adjust access to argc and argv in libdl. * dlfcn/dlfcn.c: New file. * sysdeps/generic/dl-lookupcfg.h: Remove all content now that RESOLVE is gone. * sysdeps/generic/ldsodefs.h: Add definitions for auditing interfaces. * sysdeps/generic/unsecvars.h: Add LD_AUDIT. * sysdeps/i386/dl-machine.h: Remove trampoline code here. Adjust for removal of RESOLVE. * sysdeps/x86_64/dl-machine.h: Likewise. * sysdeps/generic/dl-trampoline.c: New file. * sysdeps/i386/dl-trampoline.c: New file. * sysdeps/x86_64/dl-trampoline.c: New file. * sysdeps/generic/dl-tls.c: Cleanups. Fixup for dtv_t change. Fix updating of DTV. * sysdeps/generic/libc-tls.c: Likewise. * sysdeps/arm/bits/link.h: Renamed to ... * sysdeps/arm/buts/linkmap.h: ...this. * sysdeps/generic/bits/link.h: Renamed to... * sysdeps/generic/bits/linkmap.h: ...this. * sysdeps/hppa/bits/link.h: Renamed to... * sysdeps/hppa/bits/linkmap.h: ...this. * sysdeps/hppa/i386/link.h: Renamed to... * sysdeps/hppa/i386/linkmap.h: ...this. * sysdeps/hppa/ia64/link.h: Renamed to... * sysdeps/hppa/ia64/linkmap.h: ...this. * sysdeps/hppa/s390/link.h: Renamed to... * sysdeps/hppa/s390/linkmap.h: ...this. * sysdeps/hppa/sh/link.h: Renamed to... * sysdeps/hppa/sh/linkmap.h: ...this. * sysdeps/hppa/x86_64/link.h: Renamed to... * sysdeps/hppa/x86_64/linkmap.h: ...this. 2005-01-06 Ulrich Drepper <drepper@redhat.com> * allocatestack.c (init_one_static_tls): Adjust initialization of DTV entry for static tls deallocation fix. * sysdeps/alpha/tls.h (dtv_t): Change pointer type to be struct which also contains information whether the memory pointed to is static TLS or not. * sysdeps/i386/tls.h: Likewise. * sysdeps/ia64/tls.h: Likewise. * sysdeps/powerpc/tls.h: Likewise. * sysdeps/s390/tls.h: Likewise. * sysdeps/sh/tls.h: Likewise. * sysdeps/sparc/tls.h: Likewise. * sysdeps/x86_64/tls.h: Likewise.	2005-01-06 22:40:27 +00:00
Ulrich Drepper	a334319f65	(CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4.	2004-12-22 20:10:10 +00:00
Jakub Jelinek	0ecb606cb6	2.5-18.1	2007-07-12 18:26:36 +00:00

47 Commits