glibc/sysdeps/i386/dl-trampoline.S
H.J. Lu f753fa7dea x86: Support IBT and SHSTK in Intel CET [BZ #21598]
Intel Control-flow Enforcement Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

include Indirect Branch Tracking (IBT) and Shadow Stack (SHSTK).

GNU_PROPERTY_X86_FEATURE_1_IBT is added to the GNU program property to
indicate that all executable sections are compatible with IBT, that is,
an ENDBR instruction starts each valid target where an indirect branch
instruction can land.  The linker sets GNU_PROPERTY_X86_FEATURE_1_IBT on
output only if it is set on all relocatable inputs.
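
The "set on output only if set on all inputs" rule amounts to a bitwise
AND over the inputs' feature words.  A minimal sketch follows; the helper
name merge_feature_1 is hypothetical and this is not the linker's code.
Only the two bit values are taken from <elf.h>.  The same rule is stated
for the SHSTK bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values of the x86 feature word, as defined in <elf.h>.  */
#define GNU_PROPERTY_X86_FEATURE_1_IBT   (1U << 0)
#define GNU_PROPERTY_X86_FEATURE_1_SHSTK (1U << 1)

/* Hypothetical helper, not linker code: AND together the feature
   words of all relocatable inputs.  An input without the property
   contributes 0 and therefore clears every bit.  */
static uint32_t
merge_feature_1 (const uint32_t *inputs, size_t n)
{
  assert (inputs != NULL || n == 0);
  if (n == 0)
    return 0;
  uint32_t out = inputs[0];
  for (size_t i = 1; i < n; i++)
    out &= inputs[i];
  return out;
}
```

With inputs { IBT|SHSTK, IBT, IBT|SHSTK } only the IBT bit survives; a
single input with a feature word of 0 clears both bits.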

On an IBT capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable IBT and
lock IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the executable.
2. When loading an executable with an interpreter, enable IBT if
GNU_PROPERTY_X86_FEATURE_1_IBT is set on the interpreter.
  a. If GNU_PROPERTY_X86_FEATURE_1_IBT isn't set on the executable,
     disable IBT.
  b. Lock IBT.
3. If IBT is enabled, when loading a shared object without
GNU_PROPERTY_X86_FEATURE_1_IBT:
  a. If legacy interwork is allowed, then mark all pages in executable
     PT_LOAD segments in legacy code page bitmap.  Failure of legacy code
     page bitmap allocation causes an error.
  b. If legacy interwork isn't allowed, it causes an error.
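
The IBT steps above can be read as a small decision procedure.  The
sketch below is only a model of that reading: the types and the
functions ibt_at_startup and ibt_check_dso are hypothetical names, not
glibc interfaces, and the treatment of an unmarked static executable is
an assumption because the text does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>

struct ibt_state
{
  bool enabled;
  bool locked;
};

enum dso_result
{
  DSO_OK,               /* DSO is marked for IBT, or IBT is off.  */
  DSO_OK_LEGACY_BITMAP, /* Legacy DSO; its pages go in the bitmap.  */
  DSO_ERROR
};

/* Steps 1 and 2: IBT state once the executable has been loaded.  */
static struct ibt_state
ibt_at_startup (bool has_interp, bool exec_ibt, bool interp_ibt)
{
  struct ibt_state s = { false, false };
  if (!has_interp)
    {
      /* Step 1.  Assumption: nothing is enabled or locked when the
         bit is clear on the executable.  */
      s.enabled = exec_ibt;
      s.locked = exec_ibt;
      return s;
    }
  s.enabled = interp_ibt && exec_ibt;	/* Steps 2 and 2a.  */
  s.locked = true;			/* Step 2b.  */
  return s;
}

/* Step 3: decide what happens when a shared object is loaded.  */
static enum dso_result
ibt_check_dso (struct ibt_state s, bool dso_ibt, bool legacy_interwork,
	       bool bitmap_alloc_ok)
{
  /* In this model an enabled IBT has always been locked at startup.  */
  assert (!s.enabled || s.locked);
  if (!s.enabled || dso_ibt)
    return DSO_OK;
  if (!legacy_interwork)
    return DSO_ERROR;			/* Step 3b.  */
  /* Step 3a: bitmap allocation failure is an error.  */
  return bitmap_alloc_ok ? DSO_OK_LEGACY_BITMAP : DSO_ERROR;
}
```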

GNU_PROPERTY_X86_FEATURE_1_SHSTK is added to the GNU program property to
indicate that all executable sections are compatible with SHSTK, where
the return address popped from the shadow stack always matches the return
address popped from the normal stack.  The linker sets
GNU_PROPERTY_X86_FEATURE_1_SHSTK on output only if it is set on all
relocatable inputs.

On a SHSTK capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the executable.
2. When loading an executable with an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the interpreter.
  a. If GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set on the executable
     or any shared objects loaded via the DT_NEEDED tag, disable SHSTK.
  b. Otherwise lock SHSTK.
3. After SHSTK is enabled, it is an error to load a shared object
without GNU_PROPERTY_X86_FEATURE_1_SHSTK.
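
The SHSTK steps differ from the IBT ones in that every DT_NEEDED object
takes part in the startup decision and there is no legacy fallback.  A
rough model of the steps as written, using hypothetical names
(shstk_at_startup, shstk_dlopen_allowed) and ignoring tunables:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct shstk_state
{
  bool enabled;
  bool locked;
};

/* Steps 1 and 2.  needed_shstk[i] says whether the i-th DT_NEEDED
   shared object carries the SHSTK bit.  */
static struct shstk_state
shstk_at_startup (bool has_interp, bool interp_shstk, bool exec_shstk,
		  const bool *needed_shstk, size_t n_needed)
{
  struct shstk_state s = { false, false };
  assert (needed_shstk != NULL || n_needed == 0);
  if (!has_interp)
    {
      /* Step 1 mentions enabling only, so nothing is locked here.  */
      s.enabled = exec_shstk;
      return s;
    }
  s.enabled = interp_shstk;		/* Step 2.  */
  bool all_marked = exec_shstk;
  for (size_t i = 0; i < n_needed; i++)
    all_marked = all_marked && needed_shstk[i];
  if (!all_marked)
    s.enabled = false;			/* Step 2a.  */
  else
    s.locked = true;			/* Step 2b.  */
  return s;
}

/* Step 3: once SHSTK is on, an unmarked shared object is an error.  */
static bool
shstk_dlopen_allowed (struct shstk_state s, bool dso_shstk)
{
  return !s.enabled || dso_shstk;
}
```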

To enable CET support, glibc must be configured with --enable-cet.  When
CET is enabled, both the compiler and the assembler must support CET;
otherwise, configure fails with an error.

To support CET run-time control,

1. _dl_x86_feature_1 is added to the writable ld.so namespace to indicate
if IBT or SHSTK are enabled at run-time.  It should be initialized by
init_cpu_features.
2. For dynamic executables:
   a. An l_cet field is added to struct link_map to indicate if IBT or
      SHSTK is enabled in an ELF module.  _dl_process_pt_note or
      _rtld_process_pt_note is called to process the PT_NOTE segment for
      the GNU program property and set l_cet.
   b. _dl_open_check is added to check IBT and SHSTK compatibility when
      dlopening a shared object.
3. Replace i386 _dl_runtime_resolve and _dl_runtime_profile with
_dl_runtime_resolve_shstk and _dl_runtime_profile_shstk, respectively, if
SHSTK is enabled.
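
The separate _shstk trampolines are needed because the original i386
trampolines reach the resolved function by storing its address on the
stack and executing "ret", while the _shstk variants use "jmp *%ecx".
The toy model below is a sketch of why the first form cannot work with
a shadow stack.  The struct and function names are made up for
illustration, and real hardware raises a control-protection fault
rather than returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEPTH 8

/* Toy machine: a normal stack plus a shadow stack of return addresses.  */
struct cpu
{
  uintptr_t stack[DEPTH];
  size_t sp;
  uintptr_t shadow[DEPTH];
  size_t ssp;
};

/* CALL pushes the return address on both stacks.  */
static void
do_call (struct cpu *c, uintptr_t ret_addr)
{
  assert (c->sp < DEPTH && c->ssp < DEPTH);
  c->stack[c->sp++] = ret_addr;
  c->shadow[c->ssp++] = ret_addr;
}

/* RET pops both stacks; under SHSTK a mismatch is a fault, which
   this model reports as false.  */
static bool
do_ret (struct cpu *c, uintptr_t *target)
{
  assert (c->sp > 0 && c->ssp > 0);
  *target = c->stack[--c->sp];
  return *target == c->shadow[--c->ssp];
}

/* Like "movl %eax, (%esp); ret $12": the function address is written
   to the normal stack only and then consumed by RET.  */
static bool
leave_with_ret (struct cpu *c, uintptr_t func)
{
  uintptr_t target;
  assert (c->sp < DEPTH);
  c->stack[c->sp++] = func;
  return do_ret (c, &target);
}

/* Like "jmp *%ecx": neither stack is touched.  */
static uintptr_t
leave_with_jmp (const struct cpu *c, uintptr_t func)
{
  (void) c;
  return func;
}
```

After do_call (&c, 0x1000), leave_with_ret (&c, 0x2000) reports a
mismatch because the shadow stack still holds 0x1000, whereas
leave_with_jmp leaves both stacks intact for the callee's own return.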

CET run-time control can be changed via GLIBC_TUNABLES with

$ export GLIBC_TUNABLES=glibc.tune.x86_shstk=[permissive|on|off]
$ export GLIBC_TUNABLES=glibc.tune.x86_ibt=[permissive|on|off]

1. permissive: SHSTK is disabled when dlopening a legacy ELF module.
2. on: IBT or SHSTK is always enabled, regardless of whether the IBT or
SHSTK bits are set in the GNU program property.
3. off: IBT or SHSTK is always disabled, regardless of whether the IBT or
SHSTK bits are set in the GNU program property.
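
One way to picture the three tunable values is as an override applied
on top of the ELF property bit.  The sketch below uses hypothetical
names (enum cet_control, parse_cet_tunable, cet_feature_enabled) and is
not glibc's tunables code.  It assumes that an unrecognized value falls
back to following the ELF property, and that "permissive" follows the
ELF property at startup and differs only at dlopen time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum cet_control
{
  CET_ELF_PROPERTY,	/* Default: follow the GNU program property.  */
  CET_PERMISSIVE,
  CET_ALWAYS_ON,
  CET_ALWAYS_OFF
};

/* Map the text after "glibc.tune.x86_ibt=" or "glibc.tune.x86_shstk="
   to a control mode.  */
static enum cet_control
parse_cet_tunable (const char *value)
{
  assert (value != NULL);
  if (strcmp (value, "permissive") == 0)
    return CET_PERMISSIVE;
  if (strcmp (value, "on") == 0)
    return CET_ALWAYS_ON;
  if (strcmp (value, "off") == 0)
    return CET_ALWAYS_OFF;
  return CET_ELF_PROPERTY;	/* Assumption: ignore unknown values.  */
}

/* Whether the feature ends up enabled at startup.  */
static bool
cet_feature_enabled (enum cet_control ctl, bool elf_bit_set)
{
  switch (ctl)
    {
    case CET_ALWAYS_ON:
      return true;
    case CET_ALWAYS_OFF:
      return false;
    default:
      return elf_bit_set;
    }
}
```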

<cet.h> from CET-enabled GCC is automatically included by assembly code
to add GNU_PROPERTY_X86_FEATURE_1_IBT and GNU_PROPERTY_X86_FEATURE_1_SHSTK
to GNU program property.  _CET_ENDBR is added at the entrance of all
assembly functions whose address may be taken.  _CET_NOTRACK is used to
insert the NOTRACK prefix on indirect jumps through jump tables to support
IBT.  It is defined as notrack when _CET_ENDBR is defined in <cet.h>.

	 [BZ #21598]
	* configure.ac: Add --enable-cet.
	* configure: Regenerated.
	* elf/Makefile (all-built-dso): Add a comment.
	* elf/dl-load.c (filebuf): Moved before "dynamic-link.h".
	Include <dl-prop.h>.
	(_dl_map_object_from_fd): Call _dl_process_pt_note on PT_NOTE
	segment.
	* elf/dl-open.c: Include <dl-prop.h>.
	(dl_open_worker): Call _dl_open_check.
	* elf/rtld.c: Include <dl-prop.h>.
	(dl_main): Call _rtld_process_pt_note on PT_NOTE segment.  Call
	_rtld_main_check.
	* sysdeps/generic/dl-prop.h: New file.
	* sysdeps/i386/dl-cet.c: Likewise.
	* sysdeps/unix/sysv/linux/x86/cpu-features.c: Likewise.
	* sysdeps/unix/sysv/linux/x86/dl-cet.h: Likewise.
	* sysdeps/x86/cet-tunables.h: Likewise.
	* sysdeps/x86/check-cet.awk: Likewise.
	* sysdeps/x86/configure: Likewise.
	* sysdeps/x86/configure.ac: Likewise.
	* sysdeps/x86/dl-cet.c: Likewise.
	* sysdeps/x86/dl-procruntime.c: Likewise.
	* sysdeps/x86/dl-prop.h: Likewise.
	* sysdeps/x86/libc-start.h: Likewise.
	* sysdeps/x86/link_map.h: Likewise.
	* sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve): Add
	_CET_ENDBR.
	(_dl_runtime_profile): Likewise.
	(_dl_runtime_resolve_shstk): New.
	(_dl_runtime_profile_shstk): Likewise.
	* sysdeps/linux/x86/Makefile (sysdep-dl-routines): Add dl-cet
	if CET is enabled.
	(CFLAGS-.o): Add -fcf-protection if CET is enabled.
	(CFLAGS-.os): Likewise.
	(CFLAGS-.op): Likewise.
	(CFLAGS-.oS): Likewise.
	(asm-CPPFLAGS): Add -fcf-protection -include cet.h if CET
	is enabled.
	(tests-special): Add $(objpfx)check-cet.out.
	(cet-built-dso): New.
	(+$(cet-built-dso:=.note)): Likewise.
	(common-generated): Add $(cet-built-dso:$(common-objpfx)%=%.note).
	($(objpfx)check-cet.out): New.
	(generated): Add check-cet.out.
	* sysdeps/x86/cpu-features.c: Include <dl-cet.h> and
	<cet-tunables.h>.
	(TUNABLE_CALLBACK (set_x86_ibt)): New prototype.
	(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
	(init_cpu_features): Call get_cet_status to check CET status
	and update dl_x86_feature_1 with CET status.  Call
	TUNABLE_CALLBACK (set_x86_ibt) and TUNABLE_CALLBACK
	(set_x86_shstk).  Disable and lock CET in libc.a.
	* sysdeps/x86/cpu-tunables.c: Include <cet-tunables.h>.
	(TUNABLE_CALLBACK (set_x86_ibt)): New function.
	(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
	* sysdeps/x86/sysdep.h (_CET_NOTRACK): New.
	(_CET_ENDBR): Define if not defined.
	(ENTRY): Add _CET_ENDBR.
	* sysdeps/x86/dl-tunables.list (glibc.tune): Add x86_ibt and
	x86_shstk.
	* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve): Add
	_CET_ENDBR.
	(_dl_runtime_profile): Likewise.
2018-07-16 14:08:27 -07:00


/* PLT trampolines.  i386 version.
   Copyright (C) 2004-2018 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <http://www.gnu.org/licenses/>.  */

#include <sysdep.h>
#include <link-defines.h>
#ifdef HAVE_MPX_SUPPORT
# define PRESERVE_BND_REGS_PREFIX bnd
#else
# define PRESERVE_BND_REGS_PREFIX .byte 0xf2
#endif

	.text
	.globl _dl_runtime_resolve
	.type _dl_runtime_resolve, @function
	cfi_startproc
	.align 16
_dl_runtime_resolve:
	cfi_adjust_cfa_offset (8)
	_CET_ENDBR
	pushl %eax		# Preserve registers otherwise clobbered.
	cfi_adjust_cfa_offset (4)
	pushl %ecx
	cfi_adjust_cfa_offset (4)
	pushl %edx
	cfi_adjust_cfa_offset (4)
	movl 16(%esp), %edx	# Copy args pushed by PLT in register.  Note
	movl 12(%esp), %eax	# that `fixup' takes its parameters in regs.
	call _dl_fixup		# Call resolver.
	popl %edx		# Get register content back.
	cfi_adjust_cfa_offset (-4)
	movl (%esp), %ecx
	movl %eax, (%esp)	# Store the function address.
	movl 4(%esp), %eax
	ret $12			# Jump to function address.
	cfi_endproc
	.size _dl_runtime_resolve, .-_dl_runtime_resolve

# The SHSTK compatible version.
	.text
	.globl _dl_runtime_resolve_shstk
	.type _dl_runtime_resolve_shstk, @function
	cfi_startproc
	.align 16
_dl_runtime_resolve_shstk:
	cfi_adjust_cfa_offset (8)
	_CET_ENDBR
	pushl %eax		# Preserve registers otherwise clobbered.
	cfi_adjust_cfa_offset (4)
	pushl %edx
	cfi_adjust_cfa_offset (4)
	movl 12(%esp), %edx	# Copy args pushed by PLT in register.  Note
	movl 8(%esp), %eax	# that `fixup' takes its parameters in regs.
	call _dl_fixup		# Call resolver.
	movl (%esp), %edx	# Get register content back.
	movl %eax, %ecx		# Store the function address.
	movl 4(%esp), %eax	# Get register content back.
	addl $16, %esp		# Adjust stack: PLT1 + PLT2 + %eax + %edx
	cfi_adjust_cfa_offset (-16)
	jmp *%ecx		# Jump to function address.
	cfi_endproc
	.size _dl_runtime_resolve_shstk, .-_dl_runtime_resolve_shstk

#ifndef PROF
# The SHSTK compatible version.
	.globl _dl_runtime_profile_shstk
	.type _dl_runtime_profile_shstk, @function
	cfi_startproc
	.align 16
_dl_runtime_profile_shstk:
	cfi_adjust_cfa_offset (8)
	_CET_ENDBR
	pushl %esp
	cfi_adjust_cfa_offset (4)
	addl $8, (%esp)		# Account for the pushed PLT data
	pushl %ebp
	cfi_adjust_cfa_offset (4)
	pushl %eax		# Preserve registers otherwise clobbered.
	cfi_adjust_cfa_offset (4)
	pushl %ecx
	cfi_adjust_cfa_offset (4)
	pushl %edx
	cfi_adjust_cfa_offset (4)
	movl %esp, %ecx
	subl $8, %esp
	cfi_adjust_cfa_offset (8)
	movl $-1, 4(%esp)
	leal 4(%esp), %edx
	movl %edx, (%esp)
	pushl %ecx		# Address of the register structure
	cfi_adjust_cfa_offset (4)
	movl 40(%esp), %ecx	# Load return address
	movl 36(%esp), %edx	# Copy args pushed by PLT in register.  Note
	movl 32(%esp), %eax	# that `fixup' takes its parameters in regs.
	call _dl_profile_fixup	# Call resolver.
	cfi_adjust_cfa_offset (-8)
	movl (%esp), %edx
	testl %edx, %edx
	jns 1f
	movl 4(%esp), %edx	# Get register content back.
	movl %eax, %ecx		# Store the function address.
	movl 12(%esp), %eax	# Get register content back.
	# Adjust stack: PLT1 + PLT2 + %esp + %ebp + %eax + %ecx + %edx
	# + free.
	addl $32, %esp
	cfi_adjust_cfa_offset (-32)
	jmp *%ecx		# Jump to function address.
	cfi_endproc
	.size _dl_runtime_profile_shstk, .-_dl_runtime_profile_shstk

	.globl _dl_runtime_profile
	.type _dl_runtime_profile, @function
	cfi_startproc
	.align 16
_dl_runtime_profile:
	cfi_adjust_cfa_offset (8)
	_CET_ENDBR
	pushl %esp
	cfi_adjust_cfa_offset (4)
	addl $8, (%esp)		# Account for the pushed PLT data
	pushl %ebp
	cfi_adjust_cfa_offset (4)
	pushl %eax		# Preserve registers otherwise clobbered.
	cfi_adjust_cfa_offset (4)
	pushl %ecx
	cfi_adjust_cfa_offset (4)
	pushl %edx
	cfi_adjust_cfa_offset (4)
	movl %esp, %ecx
	subl $8, %esp
	cfi_adjust_cfa_offset (8)
	movl $-1, 4(%esp)
	leal 4(%esp), %edx
	movl %edx, (%esp)
	pushl %ecx		# Address of the register structure
	cfi_adjust_cfa_offset (4)
	movl 40(%esp), %ecx	# Load return address
	movl 36(%esp), %edx	# Copy args pushed by PLT in register.  Note
	movl 32(%esp), %eax	# that `fixup' takes its parameters in regs.
	call _dl_profile_fixup	# Call resolver.
	cfi_adjust_cfa_offset (-8)
	movl (%esp), %edx
	testl %edx, %edx
	jns 1f
	popl %edx
	cfi_adjust_cfa_offset (-4)
	popl %edx		# Get register content back.
	cfi_adjust_cfa_offset (-4)
	movl (%esp), %ecx
	movl %eax, (%esp)	# Store the function address.
	movl 4(%esp), %eax
	ret $20			# Jump to function address.

	/*
	    +32     return address
	    +28     PLT1
	    +24     PLT2
	    +20     %esp
	    +16     %ebp
	    +12     %eax
	    +8      %ecx
	    +4      %edx
	    %esp    free
	*/
	cfi_adjust_cfa_offset (8)
1:	movl %ebx, (%esp)
	cfi_rel_offset (ebx, 0)
	movl %edx, %ebx		# This is the frame buffer size
	pushl %edi
	cfi_adjust_cfa_offset (4)
	cfi_rel_offset (edi, 0)
	pushl %esi
	cfi_adjust_cfa_offset (4)
	cfi_rel_offset (esi, 0)
	leal 44(%esp), %esi
	movl %ebx, %ecx
	orl $4, %ebx		# Increase frame size if necessary to align
				# stack for the function call
	andl $~3, %ebx
	movl %esp, %edi
	subl %ebx, %edi
	movl %esp, %ebx
	cfi_def_cfa_register (ebx)
	movl %edi, %esp
	shrl $2, %ecx
	rep
	movsl
	movl (%ebx), %esi
	cfi_restore (esi)
	movl 4(%ebx), %edi
	cfi_restore (edi)
	/*
	   %ebx+40  return address
	   %ebx+36  PLT1
	   %ebx+32  PLT2
	   %ebx+28  %esp
	   %ebx+24  %ebp
	   %ebx+20  %eax
	   %ebx+16  %ecx
	   %ebx+12  %edx
	   %ebx+8   %ebx
	   %ebx+4   free
	   %ebx     free
	   %esp     copied stack frame
	*/
	movl %eax, (%ebx)
	movl 12(%ebx), %edx
	movl 16(%ebx), %ecx
	movl 20(%ebx), %eax
	call *(%ebx)
	movl %ebx, %esp
	cfi_def_cfa_register (esp)
	movl 8(%esp), %ebx
	cfi_restore (ebx)
	/*
	    +40     return address
	    +36     PLT1
	    +32     PLT2
	    +28     %esp
	    +24     %ebp
	    +20     %eax
	    +16     %ecx
	    +12     %edx
	    +8      free
	    +4      free
	    %esp    free
	*/
#if LONG_DOUBLE_SIZE != 12
# error "long double size must be 12 bytes"
#endif
	# Allocate space for La_i86_retval and subtract 12 free bytes.
	subl $(LRV_SIZE - 12), %esp
	cfi_adjust_cfa_offset (LRV_SIZE - 12)
	movl %eax, LRV_EAX_OFFSET(%esp)
	movl %edx, LRV_EDX_OFFSET(%esp)
	fstpt LRV_ST0_OFFSET(%esp)
	fstpt LRV_ST1_OFFSET(%esp)
#ifdef HAVE_MPX_SUPPORT
	bndmov %bnd0, LRV_BND0_OFFSET(%esp)
	bndmov %bnd1, LRV_BND1_OFFSET(%esp)
#else
	.byte 0x66,0x0f,0x1b,0x44,0x24,LRV_BND0_OFFSET
	.byte 0x66,0x0f,0x1b,0x4c,0x24,LRV_BND1_OFFSET
#endif
	pushl %esp
	cfi_adjust_cfa_offset (4)
	# Address of La_i86_regs area.
	leal (LRV_SIZE + 4)(%esp), %ecx
	# PLT2
	movl (LRV_SIZE + 4 + LR_SIZE)(%esp), %eax
	# PLT1
	movl (LRV_SIZE + 4 + LR_SIZE + 4)(%esp), %edx
	call _dl_call_pltexit
	movl LRV_EAX_OFFSET(%esp), %eax
	movl LRV_EDX_OFFSET(%esp), %edx
	fldt LRV_ST1_OFFSET(%esp)
	fldt LRV_ST0_OFFSET(%esp)
#ifdef HAVE_MPX_SUPPORT
	bndmov LRV_BND0_OFFSET(%esp), %bnd0
	bndmov LRV_BND1_OFFSET(%esp), %bnd1
#else
	.byte 0x66,0x0f,0x1a,0x44,0x24,LRV_BND0_OFFSET
	.byte 0x66,0x0f,0x1a,0x4c,0x24,LRV_BND1_OFFSET
#endif
	# Restore stack before return.
	addl $(LRV_SIZE + 4 + LR_SIZE + 4), %esp
	cfi_adjust_cfa_offset (-(LRV_SIZE + 4 + LR_SIZE + 4))
	PRESERVE_BND_REGS_PREFIX
	ret
	cfi_endproc
	.size _dl_runtime_profile, .-_dl_runtime_profile
#endif