2006-05-15 Carlos O'Donell <carlos@systemhalted.org>
* sysdeps/hppa/dl-machine.h: Include tls.h
(elf_machine_fixup_plt): Returns fdesc.
(elf_machine_profile_fixup_plt): Remove.
(elf_machine_plt_value): Returns fdesc.
(elf_machine_runtime_setup): Check that dl_profile != NULL.
(ARCH_LA_PLTENT, ARCH_LA_PLTEXIT): Define.
(RTLD_START): Use iitlbp with sr0.
(elf_machine_type_class): Include TLS relocs.
(reassemble_21, reassemble_14): Define.
(elf_machine_rela): Add DIR21L, DIR14R, PLABEL21L, PLABEL14R,
TLS_DTPMOD32, TLS_TPREL32, TLS_DTPOFF32 support.
(TRAMPOLINE_TEMPLATE): Move to ...
* sysdeps/hppa/dl-trampoline.S: ... here.
* sysdeps/hppa/abort-instr.h: Use iitlbp with sr0.
* sysdeps/hppa/dl-lookupcfg.h: Include dl-fptr.h.
(DL_FIXUP_VALUE_TYPE, DL_FIXUP_MAKE_VALUE, DL_FIXUP_VALUE_CODE_ADDR,
DL_FIXUP_VALUE_ADD, DL_FIXUP_ADDR_VALUE): Define.
* sysdeps/hppa/sysdep.h: Use "!" as a separator. Cleanup comments.
* sysdeps/hppa/bits/link.h (La_hppa_regs, La_hppa_retval): Define.
Define prototypes for la_hppa_gnu_pltenter and la_hppa_gnu_pltexit.
2006-05-14 23:54:47 +00:00
/* PLT trampolines. hppa version.
2020-01-01 00:14:33 +00:00
Copyright (C) 2005-2020 Free Software Foundation, Inc.
This file is part of the GNU C Library.

The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
2012-03-09 23:56:38 +00:00
License along with the GNU C Library. If not, see
Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 05:40:42 +00:00
<https://www.gnu.org/licenses/>. */
#include <sysdep.h>
/* This code gets called via the .plt stub, and is used in
2013-06-05 20:26:40 +00:00
dl-runtime.c to call the `_dl_fixup' function and then redirect
to the address it returns. `_dl_fixup' takes two arguments; however,
`_dl_profile_fixup' takes a number of parameters for use with
2006-09-07 16:34:43 +00:00
library auditing (LA).
WARNING: This template is also used by gcc's __cffc, and expects
that the "bl" for _dl_runtime_resolve exists at a particular offset.
Do not change this template without changing gcc. While the prefix
"bl" should fix everything so gcc finds the right spot, it will
slow down __cffc when it attempts to call fixup to resolve function
descriptor references. Please refer to gcc/gcc/config/pa/fptr.c
Fix data race in setting function descriptors during lazy binding on hppa.
This addresses an issue that is present mainly on SMP machines running
threaded code. In a typical indirect call or PLT import stub, the
target address is loaded first. Then the global pointer is loaded into
the PIC register in the delay slot of a branch to the target address.
During lazy binding, the target address is a trampoline which transfers
to _dl_runtime_resolve().
_dl_runtime_resolve() uses the relocation offset stored in the global
pointer and the linkage map stored in the trampoline to find the
relocation. Then, the function descriptor is updated.
In a multi-threaded application, it is possible for the global pointer
to be updated between the load of the target address and the global
pointer. When this happens, the relocation offset has been replaced
by the new global pointer. The function pointer has probably been
updated as well but there is no way to find the address of the function
descriptor and to transfer to the target. So, _dl_runtime_resolve()
typically crashes.
HP-UX addressed this problem by adding an extra pc-relative branch to
the trampoline. The descriptor is initially set up to point to the
branch. The branch then transfers to the trampoline. This allowed
the trampoline code to figure out which descriptor was being used
without any modification to user code. I didn't use this approach
as it is more complex and changes function pointer canonicalization.
The order of loading the target address and global pointer in
indirect calls was not consistent with the order used in import stubs.
In particular, $$dyncall and some inline versions of it loaded the
global pointer first. This was inconsistent with the global pointer
being updated first in dl-machine.h. Assuming the accesses are
ordered, we want elf_machine_fixup_plt() to store the global pointer
first and calls to load it last. Then, the global pointer will be
correct when the target function is entered.
However, just to make things more fun, HP added support for
out-of-order execution of accesses in PA 2.0. The accesses used by
calls are weakly ordered. So, it's possible under some circumstances
that a function might be entered with the wrong global pointer.
However, HP uses weakly ordered accesses in 64-bit HP-UX, so I assume
that loading the global pointer in the delay slot of the branch must
work consistently.
The basic fix for the race is a combination of modifying user code to
preserve the address of the function descriptor in register %r22 and
setting the least-significant bit in the relocation offset. The
latter was suggested by Carlos as a way to distinguish relocation
offsets from global pointer values. Conventionally, %r22 is used
as the address of the function descriptor in calls to $$dyncall.
So, it wasn't hard to preserve the address in %r22.
I have updated gcc trunk and gcc-9 branch to not clobber %r22 in
$$dyncall and inline indirect calls. I have also modified the import
stubs in binutils trunk and the 2.33 branch to preserve %r22. This
required making the stubs one instruction longer but we save one
relocation. I also modified binutils to align the .plt section on
an 8-byte boundary. This allows descriptors to be updated atomically
with a floating-point store.
With these changes, _dl_runtime_resolve() can fall back to an alternate
mechanism to find the relocation offset when it has been clobbered.
There's just one additional instruction in the fast path. I tested
the fallback function, _dl_fix_reloc_arg(), by changing the branch to
always use the fallback. Old code still runs as it did before.
Fixes bug 23296.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-03-30 20:36:49 +00:00
Enter with r19 = reloc offset, r20 = got-8, r21 = fixup ltp, r22 = fp. */
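The commit message above notes that aligning the .plt section on an 8-byte boundary lets a function descriptor be updated atomically with a single floating-point store. A minimal C sketch of that idea follows. It substitutes a C11 64-bit atomic store for the hppa fstd instruction, and the struct fdesc layout (code address followed by global pointer) and every helper name are assumptions made for illustration, not glibc's definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumed 32-bit descriptor layout: code address, then global pointer.  */
struct fdesc
{
  uint32_t ip;
  uint32_t gp;
};

/* Both words must fit in one 8-byte unit for a single store to cover them.  */
static_assert (sizeof (struct fdesc) == sizeof (uint64_t),
               "descriptor must be exactly 8 bytes");

/* Hypothetical helper: view the two words as one 64-bit value.  */
static uint64_t
fdesc_pack (struct fdesc fd)
{
  uint64_t raw;
  memcpy (&raw, &fd, sizeof raw);
  return raw;
}

/* Hypothetical helper: split a 64-bit value back into the two words.  */
static struct fdesc
fdesc_unpack (uint64_t raw)
{
  struct fdesc fd;
  memcpy (&fd, &raw, sizeof fd);
  return fd;
}

/* Publish both words with one 64-bit store, so a reader using fdesc_read
   never sees a new code address paired with a stale global pointer.  */
static void
fdesc_publish (_Atomic uint64_t *slot, struct fdesc fd)
{
  atomic_store_explicit (slot, fdesc_pack (fd), memory_order_release);
}

/* Read both words back with one 64-bit load.  */
static struct fdesc
fdesc_read (_Atomic uint64_t *slot)
{
  return fdesc_unpack (atomic_load_explicit (slot, memory_order_acquire));
}
```

Note that the sketch only models the store side. As the commit message describes, real callers load the target address and the global pointer with two separate loads, which is why the trampoline still needs a fallback when %r19 has been clobbered.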
/* RELOCATION MARKER: bl to provide gcc's __cffc with fixup loc. */
.text
/* THIS CODE DOES NOT EXECUTE */
bl _dl_fixup, %r2
.text
.global _dl_runtime_resolve
.type _dl_runtime_resolve,@function
cfi_startproc
.align 4
_dl_runtime_resolve:
.PROC
.CALLINFO FRAME=128,CALLS,SAVE_RP,ENTRY_GR=3
.ENTRY
/* SAVE_RP says we do */
stw %rp, -20(%sp)
/* Save static link register */
stw %r29,-16(%sp)
2014-04-29 07:08:48 +00:00
/* Save argument registers */
stw %r26,-36(%sp)
stw %r25,-40(%sp)
stw %r24,-44(%sp)
stw %r23,-48(%sp)
/* Build a call frame, and save structure pointer. */
copy %sp, %r1 /* Copy previous sp */
/* Save function result address (on entry) */
stwm %r28,128(%sp)
/* Fill in some frame info to follow ABI */
stw %r1,-4(%sp) /* Previous sp */
stw %r21,-32(%sp) /* PIC register value */
/* Save input floating point registers. This must be done
in the new frame since the previous frame doesn't have
enough space */
ldo -64(%sp),%r1
fstd,ma %fr4,-8(%r1)
fstd,ma %fr5,-8(%r1)
fstd,ma %fr6,-8(%r1)
|
Fix data race in setting function descriptors during lazy binding on hppa.
This addresses an issue that is present mainly on SMP machines running
threaded code. In a typical indirect call or PLT import stub, the
target address is loaded first. Then the global pointer is loaded into
the PIC register in the delay slot of a branch to the target address.
During lazy binding, the target address is a trampoline which transfers
to _dl_runtime_resolve().
_dl_runtime_resolve() uses the relocation offset stored in the global
pointer and the linkage map stored in the trampoline to find the
relocation. Then, the function descriptor is updated.

In a multi-threaded application, it is possible for the global pointer
to be updated between the load of the target address and the load of
the global pointer. When this happens, the relocation offset has been
replaced by the new global pointer. The function pointer has probably
been updated as well but there is no way to find the address of the
function descriptor and to transfer to the target. So,
_dl_runtime_resolve() typically crashes.

HP-UX addressed this problem by adding an extra pc-relative branch to
the trampoline. The descriptor is initially set up to point to the
branch. The branch then transfers to the trampoline. This allowed
the trampoline code to figure out which descriptor was being used
without any modification to user code. I didn't use this approach
as it is more complex and changes function pointer canonicalization.

The order of loading the target address and global pointer in
indirect calls was not consistent with the order used in import stubs.
In particular, $$dyncall and some inline versions of it loaded the
global pointer first. This was inconsistent with the global pointer
being updated first in dl-machine.h. Assuming the accesses are
ordered, we want elf_machine_fixup_plt() to store the global pointer
first and calls to load it last. Then, the global pointer will be
correct when the target function is entered.

However, just to make things more fun, HP added support for
out-of-order execution of accesses in PA 2.0. The accesses used by
calls are weakly ordered. So, it's possible under some circumstances
that a function might be entered with the wrong global pointer.
However, HP uses weakly ordered accesses in 64-bit HP-UX, so I assume
that loading the global pointer in the delay slot of the branch must
work consistently.

The basic fix for the race is a combination of modifying user code to
preserve the address of the function descriptor in register %r22 and
setting the least-significant bit in the relocation offset. The
latter was suggested by Carlos as a way to distinguish relocation
offsets from global pointer values. Conventionally, %r22 is used
as the address of the function descriptor in calls to $$dyncall.
So, it wasn't hard to preserve the address in %r22.

I have updated gcc trunk and gcc-9 branch to not clobber %r22 in
$$dyncall and inline indirect calls. I have also modified the import
stubs in binutils trunk and the 2.33 branch to preserve %r22. This
required making the stubs one instruction longer but we save one
relocation. I also modified binutils to align the .plt section on
an 8-byte boundary. This allows descriptors to be updated atomically
with a floating-point store.

With these changes, _dl_runtime_resolve() can fall back to an alternate
mechanism to find the relocation offset when it has been clobbered.
There's just one additional instruction in the fast path. I tested
the fallback function, _dl_fix_reloc_arg(), by changing the branch to
always use the fallback. Old code still runs as it did before.

Fixes bug 23296.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-03-30 20:36:49 +00:00
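
The marker-bit idea from the commit message above can be sketched in C. This is only an illustration, not the glibc implementation: the helper names are hypothetical, the value of PA_GP_RELOC is assumed from the phrase "least-significant bit", and the sketch assumes that real global pointer values always have that bit clear.

```c
#include <assert.h>
#include <stdint.h>

/* Marker bit. The name follows the "Test PA_GP_RELOC bit" comment in the
   trampoline; the value 1 is an assumption based on the commit message. */
#define PA_GP_RELOC UINT32_C(1)

/* Hypothetical helper: tag a relocation offset before it is stored in the
   global pointer slot of a lazily bound descriptor.  Assumes the untagged
   offset has its low bit clear. */
static uint32_t tag_reloc_offset(uint32_t offset)
{
    assert((offset & PA_GP_RELOC) == 0);
    return offset | PA_GP_RELOC;
}

/* Hypothetical helper: nonzero when the value found in the PIC register
   still carries the marker, i.e. it is a relocation offset and not a
   global pointer written by another thread. */
static int is_reloc_offset(uint32_t pic_reg)
{
    return (pic_reg & PA_GP_RELOC) != 0;
}

/* Hypothetical helper mirroring "depi 0,31,2" in the fallback path: clear
   the two low bits of the function pointer preserved in %r22 to recover
   the address of the function descriptor. */
static uint32_t fdesc_address(uint32_t fptr)
{
    return fptr & ~UINT32_C(3);
}
```

When is_reloc_offset() is false, the resolver cannot trust the PIC register and has to recover the relocation offset from the descriptor address instead, which is the role the commit message gives to _dl_fix_reloc_arg().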

	/* Test PA_GP_RELOC bit. */
	bb,>= %r19,31,2f /* branch if not reloc offset */
	fstd,ma %fr7,-8(%r1)

	/* Set up args to fixup func, needs only two arguments */
	ldw 8+4(%r20),%r26 /* (1) got[1] == struct link_map */
	copy %r19,%r25 /* (2) reloc offset */

	/* Call the real address resolver. */
3:	bl _dl_fixup,%rp
	copy %r21,%r19 /* set fixup func ltp */

	/* While the linker will set a function pointer to NULL when it
	   encounters an undefined weak function, we need to dynamically
	   detect removed weak functions. The issue arises because a weak
	   __gmon_start__ function was added to shared executables to work
	   around issues in _init that are now resolved. The presence of
	   __gmon_start__ in every shared library breaks the linker
	   `--as-needed' option. This __gmon_start__ function does nothing
	   but removal is tricky. Depending on the binding, removal can
	   cause an application using it to fault. The call to _dl_fixup
	   returns NULL when a function isn't resolved. In order to help
	   with __gmon_start__ removal, we return directly to the caller
	   when _dl_fixup returns NULL. This check could be removed when
	   BZ 19170 is fixed. */
	comib,= 0,%r28,1f

	/* Load up the returned func descriptor */
	copy %r28, %r22
	copy %r29, %r19

	/* Reload arguments fp args */
	ldo -64(%sp),%r1
	fldd,ma -8(%r1),%fr4
	fldd,ma -8(%r1),%fr5
	fldd,ma -8(%r1),%fr6
	fldd,ma -8(%r1),%fr7

	/* Adjust sp, and restore function result address */
	ldwm -128(%sp),%r28

	/* Reload static link register */
	ldw -16(%sp),%r29
	/* Reload general args */
	ldw -36(%sp),%r26
	ldw -40(%sp),%r25
	ldw -44(%sp),%r24
	ldw -48(%sp),%r23

	/* Jump to new function, but return to previous function */
	bv %r0(%r22)
	ldw -20(%sp),%rp

1:
	/* Return to previous function */
	ldw -148(%sp),%rp
	bv %r0(%rp)
	ldo -128(%sp),%sp

2:
	/* Set up args for _dl_fix_reloc_arg. */
	copy %r22,%r26 /* (1) function pointer */
	depi 0,31,2,%r26 /* clear least significant bits */
	ldw 8+4(%r20),%r25 /* (2) got[1] == struct link_map */

	/* Save ltp and link map arg for _dl_fixup. */
	stw %r21,-56(%sp) /* ltp */
	stw %r25,-60(%sp) /* struct link map */

	/* Find reloc offset. */
	bl _dl_fix_reloc_arg,%rp
	copy %r21,%r19 /* set func ltp */

	/* Set up args for _dl_fixup. */
	ldw -56(%sp),%r21 /* ltp */
	ldw -60(%sp),%r26 /* (1) struct link map */
	b 3b
	copy %ret0,%r25 /* (2) reloc offset */
	.EXIT
	.PROCEND
	cfi_endproc
	.size _dl_runtime_resolve, . - _dl_runtime_resolve

	.text
	.global _dl_runtime_profile
	.type _dl_runtime_profile,@function
	cfi_startproc
	.align 4
_dl_runtime_profile:
	.PROC
	.CALLINFO FRAME=192,CALLS,SAVE_RP,ENTRY_GR=3
	.ENTRY

	/* SAVE_RP says we do */
	stw %rp, -20(%sp)
	/* Save static link register */
	stw %r29,-16(%sp)

	/* Build a call frame, and save structure pointer. */
	copy %sp, %r1 /* Copy previous sp */
	/* Save function result address (on entry) */
	stwm %r28,192(%sp)
Fix data race in setting function descriptors during lazy binding on hppa.
This addresses an issue that is present mainly on SMP machines running
threaded code. In a typical indirect call or PLT import stub, the
target address is loaded first. Then the global pointer is loaded into
the PIC register in the delay slot of a branch to the target address.
During lazy binding, the target address is a trampoline which transfers
to _dl_runtime_resolve().
_dl_runtime_resolve() uses the relocation offset stored in the global
pointer and the linkage map stored in the trampoline to find the
relocation. Then, the function descriptor is updated.
In a multi-threaded application, it is possible for the global pointer
to be updated between the load of the target address and the global
pointer. When this happens, the relocation offset has been replaced
by the new global pointer. The function pointer has probably been
updated as well but there is no way to find the address of the function
descriptor and to transfer to the target. So, _dl_runtime_resolve()
typically crashes.
HP-UX addressed this problem by adding an extra pc-relative branch to
the trampoline. The descriptor is initially setup to point to the
branch. The branch then transfers to the trampoline. This allowed
the trampoline code to figure out which descriptor was being used
without any modification to user code. I didn't use this approach
as it is more complex and changes function pointer canonicalization.
The order of loading the target address and global pointer in
indirect calls was not consistent with the order used in import stubs.
In particular, $$dyncall and some inline versions of it loaded the
global pointer first. This was inconsistent with the global pointer
being updated first in dl-machine.h. Assuming the accesses are
ordered, we want elf_machine_fixup_plt() to store the global pointer
first and calls to load it last. Then, the global pointer will be
correct when the target function is entered.
However, just to make things more fun, HP added support for
out-of-order execution of accesses in PA 2.0. The accesses used by
calls are weakly ordered. So, it's possibly under some circumstances
that a function might be entered with the wrong global pointer.
However, HP uses weakly ordered accesses in 64-bit HP-UX, so I assume
that loading the global pointer in the delay slot of the branch must
work consistently.
The basic fix for the race is a combination of modifying user code to
preserve the address of the function descriptor in register %r22 and
setting the least-significant bit in the relocation offset. The
latter was suggested by Carlos as a way to distinguish relocation
offsets from global pointer values. Conventionally, %r22 is used
as the address of the function descriptor in calls to $$dyncall.
So, it wasn't hard to preserve the address in %r22.
I have updated gcc trunk and gcc-9 branch to not clobber %r22 in
$$dyncall and inline indirect calls. I have also modified the import
stubs in binutils trunk and the 2.33 branch to preserve %r22. This
required making the stubs one instruction longer but we save one
relocation. I also modified binutils to align the .plt section on
an 8-byte boundary. This allows descriptors to be updated atomically
with a floating-point store.
With these changes, _dl_runtime_resolve() can fall back to an alternate
mechanism to find the relocation offset when it has been clobbered.
There's just one additional instruction in the fast path. I tested
the fallback function, _dl_fix_reloc_arg(), by changing the branch to
always use the fallback. Old code still runs as it did before.
Fixes bug 23296.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-03-30 20:36:49 +00:00
|
|
|
/* Fill in some frame info to follow ABI */
|
2006-09-07 16:34:43 +00:00
|
|
|
stw %r1,-4(%sp) /* Previous sp */
|
2006-05-14 23:54:47 +00:00
|
|
|
stw %r21,-32(%sp) /* PIC register value */
|
2006-09-07 16:34:43 +00:00
|
|
|
|
|
|
|
/* Create La_hppa_retval */
|
2013-06-05 20:26:40 +00:00
|
|
|
/* -140, lrv_r28
|
2006-09-07 16:34:43 +00:00
|
|
|
-136, lrv_r29
|
2013-06-05 20:26:40 +00:00
|
|
|
-132, 4 byte pad
|
2006-09-07 16:34:43 +00:00
|
|
|
-128, lr_fr4 (8 bytes) */
|
|
|
|
|
|
|
|
/* Create save space for _dl_profile_fixup arguments
|
2013-06-05 20:26:40 +00:00
|
|
|
-120, Saved reloc offset
|
|
|
|
-116, Saved struct link_map
|
2006-09-07 16:34:43 +00:00
|
|
|
-112, *framesizep */
|
|
|
|
|
|
|
|
/* Create La_hppa_regs */
|
|
|
|
/* 32-bit registers */
|
|
|
|
stw %r26,-108(%sp)
|
|
|
|
stw %r25,-104(%sp)
|
|
|
|
stw %r24,-100(%sp)
|
|
|
|
stw %r23,-96(%sp)
|
|
|
|
/* -92, 4 byte pad */
|
|
|
|
/* 64-bit floating point registers */
|
|
|
|
ldo -88(%sp),%r1
|
|
|
|
fstd,ma %fr4,8(%r1)
|
|
|
|
fstd,ma %fr5,8(%r1)
|
|
|
|
fstd,ma %fr6,8(%r1)
|
|
|
|
fstd,ma %fr7,8(%r1)
|
|
|
|
|
2020-03-30 20:36:49 +00:00
|
|
|
/* Test PA_GP_RELOC bit. */
|
|
|
|
bb,>= %r19,31,2f /* branch if not reloc offset */
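The bit test above follows the scheme described in the commit message: a relocation offset stored in the global-pointer slot of an unresolved descriptor has its least-significant bit set, so it can be told apart from a real global pointer. A minimal C sketch of that idea follows. The helper names are hypothetical, the value 1 for `PA_GP_RELOC` is an assumption based on the "least-significant bit" wording, and the sketch also assumes that real global pointer values always have that bit clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: the least-significant bit marks a relocation offset. */
#define PA_GP_RELOC UINT32_C(1)

/* Hypothetical helper: tag a relocation offset before it is stored in the
   global-pointer slot of an unresolved function descriptor. */
static uint32_t tag_reloc_offset(uint32_t reloc_offset)
{
    return reloc_offset | PA_GP_RELOC;
}

/* Hypothetical helper: true if the value loaded from the slot is still a
   tagged relocation offset, false if it looks like a global pointer. */
static bool is_reloc_offset(uint32_t slot_value)
{
    return (slot_value & PA_GP_RELOC) != 0;
}
```

When the check fails, the trampoline branches to label 2, which recovers the relocation offset from the descriptor address preserved in %r22.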
|
|
|
|
/* 32-bit stack pointer */
|
|
|
|
stw %sp,-56(%sp)
|
2006-05-14 23:54:47 +00:00
|
|
|
|
2014-04-29 07:08:48 +00:00
|
|
|
/* Set up args to fixup func, needs five arguments */
|
2006-05-14 23:54:47 +00:00
|
|
|
ldw 8+4(%r20),%r26 /* (1) got[1] == struct link_map */
|
2006-09-07 16:34:43 +00:00
|
|
|
stw %r26,-116(%sp) /* Save struct link_map */
|
2006-05-14 23:54:47 +00:00
|
|
|
copy %r19,%r25 /* (2) reloc offset */
|
2006-09-07 16:34:43 +00:00
|
|
|
stw %r25,-120(%sp) /* Save reloc offset */
|
2006-05-14 23:54:47 +00:00
|
|
|
copy %rp,%r24 /* (3) profile_fixup needs rp */
|
2006-09-07 16:34:43 +00:00
|
|
|
ldo -56(%sp),%r23 /* (4) La_hppa_regs */
|
|
|
|
ldo -112(%sp), %r1
|
2006-05-14 23:54:47 +00:00
|
|
|
stw %r1, -52(%sp) /* (5) long int *framesizep */
|
|
|
|
|
2014-04-29 07:08:48 +00:00
|
|
|
/* Call the real address resolver. */
|
2020-03-30 20:36:49 +00:00
|
|
|
3: bl _dl_profile_fixup,%rp
|
2006-05-14 23:54:47 +00:00
|
|
|
copy %r21,%r19 /* set fixup func ltp */
|
|
|
|
|
2006-09-07 16:34:43 +00:00
|
|
|
/* Load up the returned function descriptor */
|
|
|
|
copy %r28, %r22
|
|
|
|
copy %r29, %r19
|
|
|
|
|
|
|
|
/* Restore gr/fr/sp/rp */
|
|
|
|
ldw -108(%sp),%r26
|
|
|
|
ldw -104(%sp),%r25
|
|
|
|
ldw -100(%sp),%r24
|
|
|
|
ldw -96(%sp),%r23
|
|
|
|
/* -92, 4 byte pad, skip */
|
|
|
|
ldo -88(%sp),%r1
|
|
|
|
fldd,ma 8(%r1),%fr4
|
|
|
|
fldd,ma 8(%r1),%fr5
|
|
|
|
fldd,ma 8(%r1),%fr6
|
|
|
|
fldd,ma 8(%r1),%fr7
|
2020-03-30 20:36:49 +00:00
|
|
|
|
|
|
|
/* Reload rp register -(192+20) without adjusting stack */
|
|
|
|
ldw -212(%sp),%rp
|
2006-09-07 16:34:43 +00:00
|
|
|
|
|
|
|
/* Reload static link register -(192+16) without adjusting stack */
|
|
|
|
ldw -208(%sp),%r29
|
|
|
|
|
|
|
|
/* *framesizep is >= 0 if we have to run pltexit */
|
|
|
|
ldw -112(%sp),%r28
|
|
|
|
cmpb,>>=,N %r0,%r28,L(cpe)
|
2006-05-14 23:54:47 +00:00
|
|
|
|
|
|
|
/* Adjust sp, and restore function result address */
|
2006-09-07 16:34:43 +00:00
|
|
|
ldwm -192(%sp),%r28
|
2006-05-14 23:54:47 +00:00
|
|
|
/* Jump to new function, but return to previous function */
|
|
|
|
bv %r0(%r22)
|
|
|
|
ldw -20(%sp),%rp
|
2006-09-07 16:34:43 +00:00
|
|
|
/* NO RETURN */
|
|
|
|
|
|
|
|
L(nf):
|
|
|
|
/* Call the returned function descriptor */
|
|
|
|
bv %r0(%r22)
|
|
|
|
nop
|
|
|
|
b,n L(cont)
|
|
|
|
|
|
|
|
L(cpe):
|
2013-06-05 20:26:40 +00:00
|
|
|
/* We are going to call the resolved function, but we have a
|
2006-09-07 16:34:43 +00:00
|
|
|
stack frame in the middle. We use the value of framesize to
|
|
|
|
guess how much extra frame we need, and how much frame to
|
|
|
|
copy forward. */
|
|
|
|
|
|
|
|
/* Round to nearest multiple of 64 */
|
|
|
|
addi 63, %r28, %r28
|
|
|
|
depi 0, 27, 6, %r28
|
|
|
|
|
|
|
|
/* Calculate start of stack copy */
|
|
|
|
ldo -192(%sp),%r2
|
|
|
|
|
|
|
|
/* Increase the stack by *framesizep */
|
|
|
|
copy %sp, %r1
|
|
|
|
add %sp, %r28, %sp
|
|
|
|
/* Save stack pointer */
|
|
|
|
stw %r1, -4(%sp)
|
|
|
|
|
|
|
|
/* Single byte copy of previous stack onto newly allocated stack */
|
|
|
|
1: ldb %r28(%r2), %r1
|
|
|
|
add %r28, %sp, %r26
|
|
|
|
stb %r1, 0(%r26)
|
|
|
|
addi,< -1,%r28,%r28
|
|
|
|
b,n 1b
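The loop at label 1 copies the frame one byte at a time, counting the index in %r28 down until it goes negative. A rough C equivalent is sketched below, assuming the `addi,<` condition nullifies the backward branch once the decremented index drops below zero. The function and parameter names are hypothetical.

```c
/* Hypothetical C rendering of the byte copy loop: copy the bytes at
   indices count, count - 1, ..., 0 from src to dst. */
static void copy_frame_bytes(unsigned char *dst, const unsigned char *src,
                             long count)
{
    for (long i = count; i >= 0; i--)
        dst[i] = src[i];
}
```

Note that index `count` itself is copied, so `count + 1` bytes move in total.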
|
|
|
|
|
|
|
|
/* Restore r28 and r26; r2 already points at -192(%sp) */
|
|
|
|
ldw 0(%r2),%r28
|
|
|
|
ldw 84(%r2),%r26
|
|
|
|
|
|
|
|
/* Calculate address of L(cont) */
|
|
|
|
b,l L(nf),%r2
|
|
|
|
depwi 0,31,2,%r2
|
|
|
|
L(cont):
|
|
|
|
/* Undo fake stack */
|
|
|
|
ldw -4(%sp),%r1
|
|
|
|
copy %r1, %sp
|
|
|
|
|
|
|
|
/* Arguments to _dl_call_pltexit */
|
2013-06-05 20:26:40 +00:00
|
|
|
ldw -116(%sp), %r26 /* (1) got[1] == struct link_map */
|
2014-04-29 07:08:48 +00:00
|
|
|
ldw -120(%sp), %r25 /* (2) reloc offsets */
|
2006-09-07 16:34:43 +00:00
|
|
|
ldo -56(%sp), %r24 /* (3) *La_hppa_regs */
|
|
|
|
ldo -124(%sp), %r23 /* (4) *La_hppa_retval */
|
|
|
|
|
|
|
|
/* Fill *La_hppa_retval */
|
|
|
|
stw %r28,-140(%sp)
|
|
|
|
stw %r29,-136(%sp)
|
|
|
|
ldo -128(%sp), %r1
|
|
|
|
fstd %fr4,0(%r1)
|
|
|
|
|
|
|
|
/* Call _dl_call_pltexit */
|
|
|
|
bl _dl_call_pltexit,%rp
|
|
|
|
nop
|
|
|
|
|
|
|
|
/* Restore *La_hppa_retval */
|
|
|
|
ldw -140(%sp), %r28
|
|
|
|
ldw -136(%sp), %r29
|
|
|
|
ldo -128(%sp), %r1
|
|
|
|
fldd 0(%r1), %fr4
|
|
|
|
|
|
|
|
/* Unwind the stack */
|
|
|
|
ldo 192(%sp),%sp
|
|
|
|
/* Restore caller's rp */
|
|
|
|
ldw -20(%sp),%rp
|
|
|
|
/* Return */
|
|
|
|
bv,n 0(%r2)
|
2020-03-30 20:36:49 +00:00
|
|
|
|
|
|
|
2:
|
|
|
|
/* Set up args for _dl_fix_reloc_arg. */
|
|
|
|
copy %r22,%r26 /* (1) function pointer */
|
|
|
|
depi 0,31,2,%r26 /* clear least significant bits */
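Clearing the two low bits turns the function pointer copied from %r22 into a word-aligned address before it is passed to _dl_fix_reloc_arg. In C the same masking could look like the sketch below. The helper name is hypothetical, and treating the low bits as tag bits rather than address bits is an assumption drawn from the comment above.

```c
#include <stdint.h>

/* Hypothetical helper: drop the two least-significant bits of a function
   pointer value, as `depi 0,31,2` does for %r26. */
static uint32_t strip_low_bits(uint32_t fptr)
{
    return fptr & ~UINT32_C(3);
}
```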
|
|
|
|
ldw 8+4(%r20),%r25 /* (2) got[1] == struct link_map */
|
|
|
|
|
|
|
|
/* Save ltp and link map arg for _dl_fixup. */
|
|
|
|
stw %r21,-92(%sp) /* ltp */
|
|
|
|
stw %r25,-116(%sp) /* struct link map */
|
|
|
|
|
|
|
|
/* Find reloc offset. */
|
|
|
|
bl _dl_fix_reloc_arg,%rp
|
|
|
|
copy %r21,%r19 /* set func ltp */
|
|
|
|
|
|
|
|
/* Restore fixup ltp. */
|
|
|
|
ldw -92(%sp),%r21 /* ltp */
|
|
|
|
|
|
|
|
/* Set up args to fixup func, needs five arguments */
|
|
|
|
ldw -116(%sp),%r26 /* (1) struct link map */
|
|
|
|
copy %ret0,%r25 /* (2) reloc offset */
|
|
|
|
stw %r25,-120(%sp) /* Save reloc offset */
|
|
|
|
ldw -212(%sp),%r24 /* (3) profile_fixup needs rp */
|
|
|
|
ldo -56(%sp),%r23 /* (4) La_hppa_regs */
|
|
|
|
ldo -112(%sp), %r1
|
|
|
|
b 3b
|
|
|
|
stw %r1, -52(%sp) /* (5) long int *framesizep */
|
2006-05-14 23:54:47 +00:00
	.EXIT
	.PROCEND
2007-02-02 21:50:19 +00:00
	cfi_endproc
	.size	_dl_runtime_profile, . - _dl_runtime_profile