The ELFv2 ABI changes the calling convention by passing and returning
structures in registers in more cases than the old ABI:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01145.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01147.html
For the most part, this does not affect glibc, since glibc assembler
files do not use structure parameters / return values. However, one
place is affected: the LD_AUDIT interface provides a structure to
the audit routine that contains all registers holding function
argument and return values for the intercepted PLT call.
Since the new ABI now sometimes uses registers to return values
that were never used for this purpose in the old ABI, this structure
has to be extended. To force audit routines to be modified for the
new ABI if necessary, the patch defines v2 variants of the la_ppc64
types and routines.
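
As a rough illustration of why a renamed "v2" type helps, the sketch below
models an old and an extended return-value record. All names (sketch_*) and
the register counts are assumptions made for this example; they are not the
definitions from bits/link.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old ABI" record: two GPRs, a few FPRs and one vector
   register.  The counts are illustrative assumptions.  */
typedef struct sketch_retval_v1
{
  uint64_t r3;
  uint64_t r4;
  double fp[4];
  uint32_t vreg[1][4];
} sketch_retval_v1;

/* Hypothetical "new ABI" record: more FPR and vector slots, because
   more registers can now carry return values.  */
typedef struct sketch_retval_v2
{
  uint64_t r3;
  uint64_t r4;
  double fp[8];
  uint32_t vreg[8][4];
} sketch_retval_v2;

/* Each layout gets its own accessor (and, in a real audit module, its
   own hook name), so code written for v1 cannot be handed a v2 record
   without a compile or link error.  */
static inline double
sketch_first_fp_v1 (const sketch_retval_v1 *rv)
{
  return rv->fp[0];
}

static inline double
sketch_first_fp_v2 (const sketch_retval_v2 *rv)
{
  return rv->fp[0];
}

/* The extended record is strictly larger than the old one.  */
static_assert (sizeof (sketch_retval_v2) > sizeof (sketch_retval_v1),
               "v2 record must be larger than v1");
```

Giving the extended layout a new type name, rather than growing the old type
in place, turns a silent layout mismatch into a visible build failure.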
In addition, the patch contains two unrelated changes to the
PLT trampoline routines: it fixes a bug where FPR return values
were stored in the wrong place, and it removes the unnecessary
save/restore of CR.
This updates glibc for the changes in the ELFv2 ABI relating to the
stack frame layout. These are described in more detail here:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01149.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01146.html
Specifically, the "compiler and linker doublewords" were removed,
which has the effect that the save slot for the TOC register is
now at offset 24 rather than 40 relative to the stack pointer.
In addition, a function may no longer assume that its caller has
set up a 64-byte register save area for its use.
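
The offset change can be checked with a small model of the frame header.
This is only a sketch: the struct and enumerator names are hypothetical, and
the slots other than the TOC save slot (back chain, CR save, LR save) are
filled in from the usual powerpc64 frame layout as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old ABI: the compiler and linker doublewords sit between the LR
   save slot and the TOC save slot, and the caller always provides a
   64-byte register save area.  */
struct sketch_frame_elfv1
{
  uint64_t back_chain;
  uint64_t cr_save;
  uint64_t lr_save;
  uint64_t compiler_dword;
  uint64_t linker_dword;
  uint64_t toc_save;
  uint64_t reg_save_area[8];
};

/* ELFv2: the two reserved doublewords are gone, and the callee may
   not assume that a register save area follows the header.  */
struct sketch_frame_elfv2
{
  uint64_t back_chain;
  uint64_t cr_save;
  uint64_t lr_save;
  uint64_t toc_save;
};

/* Symbolic offsets derived from the layouts, in the spirit of using
   names instead of immediate offsets in assembler code.  */
enum
{
  SKETCH_TOC_SAVE_ELFV1 = offsetof (struct sketch_frame_elfv1, toc_save),
  SKETCH_TOC_SAVE_ELFV2 = offsetof (struct sketch_frame_elfv2, toc_save)
};

static_assert (SKETCH_TOC_SAVE_ELFV1 == 40, "old ABI TOC save slot");
static_assert (SKETCH_TOC_SAVE_ELFV2 == 24, "ELFv2 TOC save slot");
```

Removing two 8-byte doublewords accounts exactly for the move from offset 40
to offset 24.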
To address the first change, the patch goes through all assembler
files and replaces immediate offsets in instructions accessing the
ABI-defined stack slots by symbolic offsets. Those already were
defined in ucontext_i.sym and used in some of the context routines,
but that doesn't really seem like the right place for those defines.
The patch instead defines those symbolic offsets in sysdep.h,
in two variants for the old and new ABI, and uses them systematically
in all assembler files, not just the context routines.
The second change only affected a few assembler files that used
the save area to temporarily store some registers. In those
cases where this happens within a leaf function, this patch
changes the code to store those registers to the "red zone"
below the stack pointer. Otherwise, the functions already allocate
a stack frame, and the patch changes them to add extra space in
these frames as temporary space for the ELFv2 ABI.
This is a follow-on to the previous patch to support the ELFv2 ABI in the
dynamic loader, split off into its own patch since it is just an optional
optimization.
In the ELFv2 ABI, most functions define both a global and a local entry
point. The local entry requires r2 to be already set up by the caller
to point to the callee's TOC. The global entry does not require the
caller to know about the callee's TOC, but the caller must set up r12
to hold the callee's entry point address.
Now, when setting up a PLT slot, the dynamic linker will usually need
to enter the target function's global entry point. However, if the
linker can prove that the target function is in the same DSO as the
PLT slot itself, and the whole DSO only uses a single TOC (which the
linker will let ld.so know via a DT_PPC64_OPT entry), then it is
possible to actually enter the local entry point address into the
PLT slot, for a slight improvement in performance.
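
The decision described above could be sketched roughly as follows. Every
name here is hypothetical, and both the st_other encoding of the local entry
offset and the "multiple TOCs" flag value are assumptions of this example
rather than quotations from elf.h or dl-machine.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: three bits of the symbol's st_other field hold a
   code for the distance from the global to the local entry point.  */
#define SKETCH_STO_LOCAL_BIT  5
#define SKETCH_STO_LOCAL_MASK (7u << SKETCH_STO_LOCAL_BIT)

/* Assumed flag in the DT_PPC64_OPT value: the DSO uses several TOCs.  */
#define SKETCH_OPT_MULTI_TOC  2u

/* Decode the local entry offset in bytes (codes 0 and 1 give 0).  */
static inline uint64_t
sketch_local_entry_offset (unsigned char st_other)
{
  unsigned int code
    = (st_other & SKETCH_STO_LOCAL_MASK) >> SKETCH_STO_LOCAL_BIT;
  return ((1u << code) >> 2) << 2;
}

/* Pick the address to store in a PLT slot.  */
static inline uint64_t
sketch_plt_target (uint64_t global_entry, unsigned char st_other,
                   bool same_dso, uint64_t ppc64_opt)
{
  /* A callee in another DSO has a different TOC: the global entry
     must be used so the callee can set up its own r2.  */
  if (!same_dso)
    return global_entry;

  /* With multiple TOCs in one DSO, the caller's r2 may not match.  */
  if (ppc64_opt & SKETCH_OPT_MULTI_TOC)
    return global_entry;

  /* Same DSO, single TOC: skip the TOC setup at the global entry.  */
  return global_entry + sketch_local_entry_offset (st_other);
}
```

With a code of 3 in the assumed st_other field, the local entry lies 8 bytes
past the global entry, which corresponds to skipping two instructions.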
Note that this uncovered a problem on the first call via _dl_runtime_resolve,
because that routine neglected to restore the caller's TOC before calling
the target function for the first time, since it assumed that function
would always reload its own TOC anyway ...
This is the first patch to support the new ELFv2 ABI in glibc.
As preparation, this patch simply refactors some of the powerpc64 assembler
code to move all code related to creating function descriptors (.opd section)
or using function descriptors (function pointer call) into a central place
in sysdep.h.
Note that most locations creating .opd entries were already using macros
in sysdep.h; this patch simply extends this to the remaining places.
No relevant change in generated code expected.
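
For readers unfamiliar with function descriptors, the following portable
model shows what "using a function descriptor" for a function pointer call
involves. It is a simulation only: the names are hypothetical, a global
variable stands in for the r2 register, and the three-field layout (entry
address, TOC pointer, environment pointer) is stated here as background, not
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*sketch_code_fn) (long);

/* Model of one descriptor: code address, TOC pointer, environment.  */
struct sketch_opd_entry
{
  sketch_code_fn entry;
  void *toc;
  void *env;
};

/* Stand-in for the TOC register (r2).  */
static void *sketch_r2;

/* Example callee: reads a value through its "TOC".  */
static long
sketch_callee (long x)
{
  return x + *(const long *) sketch_r2;
}

/* Call through a descriptor: save the caller's TOC, load the callee's
   TOC, branch to the entry address, then restore the caller's TOC.  */
static inline long
sketch_call_via_descriptor (const struct sketch_opd_entry *fd, long arg)
{
  void *saved_toc = sketch_r2;
  sketch_r2 = fd->toc;
  long result = fd->entry (arg);
  sketch_r2 = saved_toc;
  return result;
}
```

Keeping this sequence in one central macro location means an ABI without
descriptors only needs to change that one place.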
of non-volatile floating-point registers to the stack (fp14-fp31).
* sysdeps/powerpc/powerpc32/gprsave0.S: Add cfi_offset for spilling of
non-volatile general-purpose registers to the stack (gpr13-gpr31).
* sysdeps/powerpc/powerpc64/dl-trampoline.S: Add cfi_offset
for non-volatiles gpr30 - gpr31 spilled to the stack.
* sysdeps/powerpc/powerpc64/memcpy.S: Add cfi_offset for non-volatile
gpr31 spill to the stack.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/setcontext-common.S:
Add cfi_offset for non-volatile gpr31 spill to the stack.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Add cfi_offset
for non-volatiles gpr28 - gpr31 spilled to the stack.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/getcontext.S: Add
cfi_adjust_cfa_offset when a frame is stacked.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/setcontext.S:
(__novec_setcontext): Add cfi_offset for non-volatile gpr31 spill
and LR saved to the stack. Add cfi_adjust_cfa_offset when frame is
stacked.
(__setcontext): Add cfi_offset for non-volatile gpr31 spill to
the stack.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/swapcontext.S:
(__novec_swapcontext): Add cfi_offset for non-volatile gpr31 spill
and LR saved to the stack.
(__swapcontext): Add cfi_offset for non-volatile gpr31 spill and LR
saved to the stack. Add cfi_adjust_cfa_offset when frame is stacked.
(La_ppc64_retval): Correct size of lrc_fp.
* sysdeps/powerpc/powerpc64/dl-trampoline.S (_dl_profile_resolve):
Fix up ABI problems and complete function.
* sysdeps/powerpc/powerpc64/dl-machine.h
(elf_machine_runtime_setup): profile != 0 no longer implies
GLRO(dl_profile) != NULL.
* sysdeps/powerpc/powerpc64/bits/link.h (struct la_ppc64_regs): Add
padding.
* sysdeps/powerpc/powerpc64/dl-trampoline.S (_dl_profile_resolve):
Extend _dl_prof_resolve to pass extra parameters to
_dl_profile_fixup and set up structure with register content.
* sysdeps/powerpc/powerpc32/dl-trampoline.S (_dl_prof_resolve):
Extend _dl_prof_resolve to pass extra parameters to
_dl_profile_fixup and set up structure with register content.