glibc/sysdeps/x86/dl-tunables.list
H.J. Lu 848746e88e elf: Add ELF_DYNAMIC_AFTER_RELOC to rewrite PLT
Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after
relocation.

For x86-64, add

 #define DT_X86_64_PLT     (DT_LOPROC + 0)
 #define DT_X86_64_PLTSZ   (DT_LOPROC + 1)
 #define DT_X86_64_PLTENT  (DT_LOPROC + 3)

1. DT_X86_64_PLT: The address of the procedure linkage table.
2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage
table.
3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table
entry.

With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the
memory offset of the indirect branch instruction.

Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section
with direct branch after relocation when the lazy binding is disabled.

PLT rewrite is disabled by default since SELinux may disallow modifying
code pages and ld.so can't detect it in all cases.  Use

$ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1

to enable PLT rewrite with 32-bit direct jump at run-time or

$ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=2

to enable PLT rewrite with 32-bit direct jump and on APX processors with
64-bit absolute jump at run-time.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2024-01-05 05:49:49 -08:00

76 lines
2.7 KiB
Plaintext

# x86 specific tunables.
# Copyright (C) 2017-2024 Free Software Foundation, Inc.
# This file is part of the GNU C Library.
# The GNU C Library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
# The GNU C Library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
# You should have received a copy of the GNU Lesser General Public
# License along with the GNU C Library; if not, see
# <https://www.gnu.org/licenses/>.
glibc {
cpu {
hwcaps {
type: STRING
}
x86_ibt {
type: STRING
}
x86_shstk {
type: STRING
}
x86_non_temporal_threshold {
type: SIZE_T
}
x86_rep_movsb_threshold {
type: SIZE_T
# Since there is overhead to set up REP MOVSB operation, REP
# MOVSB isn't faster on short data. The memcpy micro benchmark
# in glibc shows that 2KB is the approximate value above which
# REP MOVSB becomes faster than SSE2 optimization on processors
# with Enhanced REP MOVSB. Since larger register size can move
# more data with a single load and store, the threshold is
# higher with larger register size. Micro benchmarks show AVX
# REP MOVSB becomes faster apprximately at 8KB. The AVX512
# threshold is extrapolated to 16KB. For machines with FSRM the
# threshold is universally set at 2112 bytes. Note: Since the
# REP MOVSB threshold must be greater than 8 times of vector
# size and the default value is 4096 * (vector size / 16), the
# default value and the minimum value must be updated at
# run-time. NB: Don't set the default value since we can't tell
# if the tunable value is set by user or not [BZ #27069].
minval: 1
}
x86_rep_stosb_threshold {
type: SIZE_T
# Since there is overhead to set up REP STOSB operation, REP STOSB
# isn't faster on short data. The memset micro benchmark in glibc
# shows that 2KB is the approximate value above which REP STOSB
# becomes faster on processors with Enhanced REP STOSB. Since the
# stored value is fixed, larger register size has minimal impact
# on threshold.
minval: 1
default: 2048
}
x86_data_cache_size {
type: SIZE_T
}
x86_shared_cache_size {
type: SIZE_T
}
plt_rewrite {
type: INT_32
minval: 0
maxval: 2
}
}
}