The memcpy optimization (commit 587a1290a1) has a series
of mistakes:
- The implementation is wrong: the chunk size calculation is wrong
leading to invalid memory access.
- It adds ifunc supports as default, so --disable-multi-arch does
not work as expected for riscv.
- It mixes Linux files (memcpy ifunc selection which requires the
vDSO/syscall mechanism) with generic support (the memcpy
optimization itself).
- There is no __libc_ifunc_impl_list, which makes testing only
check the selected implementation instead of all supported
by the system.
This patch also simplifies the required bits to enable ifunc: there
is no need to memcopy.h; nor to add Linux-specific files.
The __memcpy_noalignment tail handling now uses a branchless strategy
similar to aarch64 (overlap 32-bits copies for sizes 4..7 and byte
copies for size 1..3).
Checked on riscv64 and riscv32 by explicitly enabling the function
on __libc_ifunc_impl_list on qemu-system.
Changes from v1:
* Implement the memcpy in assembly to correctly handle RISCV
strict-alignment.
Reviewed-by: Evan Green <evan@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
For CPU implementations that can perform unaligned accesses with little
or no performance penalty, create a memcpy implementation that does not
bother aligning buffers. It will use a block of integer registers, a
single integer register, and fall back to bytewise copy for the
remainder.
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add awareness and a thin wrapper function around a new Linux system call
that allows callers to get architecture and microarchitecture
information about the CPUs from the kernel. This can be used to
do things like dynamically choose a memcpy implementation.
Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This patch lays out the top-level organisation of the RISC-V 32-bit port.
It provides all the Implies files as well as various other fragments of
the build infrastructure.
Reviewed-by: Maciej W. Rozycki <macro@wdc.com>
This patch moves the vDSO setup from libc to loader code, just after
the vDSO link_map setup. For static case the initialization
is moved to _dl_non_dynamic_init instead.
Instead of using the mangled pointer, the vDSO data is set as
attribute_relro (on _rtld_global_ro for shared or _dl_vdso_* for
static). It is read-only even with partial relro.
It fixes BZ#24967 now that the vDSO pointer is setup earlier than
malloc interposition is called.
Also, vDSO calls should not be a problem for static dlopen as
indicated by BZ#20802. The vDSO pointer would be zero-initialized
and the syscall will be issued instead.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, powerpc64-linux-gnu,
powerpc-linux-gnu, s390x-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu. I also run some tests on mips.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
_dl_var_init is used to patch the read-only data section after
relocation. Several architectures use this to update
GLRO(page_size) with the correct value for the static dlopen case,
where _rtld_global_ro has not been initialized by the dynamic
loader.
RISC-V does not need this. The RISC-V Instruction Set Manual,
Volume II: Privileged Architecture, Document Version
20190608-Priv-MSU-Ratified says this:
After much deliberation, we have settled on a conventional
page size of 4 KiB for both RV32 and RV64. We expect this
decision to ease the porting of low-level runtime software
and device drivers. The TLB reach problem is ameliorated by
transparent superpage support in modern operating systems
[2]. Additionally, multi-level TLB hierarchies are quite
inexpensive relative to the multi-level cache hierarchies
whose address space they map.
[2] Juan Navarro, Sitaram Iyer, Peter Druschel, and
Alan Cox. Practical, transparent operating system support
for superpages. SIGOPS Oper. Syst. Rev., 36(SI):89–104,
December 2002.
This means that the initialization of
_rtld_global_ro._dl_page_size in elf/rtld.c with EXEC_PAGESIZE
is sufficient for RISC-V.
This patch lays out the top-level orginazition of the RISC-V port. It
contains all the Implies files as well as various other fragments of
build infastructure for the RISC-V port. This contains the only change
to a shared file: config.h.in.
RISC-V is a family of base ISAs with optional extensions. The base ISAs
are RV32I and RV64I, which are 32-bit and 64-bit integer-only ISAs, but
this port currently only supports RV64I based systems. Support for
RISC-V lives in in sysdeps/riscv. In addition to these ISAs, our glibc
port supports most of the currently-defined extensions: the A extension
for atomics, the M extension for multiplication, the C extension for
compressed instructions, and the F/D extensions for single/double
precision IEEE floating-point. Most of these extensions are handled by
GCC, but glibc defines various floating-point wrappers and emulation
routines as well as some atomic wrappers.
We support running glibc-based programs on Linux, the support for which
lives in sysdeps/unix/sysv/linux/riscv.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/Implies: New file.
* sysdeps/riscv/Makefile: Likewise.
* sysdeps/riscv/configure: Likewise.
* sysdeps/riscv/configure.ac: Likewise.
* sysdeps/riscv/nptl/Makefile: Likewise.
* sysdeps/riscv/preconfigure: Likewise.
* sysdeps/riscv/rv64/Implies-after: Likewise.
* sysdeps/riscv/rv64/rvd/Implies: Likewise.
* sysdeps/riscv/rv64/rvf/Implies: Likewise.
* sysdeps/unix/sysv/linux/riscv/Implies: Likewise.
* sysdeps/unix/sysv/linux/riscv/Makefile: Likewise.
* sysdeps/unix/sysv/linux/riscv/Versions: Likewise.
* sysdeps/unix/sysv/linux/riscv/configure: Likewise.
* sysdeps/unix/sysv/linux/riscv/configure.ac: Likewise.
* sysdeps/unix/sysv/linux/riscv/ldd-rewrite.sed: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/Implies: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/Makefile: Likewise.
* sysdeps/unix/sysv/linux/riscv/shlib-versions: Likewise.