I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later. The new codepath provides better
performance (see below) if compared to using sc. For the
foreseeable future, both sc and scv mechanisms will co-exist, so this
patch enables glibc to do a runtime check and use scv when it is
available.
Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel. If not,
we fallback to sc and keep the old behavior.
The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.
For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone). To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from
hwcap2 is cached on a non-volatile register.
This is not needed when using INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.
The dynamic loader may issue syscalls before the TCB has been setup
so it always uses sc with no extra checks. For the static case, there
is no compile-time way to determine if we are inside startup code,
so we also check the value of the thread pointer before effectively
accessing the TCB. For such situations in which the availability of
scv cannot be determined, sc is always used.
Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later. For now simply use sc as before.
Average performance over 1M calls for each syscall "type":
- stat: C wrapper calling INTERNAL_SYSCALL
- getpid: templated ASM syscall
- syscall: call to gettid using syscall function
Standard:
stat : 1.573445 us / ~3619 cycles
getpid : 0.164986 us / ~379 cycles
syscall : 0.162743 us / ~374 cycles
With scv:
stat : 1.537049 us / ~3535 cycles <~ -84 cycles / -2.32%
getpid : 0.109923 us / ~253 cycles <~ -126 cycles / -33.25%
syscall : 0.116410 us / ~268 cycles <~ -106 cycles / -28.34%
Tested on powerpc, powerpc64, powerpc64le (with and without scv)
Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
This patch remove the PID cache and usage in current GLIBC code. Current
usage is mainly used a performance optimization to avoid the syscall,
however it adds some issues:
- The exposed clone syscall will try to set pid/tid to make the new
thread somewhat compatible with current GLIBC assumptions. This cause
a set of issue with new workloads and usecases (such as BZ#17214 and
[1]) as well for new internal usage of clone to optimize other algorithms
(such as clone plus CLONE_VM for posix_spawn, BZ#19957).
- The caching complexity also added some bugs in the past [2] [3] and
requires more effort of each port to handle such requirements (for
both clone and vfork implementation).
- Caching performance gain in mainly on getpid and some specific
code paths. The getpid performance leverage is questionable [4],
either by the idea of getpid being a hotspot as for the getpid
implementation itself (if it is indeed a justifiable hotspot a
vDSO symbol could let to a much more simpler solution).
Other usage is mainly for non usual code paths, such as pthread
cancellation signal and handling.
For thread creation (on stack allocation) the code simplification in fact
adds some performance gain due the no need of transverse the stack cache
and invalidate each element pid.
Other thread usages will require a direct getpid syscall, such as
cancellation/setxid signal, thread cancellation, thread fail path (at
create_thread), and thread signal (pthread_kill and pthread_sigqueue).
However these are hardly usual hotspots and I think adding a syscall is
justifiable.
It also simplifies both the clone and vfork arch-specific implementation.
And by review each fork implementation there are some discrepancies that
this patch also solves:
- microblaze clone/vfork does not set/reset the pid/tid field
- hppa uses the default vfork implementation that fallback to fork.
Since vfork is deprecated I do not think we should bother with it.
The patch also removes the TID caching in clone. My understanding for
such semantic is try provide some pthread usage after a user program
issue clone directly (as done by thread creation with CLONE_PARENT_SETTID
and pthread tid member). However, as stated before in multiple discussions
threads, GLIBC provides clone syscalls without further supporting all this
semantics.
I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le.
For sparc32, sparc64, and mips I ran the basic fork and vfork tests from
posix/ folder (on a qemu system). So it would require further testing
on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze
because it is already implementing the patch semantic regarding clone/vfork).
[1] https://codereview.chromium.org/800183004/
[2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368
[4] http://yarchive.net/comp/linux/getpid_caching.html
* sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting.
* nptl/allocatestack.c (allocate_stack): Likewise.
(__reclaim_stacks): Likewise.
(setxid_signal_thread): Obtain pid through syscall.
* nptl/nptl-init.c (sigcancel_handler): Likewise.
(sighandle_setxid): Likewise.
* nptl/pthread_cancel.c (pthread_cancel): Likewise.
* sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise.
* sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue):
Likewise.
* sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise.
* sysdeps/unix/sysv/linux/getpid.c: Remove file.
* nptl/descr.h (struct pthread): Change comment about pid value.
* nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread
pid assert.
* sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids):
Do not set pid value.
* nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread
pid cache check.
* nptl_db/td_thr_validate.c (td_thr_validate): Likewise.
* sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset.
* sysdeps/alpha/nptl/tcb-offsets.sym: Likewise.
* sysdeps/arm/nptl/tcb-offsets.sym: Likewise.
* sysdeps/hppa/nptl/tcb-offsets.sym: Likewise.
* sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
* sysdeps/ia64/nptl/tcb-offsets.sym: Likewise.
* sysdeps/m68k/nptl/tcb-offsets.sym: Likewise.
* sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise.
* sysdeps/mips/nptl/tcb-offsets.sym: Likewise.
* sysdeps/nios2/nptl/tcb-offsets.sym: Likewise.
* sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/s390/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sparc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/tile/nptl/tcb-offsets.sym: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
* sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching.
* sysdeps/unix/sysv/linux/alpha/clone.S: Likewise.
* sysdeps/unix/sysv/linux/arm/clone.S: Likewise.
* sysdeps/unix/sysv/linux/hppa/clone.S: Likewise.
* sysdeps/unix/sysv/linux/i386/clone.S: Likewise.
* sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise.
* sysdeps/unix/sysv/linux/mips/clone.S: Likewise.
* sysdeps/unix/sysv/linux/nios2/clone.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sh/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/tile/clone.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset.
* sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/arm/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/i386/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/clone.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/mips/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sh/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/tile/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread
struct access.
(clone_test): Remove function.
(do_test): Rewrite to take in consideration pid is not cached anymore.
2003-01-14 Guido Guenther <agx@sigxcpu.org>
* sysdeps/unix/sysv/linux/mips/sysdep.h (INTERNAL_SYSCALL,
INTERNAL_SYSCALL_DECL, INTERNAL_SYSCALL_ERRNO,
INTERNAL_SYSCALL_ERROR_P, INLINE_SYSCALL): Define.
2003-01-14 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
(INTERNAL_SYSCALL): Make use of ERR parameter.
(INTERNAL_SYSCALL_DECL, INTERNAL_SYSCALL_ERRNO,
INTERNAL_SYSCALL_ERROR_P): Adjust accordingly.
(INLINE_SYSCALL): Make use of INTERNAL_SYSCALL.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: New file.
Patch by Denis Zaitsev <zzz@cd-club.ru>.
that %eax is modified. Reported by Denis Zaitsev <zzz@cd-club.ru>.