This patch does not have any functionality change, we only provide a spin
count tunes for pthread adaptive spin mutex. The tunable
glibc.pthread.mutex_spin_count tunes can be used by system administrator to
squeeze system performance according to different hardware capabilities and
workload characteristics.
The maximum value of spin count is limited to 32767 to avoid the overflow
of mutex->__data.__spins variable with the possible type of short in
pthread_mutex_lock ().
The default value of spin count is set to 100 with the reference to the
previous number of times of spinning via trylock. This value would be
architecture-specific and can be tuned with kinds of benchmarks to fit most
cases in future.
I would extend my appreciation sincerely to H.J.Lu for his help to refine
this patch series.
* manual/tunables.texi (POSIX Thread Tunables): New node.
* nptl/Makefile (libpthread-routines): Add pthread_mutex_conf.
* nptl/nptl-init.c: Include pthread_mutex_conf.h
(__pthread_initialize_minimal_internal) [HAVE_TUNABLES]: Call
__pthread_tunables_init.
* nptl/pthreadP.h (MAX_ADAPTIVE_COUNT): Remove.
(max_adaptive_count): Define.
* nptl/pthread_mutex_conf.c: New file.
* nptl/pthread_mutex_conf.h: New file.
* sysdeps/generic/adaptive_spin_count.h: New file.
* sysdeps/nptl/dl-tunables.list: New file.
* nptl/pthread_mutex_lock.c (__pthread_mutex_lock): Use
max_adaptive_count () not MAX_ADAPTIVE_COUNT.
* nptl/pthread_mutex_timedlock.c (__pthrad_mutex_timedlock):
Likewise.
Suggested-by: Andi Kleen <andi.kleen@intel.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Kemi.wang <kemi.wang@intel.com>
This patch uses posix_spawn on system implementation. On Linux this has
the advantage of much lower memory consumption (usually 32 Kb minimum for
the mmap stack area).
Although POSIX does not require, glibc system implementation aims to be
thread and cancellation safe. The cancellation code is moved to generic
implementation and enabled iff SIGCANCEL is defined (similar on how the
cancellation handler is enabled on nptl-init.c).
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, and powerpc64le-linux-gnu.
* sysdeps/unix/sysv/linux/spawni.c (__spawni_child): Use
__sigismember instead of sigismember.
* sysdeps/posix/system.c [SIGCANCEL] (cancel_handler_args,
cancel_handler): New definitions.
(CLEANUP_HANDLER, CLEANUP_RESET): Likewise.
(DO_LOCK, DO_UNLOCK, INIT_LOCK, ADD_REF, SUB_REF): Remove.
(do_system): Use posix_spawn instead of fork and execl and remove
reentracy code.
* sysdeps/generic/not-errno.h (__kill_noerrno): New prototype.
* sysdeps/unix/sysv/linux/not-errno.h (__kill_noerrno): Likewise.
* sysdeps/unix/sysv/linux/ia64/system.c: Remove file.
* sysdeps/unix/sysv/linux/s390/system.c: Likewise.
* sysdeps/unix/sysv/linux/sparc/system.c: Likewise.
* sysdeps/unix/sysv/linux/system.c: Likewise.
The logic for generating sysdeps/mach/hurd/bits/errno.h involves a
stamp file and $(move-if-change).
The temporary file (generated unconditionally) is generated in the
source directory. This means that even if
sysdeps/mach/hurd/bits/errno.h is up to date, and has an up to date
timestamp, the build will fail if the source directory is read-only.
Even with a writable source directory, multiple concurrent builds for
i686-gnu with the same source directory could race to access the
temporary file (which always has the same name).
This patch uses the build directory for the temporary file instead to
avoid those problems. (In the case where the file is out of date and
the temporary file does need to be moved to the source directory, if
there are multiple concurrent builds for i686-gnu with the same source
directory, and the source and build directories are on different
filesystems, it's possible there might still be races replacing the
file in the source directory, depending on exactly how mv handles such
cross-filesystem moves. This is certainly no worse than the present
situation, where such a case would have races regardless of whether
the file is out of date or whether different filesystems are in use.)
Tested with a build-many-glibcs.py build for i686-gnu.
* sysdeps/mach/hurd/Makefile ($(common-objpfx)stamp-errnos): Use
$(hurd-objpfx)bits/errno.h-tmp, not $(hurd)/bits/errno.h-tmp.
All the required code already existed, and some of it was already
running.
AT_SYSINFO_EHDR is processed if NEED_DL_SYSINFO_DSO is defined, but it
looks like it always is. The call to setup_vdso is also unconditional,
so all that was left to do was setup the function pointers and use
them. This patch just deletes some #ifdef to enable that.
[BZ #19767]
* nptl/Makefile (tests-static): Add tst-cond11-static.
(tests): Likewise.
* nptl/tst-cond11-static.c: New File.
* sysdeps/unix/sysv/linux/Makefile (tests-static): Add
tst-affinity-static.
(tests): Likewise.
* sysdeps/unix/sysv/linux/sysdep-vdso.h: Check USE_VSYSCALL
instead of SHARED.
* sysdeps/unix/sysv/linux/sysdep.h (ALWAYS_USE_VSYSCALL): New.
(USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/tst-affinity-static.c: New file.
* sysdeps/unix/sysv/linux/x86/libc-vdso.h: Check USE_VSYSCALL
instead of SHARED.
* sysdeps/unix/sysv/linux/x86_64/init-first.c: Don't check
SHARED.
* sysdeps/unix/sysv/linux/x86_64/sysdep.h (ALWAYS_USE_VSYSCALL):
New.
The generic kernel-features.h defines __ASSUME_COPY_FILE_RANGE for 4.5
and later kernels. However, for 32-bit Arm binaries running on 64-bit
Arm kernels, the syscall was only wired up in the 4.7 kernel, although
the 32-bit Arm kernel had the syscall from 4.5 onwards. This patch
corrects the Arm kernel-features.h to undefine the macro for
configured minimum kernel versions before 4.7.
Tested (compilation only) with a build-many-glibcs.py build for
arm-linux-gnueabi.
[BZ #23915]
* sysdeps/unix/sysv/linux/arm/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x040700] (__ASSUME_COPY_FILE_RANGE):
Undefine.
Add a re-exec test with legacy bitmap to verify that legacy bitmap is
properly hanlded by kernel.
* sysdeps/x86/Makefile (tests): Add tst-cet-legacy-1a.
(tst-cet-legacy-1a-ARGS): New.
($(objpfx)tst-cet-legacy-1a): New target.
* sysdeps/x86/tst-cet-legacy-1a.c: New file.
Introduce new pow symbol version that doesn't do SVID compatible error
handling. The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.
The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_pow.c and enabled for targets with their own pow implementation or
ifunc dispatch on __ieee754_pow by including math/w_pow.c.
The compatibility symbol version still uses the wrapper with SVID error
handling around the new code. There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
On targets where previously powl was an alias of pow, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.
The __pow_finite symbol is now an alias of pow. Both __pow_finite and
pow set errno and thus not const functions.
The ia64 asm is changed so the compat and new symbol versions map to the
same address.
On x86_64 #include <math.h> was added before macro definitions that
may affect that header.
Tested with build-many-glibcs.py.
* math/Versions (GLIBC_2.29): Add pow.
* math/w_pow_compat.c (__pow_compat): Change to versioned compat
symbol.
* math/w_pow.c: New file.
* sysdeps/i386/fpu/w_pow.c: New file.
* sysdeps/ia64/fpu/e_pow.S: Add versioned symbols.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow
and add necessary aliases.
* sysdeps/ieee754/dbl-64/w_pow.c: New file.
* sysdeps/m68k/m680x0/fpu/w_pow.c: New file.
* sysdeps/mach/hurd/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to
__pow.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise.
* sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.
Introduce new log2 symbol version that doesn't do SVID compatible error
handling. The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.
The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_log2.c and enabled for targets with their own log2 implementation by
including math/w_log2.c.
The compatibility symbol version still uses the wrapper with SVID error
handling around the new code. There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
On targets where previously log2l was an alias of log2, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.
The __log2_finite symbol is now an alias of log2. Both __log2_finite
and log2 set errno and thus not const functions.
The ia64 asm is changed so the compat and new symbol versions map to the
same address.
Tested with build-many-glibcs.py.
* math/Versions (GLIBC_2.29): Add log2.
* math/w_log2_compat.c (__log2_compat): Change to versioned compat
symbol.
* math/w_log2.c: New file.
* sysdeps/i386/fpu/w_log2.c: New file.
* sysdeps/ia64/fpu/e_log2.S: Add versioned symbols.
* sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Rename to __log2
and add necessary aliases.
* sysdeps/ieee754/dbl-64/w_log2.c: New file.
* sysdeps/m68k/m680x0/fpu/w_log2.c: New file.
* sysdeps/mach/hurd/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
Introduce new log symbol version that doesn't do SVID compatible error
handling. The standard errno and fp exception based error handling is
inline in the new code and does not have significant overhead.
The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty
w_log.c and enabled for targets with their own log implementation by
including math/w_log.c.
The compatibility symbol version still uses the wrapper with SVID error
handling around the new code. There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
On targets where previously logl was an alias of log, now it points to
the compatibility symbol with the wrapper, because it still need the
SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm)
and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well.
The __log_finite symbol is now an alias of log. Both __log_finite and
log set errno and thus not const functions.
The ia64 asm is changed so the compat and new symbol versions map to the
same address.
On x86_64 #include <math.h> was added before macro definitions that may
affect that header.
Tested with build-many-glibcs.py.
* math/Versions (GLIBC_2.29): Add log.
* math/w_log_compat.c (__log_compat): Change to versioned compat
symbol.
* math/w_log.c: New file.
* sysdeps/i386/fpu/w_log.c: New file.
* sysdeps/ia64/fpu/e_log.S: Update.
* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log
and add necessary aliases.
* sysdeps/ieee754/dbl-64/w_log.c: New file.
* sysdeps/m68k/m680x0/fpu/w_log.c: New file.
* sysdeps/mach/hurd/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
* sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to
__log.
* sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise.
* sysdeps/x86_64/fpu/multiarch/w_log.c: New file.
Introduce new exp and exp2 symbol version that don't do SVID compatible
error handling. The standard errno and fp exception based error handling
is inline in the new code and does not have significant overhead.
The double precision wrappers are disabled for sysdeps/ieee754/dbl-64
by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and
math/w_exp2.c files use the wrapper template and can be included by
targets that have their own exp and exp2 implementations or use ifunc
on the glibc internal __ieee754_exp symbol.
The compatibility symbol versions still use the wrapper with SVID error
handling around the new code. There is no new symbol version nor
compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv).
On targets where previously expl and exp2l were aliases of exp and exp2,
now they point to the compatibility symbols with the wrapper, because
they still need the SVID compatible error handling. This affects
NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets
as well.
The _finite symbols are now aliases of the standard symbols (they have
no performance advantage anymore). Both the standard symbols and
_finite symbols set errno and thus not const functions.
The ia64 asm is changed so the compat and new symbol versions map to the
same address.
On x86_64 #include <math.h> was added before macro definitions that may
affect that header (the new macro name is __exp instead of __ieee754_exp
which breaks some math.h macros).
Tested with build-many-glibcs.py.
* math/Versions (GLIBC_2.29): Add exp and exp2.
* math/w_exp2_compat.c (__exp2_compat): Change to versioned compat
symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly.
* math/w_exp_compat.c (__exp_compat): Likewise.
* math/w_exp.c: New file.
* math/w_exp2.c: New file.
* sysdeps/i386/fpu/w_exp.c: New file.
* sysdeps/i386/fpu/w_exp2.c: New file.
* sysdeps/ia64/fpu/e_exp.S: Add versioned symbols.
* sysdeps/ia64/fpu/e_exp2.S: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp
and add necessary aliases.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2
and add necessary aliases.
* sysdeps/ieee754/dbl-64/w_exp.c: New file.
* sysdeps/ieee754/dbl-64/w_exp2.c: New file.
* sysdeps/m68k/m680x0/fpu/w_exp.c: New file.
* sysdeps/m68k/m680x0/fpu/w_exp2.c: New file.
* sysdeps/mach/hurd/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Update.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Update.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Update.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Update.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.
* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove.
(__ieee754_exp): Rename to __exp.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove.
(__ieee754_exp): Rename to __exp.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove.
(__ieee754_exp): Rename to __exp.
* sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to
__exp.
* sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.
This fixes an ineffiency in the non-zero memset. Delaying the writeback
until the end of the loop is slightly faster on some cores - this shows
~5% performance gain on Cortex-A53 when doing large non-zero memsets.
* sysdeps/aarch64/memset.S (MEMSET): Improve non-zero memset loop.
On platforms where long double used to have the same format as double,
but later switched to a different format (alpha, s390, sparc, and
powerpc), accessing the older behavior is possible and it happens via
__nldbl_* functions (not on the API, but accessible from header
redirection and from compat symbols). These functions write to the
global flag __ldbl_is_dbl, which tells other functions that long double
variables should be handled as double. This patch takes the first step
towards removing this global flag and creates __vstrfmon_l_internal,
which takes an explicit flags parameter.
This change arguably makes the generated code slightly worse on
architectures where __ldbl_is_dbl is never true; right now, on those
architectures, it's a compile-time constant; after this change, the
compiler could theoretically prove that __vstrfmon_l_internal was
never called with a nonzero flags argument, but it would probably need
LTO to do it. This is not performance critical code and I tend to
think that the maintainability benefits of removing action at a
distance are worth it. However, we _could_ wrap the runtime flag
check with a macro that was defined to ignore its argument and always
return false on architectures where __ldbl_is_dbl is never true, if
people think the codegen benefits are important.
Tested for powerpc and powerpc64le.
* sysdeps/mach/hurd/dl-sysdep.c (check_no_hidden): Use
__attribute_copy__ to copy attributes from name. Drop static qualifier
to avoid warnings about leaf attribute not having effect on static
functions.
This patch fixes the build for MIPS (o32) with GCC 9 by stopping MIPS
__longjmp from using strong_alias, instead defining the alias
manually, so that the intended effect of not copying the nomips16
attribute is achieved, as explained in the included comment.
Tested with build-many-glibcs.py compilers build for mips64-linux-gnu
(which includes glibc builds for all three ABIs).
* sysdeps/mips/__longjmp.c (__longjmp): Define alias manually with
alias attribute, not with strong_alias.
Soft-float powerpc fails to build with current GCC mainline because of
use of libc_hidden_data_def for TLS variables, resulting in a non-TLS
alias being defined, to which the tls_model attribute is now copied,
resulting in a warning about it being ignored.
The problem here appears to be the non-TLS alias. This patch adds a
hidden_tls_def macro family, corresponding to the hidden_tls_proto
macros, to define TLS aliases properly in such a case, and uses it for
those powerpc soft-float variables.
Tested with build-many-glibcs.py compilers build for powerpc-linux-gnu
soft-float. Also tested for x86_64.
* include/libc-symbols.h [SHARED && !NO_HIDDEN && !__ASSEMBLER__]
(__hidden_ver2): New macro. Use old definition of __hidden_ver1
with additional parameter thread.
[SHARED && !NO_HIDDEN && !__ASSEMBLER__] (__hidden_ver1): Define
in terms of __hidden_ver2.
(hidden_tls_def): New macro.
(libc_hidden_tls_def): Likewise.
(rtld_hidden_tls_def): Likewise.
(libm_hidden_tls_def): Likewise.
(libmvec_hidden_tls_def): Likewise.
(libresolv_hidden_tls_def): Likewise.
(librt_hidden_tls_def): Likewise.
(libdl_hidden_tls_def): Likewise.
(libnss_files_hidden_tls_def): Likewise.
(libnsl_hidden_tls_def): Likewise.
(libnss_nisplus_hidden_tls_def): Likewise.
(libutil_hidden_tls_def): Likewise.
(libutil_hidden_tls_def): Likweise.
* sysdeps/powerpc/nofpu/sim-full.c (__sim_exceptions_thread): Use
libc_hidden_tls_def.
(__sim_disabled_exceptions_thread): Likewise.
(__sim_round_mode_thread): Likewise.
Similar to the x86_64 and armv7 build issues, glibc fails to build for
sparc64 with current mainline GCC because of aliases declared in the
course of defining IFUNCs, which copy their attributes from a header
declaration, ending up with fewer attributes than the (built-in)
string function they alias. This patch fixes the issue similarly to
the fixes for those other architectures.
Tested with build-many-glibcs.py compilers build for
sparc64-linux-gnu.
* sysdeps/sparc/sparc-ifunc.h [SHARED]
(sparc_ifunc_redirected_hidden_def): Use __attribute_copy__ to
copy attributes from name.
Similar to the x86_64 build issues, glibc fails to build for armv7
with current mainline GCC because of aliases declared in the course of
defining IFUNCs, which copy their attributes from a header
declaration, ending up with fewer attributes than the (built-in)
string function they alias: the relevant attributes (nonnull, leaf)
are present on the header declaration, but elided therefrom when glibc
itself if being built (whatever the reasons are for disabling the
nonnull and leaf attributes in that case, and whether or not those
reasons are actually still valid). This patch fixes the issue
similarly to the x86_64 fix, by adding an addition __attribute_copy__
use (in this case, on the definition of arm_libc_ifunc_hidden_def).
Tested with build-many-glibcs.py build for armeb-linux-gnueabi-be8.
* sysdeps/arm/arm-ifunc.h [SHARED] (arm_libc_ifunc_hidden_def):
Use __attribute_copy__ to copy attributes from name.
This patch fixes the glibc build for i686 with current mainline GCC,
where there are warnings about inconsistent attributes for aliases in
certain files defining libm IFUNCs.
In three of the files, the aliases were defined in terms of internal
symbols such as __sinf, and copied attributes from file-local
declarations of those functions which lacked the nothrow attribute.
Since the nothrow attribute is present on the declarations from
<math.h> (which include declarations of those __-prefixed functions),
the natural fix was to include <math.h> in those files, replacing the
local declarations.
In the other three files, a more complicated __hidden_ver1 call was
involved in the warnings. <math.h> has not been included at this
point and, furthermore, it is included indirectly only later in the
source file after macros have been defined to remap a function name
therein. So there isn't an obvious declaration from which to copy the
attribute and it seems simplest and safest just to add __THROW to the
hidden_ver1 calls.
Tested for i686 (build-many-glibcs.py compilers build for
x86_64-linux-gnu with GCC mainline; full testsuite run with GCC 7).
* sysdeps/i386/i686/fpu/multiarch/e_expf.c [SHARED]: Use __THROW
with __hidden_ver1 call.
* sysdeps/i386/i686/fpu/multiarch/e_log2f.c [SHARED]: Likewise.
* sysdeps/i386/i686/fpu/multiarch/e_logf.c [SHARED]: Likewise.
* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: Include <math.h>.
(__cosf): Do not declare here.
* sysdeps/i386/i686/fpu/multiarch/s_sincosf.c: Include <math.h>.
(__sincosf): Do not declare here.
* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: Include <math.h>.
(__sinf): Do not declare here.
After the changes to use the copy attribute, building glibc for ia64
fails, even with older compilers, because
sysdeps/ia64/fpu/sfp-machine.h has a definition of _strong_alias that
now differs from the one in libc-symbols.h.
That definition is a relic of this file coming from libgcc, as are
some other such macro definitions in this file; in the glibc context,
there is no need for those macros, and this patch removes them to fix
the build.
Tested with build-many-glibcs.py for ia64-linux-gnu.
* sysdeps/ia64/fpu/sfp-machine.h (__LITTLE_ENDIAN): Remove.
(__BIG_ENDIAN): Likewise.
(__BYTE_ORDER): Likewise.
(strong_alias): Likewise.
(_strong_alias): Likewise.
* hurd/hurd/userlink.h (_hurd_userlink_move): New function.
* hurd/hurd/port.h (_hurd_port_move): New function.
* sysdeps/mach/hurd/spawni.c (NEW_ULINK_TABLE): New macro.
(EXPAND_DTABLE): Use NEW_ULINK_TABLE macro for ulink_dtable.
This fixes build-many-glibcs.py on i686-gnu.
Thanks Florian Weimer for the initial version.
* sysdeps/mach/hurd/spawni.c (__spawni): Add ccwdir port. Test and use
it, free it if needed.
(reauthenticate): Test and use ccwdir.
(child_init_port): In non-resetids case, test and use ccwdir.
(child_chdir): New nested function to set ccwdir.
GCC 9 has gained an enhancement to help detect attribute mismatches
between alias declarations and their targets. It consists of a new
warning, -Wattribute-alias, an enhancement to an existing warning,
-Wmissing-attributes, and a new attribute called copy.
The purpose of the warnings is to help identify either possible bugs
(an alias declared with more restrictive attributes than its target
promises) or optimization or diagnostic opportunities (an alias target
missing some attributes that it could be declared with that might
benefit analysis and code generation). The purpose of the new
attribute is to easily apply (almost) the same set of attributes
to one declaration as those already present on another.
As expected (and intended) the enhancement triggers warnings for
many alias declarations in Glibc code. This change, tested on
x86_64-linux, avoids all instances of the new warnings by making
use of the attribute where appropriate. To fully benefit from
the enhancement Glibc will need to be compiled with
-Wattribute-alias=2 and remaining warnings reviewed and dealt with
(there are a couple of thousand but most should be straightforward
to deal with).
ChangeLog:
* include/libc-symbols.h (__attribute_copy__): Define macro unless
it's already defined.
(_strong_alias): Use __attribute_copy__.
(_weak_alias, __hidden_ver1, __hidden_nolink2): Same.
* misc/sys/cdefs.h (__attribute_copy__): New macro.
* sysdeps/x86_64/multiarch/memchr.c (memchr): Use __attribute_copy__.
* sysdeps/x86_64/multiarch/memcmp.c (memcmp): Same.
* sysdeps/x86_64/multiarch/mempcpy.c (mempcpy): Same.
* sysdeps/x86_64/multiarch/memset.c (memset): Same.
* sysdeps/x86_64/multiarch/stpcpy.c (stpcpy): Same.
* sysdeps/x86_64/multiarch/strcat.c (strcat): Same.
* sysdeps/x86_64/multiarch/strchr.c (strchr): Same.
* sysdeps/x86_64/multiarch/strcmp.c (strcmp): Same.
* sysdeps/x86_64/multiarch/strcpy.c (strcpy): Same.
* sysdeps/x86_64/multiarch/strcspn.c (strcspn): Same.
* sysdeps/x86_64/multiarch/strlen.c (strlen): Same.
* sysdeps/x86_64/multiarch/strncmp.c (strncmp): Same.
* sysdeps/x86_64/multiarch/strncpy.c (strncpy): Same.
* sysdeps/x86_64/multiarch/strnlen.c (strnlen): Same.
* sysdeps/x86_64/multiarch/strpbrk.c (strpbrk): Same.
* sysdeps/x86_64/multiarch/strrchr.c (strrchr): Same.
* sysdeps/x86_64/multiarch/strspn.c (strspn): Same.
The __ASSUME_SOCKETCALL macro in kernel-features.h is no longer used
for anything. (It used to be used in defining other macros related to
accept4 / recvmmsg / sendmmsg availability, but the code in that area
was simplified once we could assume a kernel with those features,
whether through a syscall or through socketcall, so allowing those
functions to be handled much like other socket operations, without
requring __ASSUME_SOCKETCALL.) This patch removes that unused macro.
(Note: once we can assume a Linux 4.4 or later kernel, much of the
support for using socketcall at all can be removed from glibc,
although a few functions may need that support in glibc for longer.)
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/kernel-features.h: Remove comment about
__ASSUME_SOCKETCALL.
* sysdeps/unix/sysv/linux/i386/kernel-features.h
(__ASSUME_SOCKETCALL): Remove.
* sysdeps/unix/sysv/linux/m68k/kernel-features.h
(__ASSUME_SOCKETCALL): Likewise.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
(__ASSUME_SOCKETCALL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/kernel-features.h
(__ASSUME_SOCKETCALL): Likewise.
* sysdeps/unix/sysv/linux/s390/kernel-features.h
(__ASSUME_SOCKETCALL): Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h
(__ASSUME_SOCKETCALL): Likewise.
* sysdeps/unix/sysv/linux/sparc/kernel-features.h
(__ASSUME_SOCKETCALL): Likewise.
Linkers group input note sections with the same name into one output
note section with the same name. One output note section is placed in
one PT_NOTE segment. Since new linkers merge input .note.gnu.property
sections into one output .note.gnu.property section, there is only
one NT_GNU_PROPERTY_TYPE_0 note in one PT_NOTE segment with new linkers.
Since older linkers treat input .note.gnu.property section as a generic
note section and just concatenate all input .note.gnu.property sections
into one output .note.gnu.property section without merging them, we may
see multiple NT_GNU_PROPERTY_TYPE_0 notes in one PT_NOTE segment with
older linkers.
When an older linker is used to created the program on CET-enabled OS,
the linker output has a single .note.gnu.property section with multiple
NT_GNU_PROPERTY_TYPE_0 notes, some of which have IBT and SHSTK enable
bits set even if the program isn't CET enabled. Such programs will
crash on CET-enabled machines. This patch updates the note parser:
1. Skip note parsing if a NT_GNU_PROPERTY_TYPE_0 note has been processed.
2. Check multiple NT_GNU_PROPERTY_TYPE_0 notes.
[BZ #23509]
* sysdeps/x86/dl-prop.h (_dl_process_cet_property_note): Skip
note parsing if a NT_GNU_PROPERTY_TYPE_0 note has been processed.
Update the l_cet field when processing NT_GNU_PROPERTY_TYPE_0 note.
Check multiple NT_GNU_PROPERTY_TYPE_0 notes.
* sysdeps/x86/link_map.h (l_cet): Expand to 3 bits, Add
lc_unknown.
The generic kernel-features.h defines __ASSUME_MLOCK2 for 4.4 and
later kernels. However, for 32-bit ARM binaries running on 64-bit ARM
kernels, and for MicroBlaze, the syscall was only wired up in the 4.7
kernel. (32-bit ARM kernels did have the syscall from 4.4 onwards.)
This patch duly arranges for the macro to be undefined for those
architectures for kernels before 4.7.
Tested with build-many-glibcs.py for its ARM and MicroBlaze
configurations.
[BZ #23867]
* sysdeps/unix/sysv/linux/arm/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x040700] (__ASSUME_MLOCK2): Undefine.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x040700] (__ASSUME_MLOCK2): Undefine.
The SH kernel-features.h undefines __ASSUME_RENAMEAT2 for kernel
versions before 4.8, but fails to undefine __ASSUME_EXECVEAT,
__ASSUME_MLOCK2 and __ASSUME_COPY_FILE_RANGE, although all those
syscalls (and several others) were added for SH in the same Linux
kernel commit (first released in 4.8). This patch adds the proper
undefines of those macros.
Tested with build-many-glibcs.py for its SH configurations.
[BZ #23862]
* sysdeps/unix/sysv/linux/sh/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x040800] (__ASSUME_EXECVEAT): Undefine.
[__LINUX_KERNEL_VERSION < 0x040800] (__ASSUME_MLOCK2): Likewise.
[__LINUX_KERNEL_VERSION < 0x040800] (__ASSUME_COPY_FILE_RANGE):
Likewise.
Looking at kernel-features.h files, I saw that SPARC was missing full
information on when it gained separate socket syscalls.
This patch adds such information to the SPARC kernel-features.h. It
also corrects what appear to be bugs in the existing code (that would
cause syscalls to be assumed to be present when not actually present).
Various __ASSUME_* macros, defined by default, were not undefined for
32-bit despite those syscalls only being added for 32-bit in Linux
4.4. Some syscalls were used in the SPARC64 syscalls.list but only
added in 4.4; this was harmless before the __NR_* macros were defined
at all, but once the macros were defined it means a build with
post-4.4 headers would assume the syscalls to be present regardless of
--enable-kernel version. Then, various __ASSUME_* macros were
previously not defined in cases where they could be defined (this part
of the patch is just an optimization, not a bug fix).
Note the observation in a comment in the patch that even the latest
Linux kernel for SPARC does not have getpeername and getsockname
syscalls in the compat syscall table for 32-bit binaries on 64-bit
kernels (so glibc can't assume those syscalls to be present for 32-bit
at all, although the 32-bit syscall table gained them in 4.4).
Tested (compilation only) for SPARC with build-many-glibcs.py.
[BZ #23848]
* sysdeps/unix/sysv/linux/sparc/kernel-features.h [!__arch64__ &&
__LINUX_KERNEL_VERSION < 0x040400] (__ASSUME_SENDMSG_SYSCALL):
Undefine.
[!__arch64__ && __LINUX_KERNEL_VERSION < 0x040400]
(__ASSUME_RECVMSG_SYSCALL): Likewise.
[!__arch64__ && __LINUX_KERNEL_VERSION < 0x040400]
(__ASSUME_SENDTO_SYSCALL): Likewise.
[!__arch64__ && __LINUX_KERNEL_VERSION < 0x040400]
(__ASSUME_ACCEPT_SYSCALL): Undefine under this condition, not just
[!__arch64__].
[!__arch64__ && __LINUX_KERNEL_VERSION < 0x040400]
(__ASSUME_CONNECT_SYSCALL): Likewise.
[!__arch64__ && __LINUX_KERNEL_VERSION < 0x040400]
(__ASSUME_RECVFROM_SYSCALL): Likewise.
[__LINUX_KERNEL_VERSION >= 0x040400] (__ASSUME_BIND_SYSCALL):
Define.
[__LINUX_KERNEL_VERSION >= 0x040400] (__ASSUME_LISTEN_SYSCALL):
Likewise.
[__LINUX_KERNEL_VERSION >= 0x040400]
(__ASSUME_SETSOCKOPT_SYSCALL): Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/syscalls.list (bind):
Remove.
(listen): Likewise.
(setsockopt): Likewise.
GAS treats the R5900 as MIPS III, with some modifications. The MIPS III
designation means that the GNU C Library will try to assemble the LL and
SC instructions, even though they are not implemented in the R5900. GAS
will therefore produce the following errors:
Error: opcode not supported on this processor: r5900 (mips3) `ll $2,0($4)'
Error: opcode not supported on this processor: r5900 (mips3) `sc $6,0($4)'
The MIPS II ISA override as used here enables the kernel to trap and
emulate the LL and SC instructions, as required.
This change has been tested by compiling the GNU C Library 2.27 with a
GCC 8.2.0 cross-compiler for mipsr5900el-unknown-linux-gnu under Gentoo.
* sysdeps/mips/sys/tas.h (_test_and_set): Handle the R5900 CPU
with the ISA override.
The #else of two nested #if clauses were identical.
* sysdeps/unix/sysv/linux/sysdep-vdso.h: Simplify an #if #else
#endif.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Mark the ra register as undefined in _start, so that unwinding through
main works correctly. Also, don't use a tail call so that ra points after
the call to __libc_start_main, not after the previous call.
* sysdeps/mach/hurd/i386/intr-msg.h (INTR_MSG_TRAP): Make
_hurd_intr_rpc_msg_about_to global point to start of controlled
assembly snippet. Make it check canceled flag.
* hurd/hurdsig.c (_hurdsig_abort_rpcs): Only mutate thread if it passed
the _hurd_intr_rpc_msg_about_to point.
* hurd/intr-msg.c (_hurd_intr_rpc_mach_msg): Remove comment on mutation
issue, remove cancel flag check.
When new symbol versions were introduced without SVID compatible
error handling the exp2f, log2f and powf symbols were accidentally
removed from the ia64 lim.a. The regression was introduced by
the commits
f5f0f52651
New expf and exp2f version without SVID compat wrapper
72d3d28108
New symbol version for logf, log2f and powf without SVID compat
With WEAK_LIBM_ENTRY(foo), there is a hidden __foo and weak foo
symbol definition in both SHARED and !SHARED build.
[BZ #23822]
* sysdeps/ia64/fpu/e_exp2f.S (exp2f): Use WEAK_LIBM_ENTRY.
* sysdeps/ia64/fpu/e_log2f.S (log2f): Likewise.
* sysdeps/ia64/fpu/e_exp2f.S (powf): Likewise.
This patch adds the IN_MASK_CREATE macro from Linux 4.19 to
sys/inotify.h.
Tested for x86_64.
* sysdeps/unix/sysv/linux/sys/inotify.h (IN_MASK_CREATE): New
macro.
To determine whether the default time_t interfaces are 32-bit
and so need conversions, or are 64-bit and so are compatible
with the internal 64-bit type without conversions, a macro
giving the size of the default time_t is also required.
This macro is called __TIMESIZE.
This macro can then be used instead of __WORDSIZE in msq-pad.h
and shm-pad.h files, which in turn allows removing their x86
variants, and in sem-pad.h files but keeping the x86 variant.
This patch was tested by running 'make check' on branch master
then applying this patch and running 'make check' again, and
checking that both 'make check' yield identical results.
This was done on x86_64-linux-gnu and i686-linux-gnu.
* bits/timesize.h: New file.
* stdlib/Makefile (headers): Add bits/timesize.h.
* sysdeps/unix/sysv/linux/bits/msq-pad.h
(__MSQ_PAD_AFTER_TIME): Use __TIMESIZE instead of __WORDSIZE.
* sysdeps/unix/sysv/linux/bits/sem-pad.h
(__SEM_PAD_AFTER_TIME): Likewise.
* sysdeps/unix/sysv/linux/bits/shm-pad.h
(__SHM_PAD_AFTER_TIME): Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/msq-pad.h
(__MSQ_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/sem-pad.h
(__SEM_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/shm-pad.h
(__SHM_PAD_BEFORE_TIME, __SHM_PAD_BETWEEN_TIME_AND_SEGSZ): Likewise.
* sysdeps/unix/sysv/linux/mips/bits/msq-pad.h
(__MSQ_PAD_AFTER_TIME, __MSQ_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/msq-pad.h
(__MSQ_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/sem-pad.h
(__SEM_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/shm-pad.h
(__SHM_PAD_BEFORE_TIME, __SHM_PAD_BETWEEN_TIME_AND_SEGSZ): Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/msq-pad.h
(__MSQ_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/sem-pad.h
(__SEM_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/shm-pad.h
(__SHM_PAD_BEFORE_TIME): Likewise.
* sysdeps/unix/sysv/linux/x86/bits/msq-pad.h: Delete file.
* sysdeps/unix/sysv/linux/x86/bits/shm-pad.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/timesize.h: New file.
RDTSCP waits until all previous instructions have executed and all
previous loads are globally visible before reading the counter. RDTSC
doesn't wait until all previous instructions have been executed before
reading the counter. All x86 processors since 2010 support RDTSCP
instruction. This patch adds RDTSCP support to benchtests.
* benchtests/Makefile (CPPFLAGS-nonlib): Add -DUSE_RDTSCP if
USE_RDTSCP is defined.
* sysdeps/x86/hp-timing.h (HP_TIMING_NOW): Use RDTSCP if
USE_RDTSCP is defined.
Th commit 'Disable TSX on some Haswell processors.' (2702856bf4) changed the
default flags for Haswell models. Previously, new models were handled by the
default switch path, which assumed a Core i3/i5/i7 if AVX is available. After
the patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) do not set the flags
Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and
Prefer_PMINUB_for_stringop (only the TSX one).
This patch fixes it by disentangle the TSX flag handling from the memory
optimization ones. The strstr case cited on patch now selects the
__strstr_sse2_unaligned as expected for the Haswell cpu.
Checked on x86_64-linux-gnu.
[BZ #23709]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits
independently of other flags.
Linux 4.19 does not add any new syscalls (some existing ones are added
to more architectures); this patch updates the version number in
syscall-names.list to reflect that it's still current for 4.19.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/syscall-names.list: Update kernel
version to 4.19.
Use __builtin_ia32_rdtsc directly since including <x86intrin.h> makes
building glibc very slow. On Intel Core i5-6260U, this patch reduces
x86-64 build time from 8 minutes 33 seconds to 3 minutes 48 seconds
with "make -j4" and GCC 8.2.1.
* sysdeps/x86/hp-timing.h: Don't include <x86intrin.h>.
(HP_TIMING_NOW): Replace _rdtsc with __builtin_ia32_rdtsc.
Since _rdtsc intrinsic is supported in GCC 4.9, we can use it for
HP_TIMING_NOW. This patch
1. Create x86 hp-timing.h to replace i686 and x86_64 hp-timing.h.
2. Move MINIMUM_ISA from init-arch.h to isa.h so that x86 hp-timing.h
can check minimum x86 ISA to decide if _rdtsc can be used.
NB: Checking if __i686__ isn't sufficient since __i686__ may not be
defined when building for i686 class processors.
* sysdeps/i386/init-arch.h: Removed.
* sysdeps/i386/i586/init-arch.h: Likewise.
* sysdeps/i386/i686/init-arch.h: Likewise.
* sysdeps/i386/i686/hp-timing.h: Likewise.
* sysdeps/x86_64/hp-timing.h: Likewise.
* sysdeps/i386/isa.h: New file.
* sysdeps/i386/i586/isa.h: Likewise.
* sysdeps/i386/i686/isa.h: Likewise.
* sysdeps/x86_64/isa.h: Likewise.
* sysdeps/x86/hp-timing.h: New file.
* sysdeps/x86/init-arch.h: Include <isa.h>.
After my patch to move SHMLBA to its own header, the bits/shm.h
headers for architectures using the Linux kernel still vary in a few
ways: the use of __syscall_ulong_t; whether padding for 32-bit systems
is present before or after time fields, or missing altogether (mips,
x32); whether shm_segsz is before or after the time fields; whether,
if after time fields, there is extra padding before shm_segsz.
This patch arranges for a single header to be used. __syscall_ulong_t
is safe to use everywhere, while bits/shm-pad.h is added with new
macros __SHM_PAD_AFTER_TIME, __SHM_PAD_BEFORE_TIME,
__SHM_SEGSZ_AFTER_TIME and __SHM_PAD_BETWEEN_TIME_AND_SEGSZ to
describe the differences.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add
bits/shm-pad.h.
* sysdeps/unix/sysv/linux/bits/shm.h: Include <bits/shm-pad.h>.
(shmatt_t): Define as __syscall_ulong_t.
(__SHM_PAD_TIME): New macro, depending on [__SHM_PAD_BEFORE_TIME]
and [__SHM_PAD_AFTER_TIME].
(struct shmid_ds): Define time fields using __SHM_PAD_TIME.
Define shm_segsz and associated padding based on
[__SHM_SEGSZ_AFTER_TIME] and [__SHM_PAD_BETWEEN_TIME_AND_SEGSZ].
Use __syscall_ulong_t instead of unsigned long int.
[__USE_MISC] (struct shminfo): Use __syscall_ulong_t instead of
unsigned long int.
[__USE_MISC] (struct shm_info): Likewise.
* sysdeps/unix/sysv/linux/bits/shm-pad.h: New file.
* sysdeps/unix/sysv/linux/hppa/bits/shm-pad.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/shm-pad.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/shm-pad.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/shm-pad.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/shm-pad.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/shm.h: Remove.
* sysdeps/unix/sysv/linux/mips/bits/shm.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/shm.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/shm.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/shm.h: Likewise.
One difference between bits/shm.h headers for architectures using the
Linux kernel is the definition of SHMLBA. This was noted in
<https://sourceware.org/ml/libc-alpha/2018-09/msg00175.html> as a
reason why even a new architecture (C-SKY) might need its own
bits/shm.h; thus, splitting it out of bits/shm.h can allow less
duplication of headers for new architectures.
This patch moves that definition to its own header, bits/shmlba.h, to
allow more sharing of headers between architectures. That move allows
the arm, ia64 and sh variants of bits/shm.h to be removed, as they had
no other significant differences from the generic bits/shm.h; powerpc
and x86 have their own bits/shm.h but do not need to get their own
bits/shmlba.h because they use the same SHMLBA as the generic header.
Other architectures with their own bits/shm.h get their own
bits/shmlba.h without being able to remove their own bits/shm.h until
the generic one has been adapted to be able to handle more
architectures (where, in addition to the differences seen for
bits/msq.h and bits/sem.h, the position of shm_segsz in struct
shmid_ds also depends on the architecture).
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add
bits/shmlba.h.
* sysdeps/unix/sysv/linux/bits/shm.h: Include <bits/shmlba.h>.
(SHMLBA): Remove macro.
(__getpagesize): Remove function declaration.
* sysdeps/unix/sysv/linux/hppa/bits/shm.h: Include
<bits/shmlba.h>.
(SHMLBA): Remove macro.
* sysdeps/unix/sysv/linux/mips/bits/shm.h: Include
<bits/shmlba.h>.
(SHMLBA): Remove macro.
* sysdeps/unix/sysv/linux/powerpc/bits/shm.h: Include
<bits/shmlba.h>.
(SHMLBA): Remove macro.
(__getpagesize): Remove function declaration.
* sysdeps/unix/sysv/linux/sparc/bits/shm.h: Include
<bits/shmlba.h>.
(SHMLBA): Remove macro.
(__getshmlba): Remove function declaration.
* sysdeps/unix/sysv/linux/x86/bits/shm.h: Include <bits/shmlba.h>.
(SHMLBA): Remove macro.
(__getpagesize): Remove function declaration.
* sysdeps/unix/sysv/linux/arm/bits/shm.h: Remove file.
* sysdeps/unix/sysv/linux/ia64/bits/shm.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/shm.h: Likewise.
* sysdeps/unix/sysv/linux/bits/shmlba.h: New file.
* sysdeps/unix/sysv/linux/arm/bits/shmlba.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/shmlba.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/shmlba.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/shmlba.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/shmlba.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/shmlba.h: Likewise.
The race leads either to pthread_mutex_destroy returning EBUSY
or triggering an assertion (See description in bugzilla).
This patch is fixing the race by ensuring that the elision path is
used in all cases if elision is enabled by the GLIBC_TUNABLES framework.
The __kind variable in struct __pthread_mutex_s is accessed concurrently.
Therefore we are now using the atomic macros.
The new testcase tst-mutex10 is triggering the race on s390x and intel.
Presumably also on power, but I don't have access to a power machine
with lock-elision. At least the code for power is the same as on the other
two architectures.
ChangeLog:
[BZ #23275]
* nptl/tst-mutex10.c: New File.
* nptl/Makefile (tests): Add tst-mutex10.
(tst-mutex10-ENV): New variable.
* sysdeps/unix/sysv/linux/s390/force-elision.h: (FORCE_ELISION):
Ensure that elision path is used if elision is available.
* sysdeps/unix/sysv/linux/powerpc/force-elision.h (FORCE_ELISION):
Likewise.
* sysdeps/unix/sysv/linux/x86/force-elision.h: (FORCE_ELISION):
Likewise.
* nptl/pthreadP.h (PTHREAD_MUTEX_TYPE, PTHREAD_MUTEX_TYPE_ELISION)
(PTHREAD_MUTEX_PSHARED): Use atomic_load_relaxed.
* nptl/pthread_mutex_consistent.c (pthread_mutex_consistent): Likewise.
* nptl/pthread_mutex_getprioceiling.c (pthread_mutex_getprioceiling):
Likewise.
* nptl/pthread_mutex_lock.c (__pthread_mutex_lock_full)
(__pthread_mutex_cond_lock_adjust): Likewise.
* nptl/pthread_mutex_setprioceiling.c (pthread_mutex_setprioceiling):
Likewise.
* nptl/pthread_mutex_timedlock.c (__pthread_mutex_timedlock): Likewise.
* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock): Likewise.
* nptl/pthread_mutex_unlock.c (__pthread_mutex_unlock_full): Likewise.
* sysdeps/nptl/bits/thread-shared-types.h (struct __pthread_mutex_s):
Add comments.
* nptl/pthread_mutex_destroy.c (__pthread_mutex_destroy):
Use atomic_load_relaxed and atomic_store_relaxed.
* nptl/pthread_mutex_init.c (__pthread_mutex_init):
Use atomic_store_relaxed.
Since aligned loads and stores are huge performance
advantage the implementation always tries to do aligned
access. Among the cases when src and dst addresses are
aligned or unaligned evenly there are cases of not evenly
unaligned src and dst. For such cases (if the length is
big enough) ext instruction is used to merge-and-shift
two memory chunks loaded from two adjacent aligned
locations and then the adjusted chunk gets stored to
aligned address.
Performance gain against the current T2 implementation:
memcpy-large: 65K-32M: +40% - +10%
memcpy-walk: 128-32M: +20% - +2%
The bits/sem.h headers for architectures using the Linux kernel vary
in a few ways:
* x32 uses __syscall_ulong_t instead of unsigned long int.
* The x86 header uses padding after time fields unconditionally
(including for both x86_64 ABIs), not just for 32-bit time (unlike
in msqid_ds where there is only padding for 32-bit time). Because
this padding is present for x32, and is __syscall_ulong_t there, it
does have to be __syscall_ulong_t, not unsigned long int.
* The MIPS header never uses padding around time fields, even when
32-bit (unlike in msqid_ds where it has endian-dependent padding for
32-bit time).
* Some older 32-bit big-endian architectures have padding before
rather than after time fields, although the preferred generic
approach is padding after the time fields independent of endianness.
(There are also insubstantial differences such as use of unsigned int
for padding instead of unsigned long int, which makes no difference to
layout since the padding fields using unsigned int are only present on
32-bit architectures.)
For the first, __syscall_ulong_t can be used in the generic version as
it's the same as unsigned long int everywhere except x32. For the
other differences, this patch adds macros __SEM_PAD_BEFORE_TIME and
__SEM_PAD_AFTER_TIME in a new bits/sem-pad.h header, so that header is
the only one needing to be provided on architectures with differences
in this area, and everything else can go in a single common bits/sem.h
header.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add
bits/sem-pad.h.
* sysdeps/unix/sysv/linux/bits/sem.h: Include <bits/sem-pad.h>
instead of <bits/wordsize.h>.
(__SEM_PAD_TIME): New macro, depending on [__SEM_PAD_BEFORE_TIME]
and [__SEM_PAD_AFTER_TIME].
(struct semid_ds): Define time fields using __SEM_PAD_TIME. Use
__syscall_ulong_t instead of unsigned long int.
* sysdeps/unix/sysv/linux/bits/sem-pad.h: New file.
* sysdeps/unix/sysv/linux/hppa/bits/sem-pad.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/sem-pad.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/sem-pad.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/sem-pad.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/sem-pad.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/sem.h: Remove.
* sysdeps/unix/sysv/linux/mips/bits/sem.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/sem.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/sem.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/sem.h: Likewise.
The bits/msq.h headers for architectures using the Linux kernel vary
in a few ways:
* x32 uses __syscall_ulong_t instead of unsigned long int.
* x32 has 64-bit time_t, so no padding around time fields despite
__WORDSIZE == 32.
* Some older 32-bit big-endian architectures have padding before
rather than after time fields, although the preferred generic
approach is padding after the time fields independent of endianness.
(There are also insubstantial differences such as use of unsigned int
for padding instead of unsigned long int, which makes no difference to
layout since the padding fields using unsigned int are only present on
32-bit architectures.)
For the first, __syscall_ulong_t can be used in the generic version as
it's the same as unsigned long int everywhere except x32. For the
other two differences, this patch adds macros __MSQ_PAD_BEFORE_TIME
and __MSQ_PAD_AFTER_TIME in a new bits/msq-pad.h header, so that
header is the only one needing to be provided on architectures with
differences in this area, and everything else can go in a single
common bits/msq.h header. Once we have __TIMESIZE, the generic
bits/msq-pad.h can change to use that instead of __WORDSIZE, at which
point the x86 version of bits/msq-pad.h won't be needed either.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add
bits/msq-pad.h.
* sysdeps/unix/sysv/linux/bits/msq.h: Include <bits/msq-pad.h>
instead of <bits/wordsize.h>.
(msgqnum_t): Define as __syscall_ulong_t.
(msglen_t): Likewise.
(__MSQ_PAD_TIME): New macro, depending on [__MSQ_PAD_BEFORE_TIME]
and [__MSQ_PAD_AFTER_TIME].
(struct msqid_ds): Define time fields using __MSQ_PAD_TIME. Use
__syscall_ulong_t instead of unsigned long int.
* sysdeps/unix/sysv/linux/bits/msq-pad.h: New file.
* sysdeps/unix/sysv/linux/hppa/bits/msq-pad.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/msq-pad.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/msq-pad.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/msq-pad.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/msq-pad.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/msq.h: Remove.
* sysdeps/unix/sysv/linux/mips/bits/msq.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/msq.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/msq.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/msq.h: Likewise.
sysdeps/unix/sysv/linux/bits/shm.h has padding after time fields in
struct shmid_ds unconditionally, and thus is only suitable for 32-bit
architectures (no 64-bit configurations use this file);
sysdeps/unix/sysv/linux/generic/bits/shm.h is substantively the same,
except that the padding is conditioned on __WORDSIZE == 32, and so it
can be used for 64-bit architectures as well.
This patch adds the conditionals to
sysdeps/unix/sysv/linux/bits/shm.h. The linux/generic/ version is
then no longer needed and so is removed, as are the alpha and s390
versions which are also no longer needed. The other
architecture-specific versions have different padding, layout, types
or SHMLBA definitions and so are still needed after this change.
This is essentially the same change for bits/shm.h as the bits/msq.h
patch and the bits/sem.h patch. However, the details of the padding
variations for the architectures that aren't changed are not all the
same between msqid_ds, shmid_ds and semid_ds.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/shm.h: Include <bits/wordsize.h>.
(struct shmid_ds): Condition padding after time fields on
[__WORDSIZE == 32].
* sysdeps/unix/sysv/linux/alpha/bits/shm.h: Remove file.
* sysdeps/unix/sysv/linux/generic/bits/shm.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/shm.h: Likewise.
sysdeps/unix/sysv/linux/bits/sem.h has padding after time fields in
struct semid_ds unconditionally, and thus is only suitable for 32-bit
architectures (no 64-bit configurations use this file);
sysdeps/unix/sysv/linux/generic/bits/sem.h is substantively the same,
except that the padding is conditioned on __WORDSIZE == 32, and so it
can be used for 64-bit architectures as well.
This patch adds the conditionals to
sysdeps/unix/sysv/linux/bits/sem.h. The linux/generic/ version is
then no longer needed and so is removed, as are the alpha, ia64 and
s390 versions which are also no longer needed. The other
architecture-specific versions have different padding or types and so
are still needed after this change.
This is essentially the same change for bits/sem.h as the bits/msq.h
patch. However, the details of the padding variations for the
architectures that aren't changed are not all the same between
msqid_ds and semid_ds.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/sem.h: Include <bits/wordsize.h>.
(struct semid_ds): Condition padding after time fields on
[__WORDSIZE == 32].
* sysdeps/unix/sysv/linux/alpha/bits/sem.h: Remove file.
* sysdeps/unix/sysv/linux/generic/bits/sem.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/sem.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/sem.h: Likewise.
sysdeps/unix/sysv/linux/bits/msq.h has padding after time fields in
struct msqid_ds unconditionally, and thus is only suitable for 32-bit
architectures (no 64-bit configurations use this file);
sysdeps/unix/sysv/linux/generic/bits/msq.h is substantively the same,
except that the padding is conditioned on __WORDSIZE == 32, and so it
can be used for 64-bit architectures as well.
This patch adds the conditionals to
sysdeps/unix/sysv/linux/bits/msq.h. The linux/generic/ version is
then no longer needed and so is removed, as are the alpha, ia64 and
s390 versions which are also no longer needed. The other
architecture-specific versions have different padding or types and so
are still needed after this change.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/msq.h: Include <bits/wordsize.h>.
(struct msqid_ds): Condition padding after time fields on
[__WORDSIZE == 32].
* sysdeps/unix/sysv/linux/alpha/bits/msq.h: Remove file.
* sysdeps/unix/sysv/linux/generic/bits/msq.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/msq.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/msq.h: Likewise.
hppa currently has a bits/mman.h that does not include
bits/mman-linux.h, unlike all other architectures using the Linux
kernel. This sort of variation between architectures is generally
unhelpful when making global changes for new constants added to new
Linux kernel releases.
This patch changes hppa to use bits/mman-linux.h, overriding constants
with different values as necessary (including with #undef after
bits/mman.h inclusion when needed, as already done for alpha). While
there could possibly be further improvements through e.g. splitting
more sets of definitions into separate bits/ headers, I think this is
still an improvement on the current state. diffstat shows 27 lines
added, 51 deleted (and some of that is actually existing lines moving
to a different place in the file).
Tested with build-many-glibcs.py for hppa-linux-gnu.
* sysdeps/unix/sysv/linux/hppa/bits/mman.h: Include
<bits/mman-linux.h>.
(PROT_READ): Don't define here.
(PROT_WRITE): Likewise.
(PROT_EXEC): Likewise.
(PROT_NONE): Likewise.
(PROT_GROWSDOWN): Likewise.
(PROT_GROWSUP): Likewise.
(MAP_SHARED): Likewise.
(MAP_PRIVATE): Likewise.
[__USE_MISC] (MAP_SHARED_VALIDATE): Likewise.
[__USE_MISC] (MAP_FILE): Likewise.
[__USE_MISC] (MAP_ANONYMOUS): Likewise.
[__USE_MISC] (MAP_ANON): Likewise.
[__USE_MISC] (MAP_HUGE_SHIFT): Likewise.
[__USE_MISC] (MAP_HUGE_MASK): Likewise.
(MCL_CURRENT): Likewise.
(MCL_FUTURE): Likewise.
(MCL_ONFAULT): Likewise.
[__USE_MISC] (MADV_NORMAL): Likewise.
[__USE_MISC] (MADV_RANDOM): Likewise.
[__USE_MISC] (MADV_SEQUENTIAL): Likewise.
[__USE_MISC] (MADV_WILLNEED): Likewise.
[__USE_MISC] (MADV_DONTNEED): Likewise.
[__USE_MISC] (MADV_FREE): Likewise.
[__USE_MISC] (MADV_REMOVE): Likewise.
[__USE_MISC] (MADV_DONTFORK): Likewise.
[__USE_MISC] (MADV_DOFORK): Likewise.
[__USE_MISC] (MADV_HWPOISON): Likewise.
[__USE_XOPEN2K] (POSIX_MADV_NORMAL): Likewise.
[__USE_XOPEN2K] (POSIX_MADV_RANDOM): Likewise.
[__USE_XOPEN2K] (POSIX_MADV_SEQUENTIAL): Likewise.
[__USE_XOPEN2K] (POSIX_MADV_WILLNEED): Likewise.
[__USE_XOPEN2K] (POSIX_MADV_DONTNEED): Likewise.
(__MAP_ANONYMOUS): New macro.
[__USE_MISC] (MAP_TYPE): Undefine and redefine after
<bits/mman-linux.h> inclusion.
(MAP_FIXED): Likewise.
(MS_SYNC): Likewise.
(MS_ASYNC): Likewise.
(MS_INVALIDATE): Likewise.
[__USE_MISC] (MADV_MERGEABLE): Likewise.
[__USE_MISC] (MADV_UNMERGEABLE): Likewise.
[__USE_MISC] (MADV_HUGEPAGE): Likewise.
[__USE_MISC] (MADV_NOHUGEPAGE): Likewise.
[__USE_MISC] (MADV_DONTDUMP): Likewise.
[__USE_MISC] (MADV_DODUMP): Likewise.
[__USE_MISC] (MADV_WIPEONFORK): Likewise.
[__USE_MISC] (MADV_KEEPONFORK): Likewise.
The redirection of built-in functions such as sqrt in include/math.h
applies when the wrappers for those functions in libnldbl_nonshared.a
are built, resulting in references to internal names such as
__ieee754_sqrt that aren't actually exported from the shared libm.
(This applies for sqrt in 2.28, also for the round-to-integer
functions in current master because of my changes there.) This patch
arranges for NO_MATH_REDIRECT to be used for all the affected
functions, and adds a test for those functions in
libnldbl_nonshared.a.
(We could of course choose to obsolete libnldbl_nonshared.a and
require that people building with -mlong-double-64 either include the
relevant headers and have a compiler supporting asm redirection, or
have some other means of achieving that redirection at compile time if
not including those headers. But while we have libnldbl_nonshared.a,
it seems appropriate to fix such bugs in it.)
Tested for powerpc, and with build-many-glibcs.py.
[BZ #23735]
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (NO_MATH_REDIRECT):
Define.
* sysdeps/ieee754/ldbl-opt/test-nldbl-redirect.c: New file.
* sysdeps/ieee754/ldbl-opt/Makefile [$(subdir) = math] (tests):
Add test-nldbl-redirect.
[$(subdir) = math] (CFLAGS-test-nldbl-redirect.c): New variable.
[$(subdir) = math] ($(objpfx)test-nldbl-redirect): Depend on
$(objpfx)libnldbl_nonshared.a.
* with -O, -O1, -Os it fails with:
In file included from ../soft-fp/soft-fp.h:318,
from ../sysdeps/ieee754/soft-fp/s_fdiv.c:28:
../sysdeps/ieee754/soft-fp/s_fdiv.c: In function '__fdiv':
../soft-fp/op-2.h:98:25: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized]
X##_f0 = (X##_f1 << (_FP_W_TYPE_SIZE - (N)) | X##_f0 >> (N) \
^~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f1' was declared here
FP_DECL_D (R);
^
../soft-fp/op-2.h:37:36: note: in definition of macro '_FP_FRAC_DECL_2'
_FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT
^
../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL'
# define FP_DECL_D(X) _FP_DECL (2, X)
^~~~~~~~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D'
FP_DECL_D (R);
^~~~~~~~~
../soft-fp/op-2.h:101:17: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized]
: (X##_f0 << (_FP_W_TYPE_SIZE - (N))) != 0)); \
^~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f0' was declared here
FP_DECL_D (R);
^
../soft-fp/op-2.h:37:14: note: in definition of macro '_FP_FRAC_DECL_2'
_FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT
^
../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL'
# define FP_DECL_D(X) _FP_DECL (2, X)
^~~~~~~~
../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D'
FP_DECL_D (R);
^~~~~~~~~
Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64
with -O, -O1, -Os.
For AARCH64 it needs one more fix in locale for -Os.
[BZ #19444]
* sysdeps/ieee754/soft-fp/s_fdiv.c: Include <libc-diag.h> and use
DIAG_PUSH_NEEDS_COMMENT, DIAG_IGNORE_NEEDS_COMMENT and
DIAG_POP_NEEDS_COMMENT to disable -Wmaybe-uninitialized.
Since RTM intrinsics are supported in GCC 4.9, we can use them in
pthread mutex lock elision.
* sysdeps/unix/sysv/linux/x86/Makefile (CFLAGS-elision-lock.c):
Add -mrtm.
(CFLAGS-elision-unlock.c): Likewise.
(CFLAGS-elision-timed.c): Likewise.
(CFLAGS-elision-trylock.c): Likewise.
* sysdeps/unix/sysv/linux/x86/hle.h: Rewritten.
As POSIX states [1] a freopen call should first flush the stream as if by a
call fflush. C99 (n1256) and C11 (n1570) only states the function should
first close any file associated with the specific stream. Although current
implementation only follow C specification, current BSD and other libc
implementation (musl) are in sync with POSIX and fflush the stream.
This patch change freopen{64} to fflush the stream before actually reopening
it (or returning if the stream does not support reopen). It also changes the
Linux implementation to avoid a dynamic allocation on 'fd_to_filename'.
Checked on x86_64-linux-gnu.
[BZ #21037]
* libio/Makefile (tests): Add tst-memstream4 and tst-wmemstream4.
* libio/freopen.c (freopen): Sync stream before reopen and adjust to
new fd_to_filename interface.
* libio/freopen64.c (freopen64): Likewise.
* libio/tst-memstream.h: New file.
* libio/tst-memstream4.c: Likewise.
* libio/tst-wmemstream4.c: Likewise.
* sysdeps/generic/fd_to_filename.h (fd_to_filename): Change signature.
* sysdeps/unix/sysv/linux/fd_to_filename.h (fd_to_filename): Likewise
and remove internal dynamic allocation.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/
The MREMAP_* flags are identical between bits/mman-linux.h and the
hppa bits/mman.h; thus, they should be in bits/mman-shared.h instead
to avoid unnecessary duplication. This patch moves them there.
Tested for x86_64, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/mman-linux.h [__USE_GNU]
(MREMAP_MAYMOVE): Do not define here.
[__USE_GNU] (MREMAP_FIXED): Likewise.
* sysdeps/unix/sysv/linux/bits/mman-shared.h [__USE_GNU]
(MREMAP_MAYMOVE): Define here instead.
[__USE_GNU] (MREMAP_FIXED): Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/mman.h [__USE_GNU]
(MREMAP_MAYMOVE): Remove.
[__USE_GNU] (MREMAP_FIXED): Likewise.
After my changes to move various macros, inlines and other content
from math_private.h to more specific headers, many files including
math_private.h no longer need to do so. Furthermore, since the
optimized inlines of various functions have been moved to
include/fenv.h or replaced by use of function names GCC inlines
automatically, a missing math_private.h include where one is
appropriate will reliably cause a build failure rather than possibly
causing code to be less well optimized while still building
successfully. Thus, this patch removes includes of math_private.h
that are now unnecessary. In the case of two RISC-V files, the
include is replaced by one of stdbool.h because the files in question
were relying on math_private.h to get a definition of bool.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* math/fromfp.h: Do not include <math_private.h>.
* math/s_cacosh_template.c: Likewise.
* math/s_casin_template.c: Likewise.
* math/s_casinh_template.c: Likewise.
* math/s_ccos_template.c: Likewise.
* math/s_cproj_template.c: Likewise.
* math/s_fdim_template.c: Likewise.
* math/s_fmaxmag_template.c: Likewise.
* math/s_fminmag_template.c: Likewise.
* math/s_iseqsig_template.c: Likewise.
* math/s_ldexp_template.c: Likewise.
* math/s_nextdown_template.c: Likewise.
* math/w_log1p_template.c: Likewise.
* math/w_scalbln_template.c: Likewise.
* sysdeps/aarch64/fpu/feholdexcpt.c: Likewise.
* sysdeps/aarch64/fpu/fesetround.c: Likewise.
* sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise.
* sysdeps/aarch64/fpu/ftestexcept.c: Likewise.
* sysdeps/aarch64/fpu/s_llrint.c: Likewise.
* sysdeps/aarch64/fpu/s_llrintf.c: Likewise.
* sysdeps/aarch64/fpu/s_lrint.c: Likewise.
* sysdeps/aarch64/fpu/s_lrintf.c: Likewise.
* sysdeps/i386/fpu/s_atanl.c: Likewise.
* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
* sysdeps/i386/fpu/s_fdim.c: Likewise.
* sysdeps/i386/fpu/s_logbl.c: Likewise.
* sysdeps/i386/fpu/s_rintl.c: Likewise.
* sysdeps/i386/fpu/s_significandl.c: Likewise.
* sysdeps/ia64/fpu/s_matherrf.c: Likewise.
* sysdeps/ia64/fpu/s_matherrl.c: Likewise.
* sysdeps/ieee754/dbl-64/s_atan.c: Likewise.
* sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fma.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise.
* sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise.
* sysdeps/ieee754/k_standardf.c: Likewise.
* sysdeps/ieee754/k_standardl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fma.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise.
* sysdeps/ieee754/s_signgam.c: Likewise.
* sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise.
* sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise.
* sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_floor.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise.
* sysdeps/riscv/rvd/s_finite.c: Likewise.
* sysdeps/riscv/rvd/s_fmax.c: Likewise.
* sysdeps/riscv/rvd/s_fmin.c: Likewise.
* sysdeps/riscv/rvd/s_fpclassify.c: Likewise.
* sysdeps/riscv/rvd/s_isinf.c: Likewise.
* sysdeps/riscv/rvd/s_isnan.c: Likewise.
* sysdeps/riscv/rvd/s_issignaling.c: Likewise.
* sysdeps/riscv/rvf/fegetround.c: Likewise.
* sysdeps/riscv/rvf/feholdexcpt.c: Likewise.
* sysdeps/riscv/rvf/fesetenv.c: Likewise.
* sysdeps/riscv/rvf/fesetround.c: Likewise.
* sysdeps/riscv/rvf/feupdateenv.c: Likewise.
* sysdeps/riscv/rvf/fgetexcptflg.c: Likewise.
* sysdeps/riscv/rvf/ftestexcept.c: Likewise.
* sysdeps/riscv/rvf/s_ceilf.c: Likewise.
* sysdeps/riscv/rvf/s_finitef.c: Likewise.
* sysdeps/riscv/rvf/s_floorf.c: Likewise.
* sysdeps/riscv/rvf/s_fmaxf.c: Likewise.
* sysdeps/riscv/rvf/s_fminf.c: Likewise.
* sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise.
* sysdeps/riscv/rvf/s_isinff.c: Likewise.
* sysdeps/riscv/rvf/s_isnanf.c: Likewise.
* sysdeps/riscv/rvf/s_issignalingf.c: Likewise.
* sysdeps/riscv/rvf/s_nearbyintf.c: Likewise.
* sysdeps/riscv/rvf/s_roundevenf.c: Likewise.
* sysdeps/riscv/rvf/s_roundf.c: Likewise.
* sysdeps/riscv/rvf/s_truncf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of
<math_private.h>.
* sysdeps/riscv/rvf/s_rintf.c: Likewise.
When elf_machine_runtime_setup is called to set up resolver, it should
use _dl_runtime_resolve_shstk or _dl_runtime_profile_shstk if SHSTK is
enabled by kernel.
Tested on i686 with and without --enable-cet as well as on CET emulator
with --enable-cet.
[BZ #23716]
* sysdeps/i386/dl-cet.c: Removed.
* sysdeps/i386/dl-machine.h (_dl_runtime_resolve_shstk): New
prototype.
(_dl_runtime_profile_shstk): Likewise.
(elf_machine_runtime_setup): Use _dl_runtime_profile_shstk or
_dl_runtime_resolve_shstk if SHSTK is enabled by kernel.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
The fallback code of Linux wrapper for preadv2/pwritev2 executes
regardless of the errno code for preadv2, instead of the case where
the syscall is not supported.
This fixes it by calling the fallback code iff errno is ENOSYS. The
patch also adds tests for both invalid file descriptor and invalid
iov_len and vector count.
The only discrepancy between preadv2 and fallback code regarding
error reporting is when an invalid flags are used. The fallback code
bails out earlier with ENOTSUP instead of EINVAL/EBADF when the syscall
is used.
Checked on x86_64-linux-gnu on a 4.4.0 and 4.15.0 kernel.
[BZ #23579]
* misc/tst-preadvwritev2-common.c (do_test_with_invalid_fd): New
test.
* misc/tst-preadvwritev2.c, misc/tst-preadvwritev64v2.c (do_test):
Call do_test_with_invalid_fd.
* sysdeps/unix/sysv/linux/preadv2.c (preadv2): Use fallback code iff
errno is ENOSYS.
* sysdeps/unix/sysv/linux/preadv64v2.c (preadv64v2): Likewise.
* sysdeps/unix/sysv/linux/pwritev2.c (pwritev2): Likewise.
* sysdeps/unix/sysv/linux/pwritev64v2.c (pwritev64v2): Likewise.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __round functions to call the
corresponding round names instead, with asm redirection to __round
when the calls are not inlined.
An additional complication arises in
sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the
result converted to int, gets converted by the compiler to call
lroundl in the case of 32-bit long, so resulting in localplt test
failures. It's logically correct to let the compiler make such an
optimization; an appropriate asm redirection of lroundl to __lroundl
is thus added to that file (it's not needed anywhere else).
Tested for x86_64, and with build-many-glibcs.py.
* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect
using MATH_REDIRECT.
* sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before
header inclusion.
* sysdeps/aarch64/fpu/s_roundf.c: Likewise.
* sysdeps/ieee754/dbl-64/s_round.c: Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise.
* sysdeps/ieee754/float128/s_roundf128.c: Likewise.
* sysdeps/ieee754/flt-32/s_roundf.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
* sysdeps/riscv/rvf/s_roundf.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise.
(round): Redirect to __round.
(__roundl): Call round instead of __round.
* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round):
Remove macro.
[_ARCH_PWR5X] (__roundf): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round
functions instead of __round variants.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive):
Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive):
Likewise.
* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to
__lroundl.
(__ieee754_expl): Call roundl instead of __roundl.
Continuing bits/mman.h unification between architectures using the
Linux kernel, this patch arranges for the common set of MAP_* flags to
be used by two more architectures. That common set is moved to
bits/mman-map-flags-generic.h, which is included by bits/mman.h, to
allow architectures to use that common set even if they also have
architecture-specific additions to it. As well as the generic
bits/mman.h, the versions for x86 and ia64 are also then made to
include bits/mman-map-flags-generic.h, so while they still need
architecture-specific bits/mman.h (for MAP_32BIT and MAP_GROWSUP
respectively), they do not need to duplicate the generic flag
definitions in there.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/mman-map-flags-generic.h: New
file. Most contents moved from ....
* sysdeps/unix/sysv/linux/bits/mman.h: ... here. Move contents to
and include <bits/mman-map-flags-generic.h>.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/mman-map-flags-generic.h.
* sysdeps/unix/sysv/linux/ia64/bits/mman.h: Include
<bits/mman-map-flags-generic.h>.
[__USE_MISC] (MAP_GROWSUP): Only define this macro, not other
macros defined in <bits/mman-map-flags-generic.h>.
* sysdeps/unix/sysv/linux/x86/bits/mman.h: Include
<bits/mman-map-flags-generic.h>.
[__USE_MISC] (MAP_32BIT): Only define this macro, not other macros
defined in <bits/mman-map-flags-generic.h>.
This patch completes the process of unifying sys/procfs.h headers for
architectures using the Linux kernel by making alpha use the generic
version.
That was previously deferred because alpha has different definitions
of prgregset_t and prfpregset_t from other architectures, so changing
to the common definitions would change C++ name mangling. To avoid
such a change, a header bits/procfs-prregset.h is added, and alpha
gets its own version of that header.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/sys/procfs.h: Include
<bits/procfs-prregset.h>.
(prgregset_t): Define using __prgregset_t.
(prfpregset_t): Define using __prfpregset_t.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/procfs-prregset.h.
* sysdeps/unix/sysv/linux/bits/procfs-prregset.h: New file.
* sysdeps/unix/sysv/linux/alpha/bits/procfs-prregset.h: Likewise.
* sysdeps/unix/sysv/linux/alpha/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/alpha/sys/procfs.h: Remove file.
This patch continues the process of unifying sys/procfs.h headers for
architectures using the Linux kernel.
A bits/procfs-id.h header is added to define __pr_uid_t and __pr_gid_t
for the types of pr_uid and pr_gid; the default version of this header
uses unsigned int. On some architectures, sys/procfs.h has copies of
32-bit structures for 64-bit builds; those move into a
bits/procfs-extra.h header (they can't go in bits/procfs.h because
they have to come *after* other declarations from sys/procfs.h).
Given appropriate versions of these headers, six more architectures
can then move to providing only bits/procfs*.h without duplicating the
rest of the contents of sys/procfs.h. Only alpha needs a further
bits/ header to be added before it can stop having its own
sys/procfs.h.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/sys/procfs.h: Include
<bits/procfs-id.h> and <bits/procfs-extra.h>.
(struct elf_prpsinfo): Use __pr_uid_t and __pr_gid_t as types of
pr_uid and pr_gid.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/procfs-id.h and bits/procfs-extra.h.
* sysdeps/unix/sysv/linux/bits/procfs-extra.h: New file.
* sysdeps/unix/sysv/linux/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/arm/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/arm/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/procfs-extra.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/procfs-extra.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/arm/sys/procfs.h: Remove file.
* sysdeps/unix/sysv/linux/m68k/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/s390/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sh/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/x86/sys/procfs.h: Likewise.
As per recent discussions, this patch unifies some of the sys/procfs.h
headers for architectures using the Linux kernel, producing a generic
version that can hopefully be used by all new architectures as well.
The new generic version is based on the AArch64 one. The register
definitions, the only part that generally needs to vary by
architecture, go in a new bits/procfs.h header (which each
architecture using the generic version needs to provide); that header
also has any #includes that were in the architecture-specific
sys/procfs.h, where those includes went beyond the generic set.
The generic version is used for eight architectures where the generic
definitions were the same as the architecture-specific ones. (Some of
those architectures had #if 0 fields, now removed; some defined types
or fields using different type names which were typedefs for the same
underlying types.)
Six of the remaining architectures with their own sys/procfs.h use
unsigned short for pr_uid / pr_gid in some cases; moving those to the
generic header will require a bits/ header to define a typedef for the
type of those fields. In the case of alpha, the generic sys/procfs.h
uses elf_gregset_t (= unsigned long int[33]) to define prgregset_t and
elf_fpregset_t (= double[32]) to define prfpregset_t, but the alpha
version uses gregset_t (= long int[33]) and fpregset_t (= long
int[32]), so avoiding unnecessarily changing the underlying types (and
thus C++ name mangling) again means a bits/ header will need to be
able to define a different choice for those typedefs.
bits/procfs.h is included outside the __BEGIN_DECLS / __END_DECLS pair
(whereas the definitions it contains were previously inside that pair
in various sys/procfs.h headers), because it sometimes includes other
headers and putting those other #includes inside that pair seems
risky. Because none of the declarations in bits/procfs.h are of
functions or variables or involve function types, I don't think it
makes any difference whether they are inside or outside an extern "C"
context.
Tested with build-many-glibcs.py (again, that does not provide much
validation for the correctness of this patch).
* sysdeps/unix/sysv/linux/sys/procfs.h: Replace with file based on
AArch64 version. Include <bits/procfs.h>.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/procfs.h.
* sysdeps/unix/sysv/linux/bits/procfs.h: New file.
* sysdeps/unix/sysv/linux/aarch64/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/aarch64/sys/procfs.h: Remove file.
* sysdeps/unix/sysv/linux/hppa/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/mips/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/sys/procfs.h: Likewise.
Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
instead it suspend and resume it when leaving the kernel. The
side-effects of the syscall will always remain visible, even if the
transaction is aborted. This is an issue when transaction is used along
with futex syscall, on pthread_cond_wait for instance, where the futex
call might succeed but the transaction is rolled back leading the
pthread_cond object in an inconsistent state.
Glibc used to prevent it by always aborting a transaction before issuing
a syscall. Linux 4.2 also decided to abort active transaction in
syscalls which makes the glibc workaround superfluous. Worse, glibc
transaction abortion leads to a performance issue on recent kernels
where the HTM state is saved/restore lazily (v4.9). By aborting a
transaction on every syscalls, regardless whether a transaction has being
initiated before, GLIBS makes the kernel always save/restore HTM state
(it can not even lazily disable it after a certain number of syscall
iterations).
Because of this shortcoming, Transactional Lock Elision is just enabled
when it has been explicitly set (either by tunables of by a configure
switch) and if kernel aborts HTM transactions on syscalls
(PPC_FEATURE2_HTM_NOSC). It is reported that using simple benchmark [1],
the context-switch is about 5% faster by not issuing a tabort in every
syscall in newer kernels.
Checked on powerpc64le-linux-gnu with 4.4.0 kernel (Ubuntu 16.04).
* NEWS: Add note about new TLE support on powerpc64le.
* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
__ununsed1.
(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
* sysdeps/powerpc/powerpc32/sysdep.h,
sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
ABORT_TRANSACTION): Remove macros.
* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
usage.
* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.
Reported-by: Breno Leitão <leitao@debian.org>
I noticed that sysdeps/x86/cpu-features.h had conditionals on whether
to define HAS_CPUID, HAS_I586 and HAS_I686 with a long list of
preprocessor macros for i686-and-later processors which however was
out of date. This patch avoids the problem of the list getting out of
date by instead having conditionals on all the (few, old) pre-i686
processors for which GCC has preprocessor macros, rather than the
(many, expanding list) i686-and-later processors. It seems HAS_I586
and HAS_I686 are unused so the only effect of these macros being
missing is that 32-bit glibc built for one of these processors would
end up doing runtime detection of CPUID availability.
i386 builds are prevented by a configure test so there is no need to
allow for them here. __geode__ (no long nops?) and __k6__ (no CMOV,
at least according to GCC) are conservatively handled as i586, not
i686, here (as noted above, this is a theoretical distinction at
present in that only HAS_CPUID appears to be used).
Tested for x86.
* sysdeps/x86/cpu-features.h [__geode__ || __k6__]: Handle like
[__i586__ || __pentium__].
[__i486__]: Handle explicitly.
(HAS_CPUID): Define to 1 if above macros are undefined.
(HAS_I586): Likewise.
(HAS_I686): Likewise.
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent). The __exp1 internal symbol
is no longer necessary.
There is separate code path when fma is not available but the worst case
error is about 0.54 ULP in both cases. The lookup table and consts for
log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on
aarch64. The non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.
* NEWS: Mention pow improvements.
* math/Makefile (type-double-routines): Add e_pow_log_data.
* sysdeps/generic/math_private.h (__exp1): Remove.
* sysdeps/i386/fpu/e_pow_log_data.c: New file.
* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
contraction.
* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
(exp_inline): Remove.
(__ieee754_exp): Only single double input is handled.
* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
(__pow_log_data): Define.
* sysdeps/ieee754/dbl-64/upow.h: Remove.
* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
contraction.
(CFLAGS-e_pow-fma4.c): Likewise.
Many bits/mman.h headers for Linux architectures have exactly the same
contents, up to whitespace, comments and the number of leading 0s on
constants. Specifically, this applies to architectures that, in the
Linux kernel, either have no uapi/asm/mman.h, or have one that
includes asm-generic/mman.h without any changes or additions relevant
to glibc (this last case is the one that applies to Arm).
It's not useful to have to duplicate the set of MAP_* constants in
glibc for all such architectures and any new architectures with that
property. Thus, this patch creates a generic
sysdeps/unix/sysv/linux/bits/mman.h and removes all the
architecture-specific versions that become unnecessary.
Further unification remains possible after this patch. For example,
the new bits/mman.h could become bits/mman-map-flags-generic.h so that
it could also be used by architecture-specific bits/mman.h headers on
architectures that use the generic flags but add architecture-specific
ones to them. That would allow this common set of MAP_* definitions
to be used on ia64 and x86 as well (architectures that include
asm-generic/mman.h from their own uapi/asm/mman.h but define
additional MAP_* values of their own).
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/mman.h: New file.
* sysdeps/unix/sysv/linux/aarch64/bits/mman.h: Remove.
* sysdeps/unix/sysv/linux/arm/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/mman.h: Likewise.
The ldbl-128ibm implementations of ceill and floorl call the
corresponding double functions. This patch fixes those
implementations to call those functions as ceil and floor rather than
as __ceil and __floor, so that the proper inlining takes place when
possible, while including local asm redirections for when the
functions are not inlined since NO_MATH_REDIRECT applies to the double
functions as well as to the long double ones.
Tested with build-many-glibcs.py for all its powerpc configurations.
* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (ceil): Redirect to
__ceil.
(__ceill): Call ceil instead of __ceil.
* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (floor): Redirect to
__floor.
(__floorl): Call floor instead of __floor.