The tile version of math_private.h defines some inline functions for
fenv.h functions, to optimize away internal calls to these functions
that do nothing given no support for floating-point exceptions and
rounding modes. (Some functions may have error cases for invalid
arguments, but those aren't applicable to the internal calls from
within glibc.) Other configurations lacking support for exceptions
and rounding modes lack such inline functions. This patch moves them
to the generic math_private.h, appropriately conditioned, so that all
such configurations can benefit from the.
include/fenv.h is made to check whether there are any non-default
rounding modes; that needs to be done there, rather than later,
because get-rounding-mode.h defines values for otherwise unsupported
FE_* rounding modes. It also gives an error for FE_TONEAREST
undefined, a case that already did not work for building the glibc
testsuite; the convention has by now been established that all
architectures need to provide a version of bits/fenv.h that at least
defines FE_TONEAREST.
Tested with build-many-glibcs.py. As expected, installed stripped
shared libraries are unchanged for tile and for architectures
supporting exceptions and rounding modes, but changed for non-tile
architectures not supporting exceptions and rounding modes that
previously lacked this optimization (e.g. Nios II libm.so is about 1kB
smaller).
The optimization is not in fact complete (does not cover feholdexcept
/ __feholdexcept, so a few calls to those remain unnecessarily within
libm even after this patch), but that can be dealt with separately.
* include/fenv.h [!_ISOMAC && !FE_TONEAREST]: Give #error.
[!_ISOMAC] (FE_HAVE_ROUNDING_MODES): New macro.
* sysdeps/generic/math_private.h
[!FE_HAVE_ROUNDING_MODES && FE_ALL_EXCEPT == 0] (fegetenv): New
inline function.
[!FE_HAVE_ROUNDING_MODES && FE_ALL_EXCEPT == 0] (__fegetenv):
Likewise.
[!FE_HAVE_ROUNDING_MODES && FE_ALL_EXCEPT == 0] (fesetenv):
Likewise.
[!FE_HAVE_ROUNDING_MODES && FE_ALL_EXCEPT == 0] (__fesetenv):
Likewise.
[!FE_HAVE_ROUNDING_MODES && FE_ALL_EXCEPT == 0] (feupdateenv):
Likewise.
[!FE_HAVE_ROUNDING_MODES && FE_ALL_EXCEPT == 0] (__feupdateenv):
Likewise.
[!FE_HAVE_ROUNDING_MODES] (fegetround): Likewise.
[!FE_HAVE_ROUNDING_MODES] (__fegetround): Likewise.
[!FE_HAVE_ROUNDING_MODES] (fesetround): Likewise.
[!FE_HAVE_ROUNDING_MODES] (__fesetround): Likewise.
* sysdeps/tile/math_private.h (fegetenv): Remove inline function.
(__fegetenv): Likewise.
(fesetenv): Likewise.
(__fesetenv): Likewise.
(feupdateenv): Likewise.
(__feupdateenv): Likewise.
(fegetround): Likewise.
(__fegetround): Likewise.
(fesetround): Likewise.
(__fesetround): Likewise.
Various configurations lacking support for floating-point exceptions
and rounding modes have a math_private.h that overrides certain
functions and macros, internal and external, to avoid references to
FE_* constants that are undefined in those configurations. For
example, there are unconditional feraiseexcept (FE_INVALID) calls in
generic libm code, and these macro definitions duly define
feraiseexcept to ignore its argument to avoid an error from FE_INVALID
being undefined.
In fact it is easy to tell in an architecture-independent way whether
this is needed, by testing whether FE_ALL_EXCEPT == 0. Thus, this
patch puts such a test, and feraiseexcept and __feraiseexcept macros,
in the generic math_private.h, so reducing the duplication between
architecture versions of this header. The feclearexcept macro present
in several versions of this header, and fetestexcept in the tile
version, are not needed; they would have been needed before there were
proper soft-fp fma implementations (when generic versions, that depend
on FE_TOWARDZERO and FE_INEXACT, were being used for configurations
not supporting those features), but aren't needed any more, and so are
removed.
The tile version of this header has several inline functions for
fenv.h functions to optimize calls to them away in such configurations
where they do nothing useful, and all these header versions also have
definitions of some of the libc_fe* internal macros. I intend to make
those generic in subsequent patches.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
* sysdeps/generic/math_private.h [FE_ALL_EXCEPT == 0]
(feraiseexcept): New macro.
[FE_ALL_EXCEPT == 0] (__feraiseexcept): Likewise.
* sysdeps/m68k/coldfire/nofpu/math_private.h (feraiseexcept):
Remove macro.
(__feraiseexcept): Likewise.
(feclearexcept): Likewise.
* sysdeps/microblaze/math_private.h (feraiseexcept): Likewise.
(__feraiseexcept): Likewise.
(feclearexcept): Likewise.
* sysdeps/nios2/math_private.h (feraiseexcept): Likewise.
(__feraiseexcept): Likewise.
(feclearexcept): Likewise.
* sysdeps/tile/math_private.h (feraiseexcept): Likewise.
(__feraiseexcept): Likewise.
(feclearexcept): Likewise.
(fetestexcept): Likewise.
Since I've been fixing build issues for ColdFire, this patch adds a
math-tests.h file for ColdFire, reflecting the lack of support for
exceptions and rounding modes for soft float. I think it is logically
correct, but have not tested it beyond build-many-glibcs.py for both
hard and soft float.
* sysdeps/m68k/coldfire/math-tests.h: New file.
The m68k bits/fenv.h is in sysdeps/m68k/fpu/, meaning that no-FPU
ColdFire instead gets the generic (top-level) bits/fenv.h.
That top-level bits/fenv.h defines no rounding mode constants. That
no longer works for building glibc tests: some tests fail to build (at
least with warnings) if no rounding mode macros are defined, so at
least FE_TONEAREST must be defined in all cases (as various
architectures without rounding mode support indeed do), while
__FE_UNDEFINED must be defined in the case where not all the standard
rounding modes are supported.
On general principles of supporting multilib toolchains with a single
set of headers shared between multilibs for a given architecture, it's
also desirable for the same bits/fenv.h header to work for both FPU
and no-FPU configurations. Thus, this patch moves the m68k
bits/fenv.h to sysdeps/m68k/bits/fenv.h, and inserts appropriate
conditionals to handle the no-FPU case. All the exception macros, and
FE_NOMASK_ENV, are disabled in the no-FPU case; FE_ALL_EXCEPT is
defined to 0 in that case. All rounding modes except FE_TONEAREST are
disabled in that case, and __FE_UNDEFINED is defined accordingly. To
avoid an unnecessary ABI change, fenv_t is defined in the no-FPU case
to match the definition it would have got from the generic
bits/fenv.h.
This suffices to get a clean glibc and testsuite build for this
configuration with build-many-glibcs.py (and keeps a clean build for
the other m68k configurations); it has not been otherwise tested.
* sysdeps/m68k/fpu/bits/fenv.h: Move to ....
* sysdeps/m68k/bits/fenv.h: ... here.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_INEXACT): Do
not define.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_DIVBYZERO):
Likewise.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_UNDERFLOW):
Likewise.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_OVERFLOW):
Likewise.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_INVALID):
Likewise.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_ALL_EXCEPT):
Define to 0.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__]
(__FE_UNDEFINED): New enum constant.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_TOWARDZERO):
Do not define.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_DOWNWARD):
Likewise.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_UPWARD):
Likewise.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (fenv_t): Define
to match generic bits/fenv.h.
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_NOMASK_ENV):
Do not define.
As reported in bug 21314, building log1p and log1pf fails with -Os
because of a spurious -Wmaybe-uninitialized warning (reported there
for GCC 5 for MIPS, I see it also with GCC 7 for x86_64). This patch,
based on the patches in the bug, fixes this using the DIAG_* macros.
Tested for x86_64 with -Os that this eliminates those warnings and so
allows the build to progress further.
2018-02-01 Carlos O'Donell <carlos@redhat.com>
Ramin Seyed-Moussavi <lordrasmus@gmail.com>
Joseph Myers <joseph@codesourcery.com>
[BZ #21314]
* sysdeps/ieee754/dbl-64/s_log1p.c: Include <libc-diag.h>.
(__log1p): Disable -Wmaybe-uninitialized for -Os around
computation using c.
* sysdeps/ieee754/flt-32/s_log1pf.c: Include <libc-diag.h>.
(__log1pf): Disable -Wmaybe-uninitialized for -Os around
computation using c.
We don't have support for hp timing for now, even the i686 variant, which needs
to know the CPU speed.
Copied from sysdeps/generic/hp-timing.h
* sysdeps/mach/hurd/hp-timing.h: New file.
* hurd/Versions: Fix version when _hurd_exec_paths was added.
* mach/Versions: Fix version when __mach_host_self_ was added.
* sysdeps/mach/hurd/i386/ld.abilist: New file.
* sysdeps/mach/hurd/i386/libBrokenLocale.abilist: New file.
* sysdeps/mach/hurd/i386/libanl.abilist: New file.
* sysdeps/mach/hurd/i386/libc.abilist: New file.
* sysdeps/mach/hurd/i386/libcrypt.abilist: New file.
* sysdeps/mach/hurd/i386/libdl.abilist: New file.
* sysdeps/mach/hurd/i386/libm.abilist: New file.
* sysdeps/mach/hurd/i386/libnsl.abilist: New file.
* sysdeps/mach/hurd/i386/libresolv.abilist: New file.
* sysdeps/mach/hurd/i386/librt.abilist: New file.
* sysdeps/mach/hurd/i386/libutil.abilist: New file.
This contains a definition of __IPC_64 that matches the RISC-V Linux
ABI.
2018-01-29 Darius Rad <darius@bluespec.com>
* sysdeps/unix/sysv/linux/riscv/ipc_priv.h: New file.
This patch lays out the top-level orginazition of the RISC-V port. It
contains all the Implies files as well as various other fragments of
build infastructure for the RISC-V port. This contains the only change
to a shared file: config.h.in.
RISC-V is a family of base ISAs with optional extensions. The base ISAs
are RV32I and RV64I, which are 32-bit and 64-bit integer-only ISAs, but
this port currently only supports RV64I based systems. Support for
RISC-V lives in in sysdeps/riscv. In addition to these ISAs, our glibc
port supports most of the currently-defined extensions: the A extension
for atomics, the M extension for multiplication, the C extension for
compressed instructions, and the F/D extensions for single/double
precision IEEE floating-point. Most of these extensions are handled by
GCC, but glibc defines various floating-point wrappers and emulation
routines as well as some atomic wrappers.
We support running glibc-based programs on Linux, the support for which
lives in sysdeps/unix/sysv/linux/riscv.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/Implies: New file.
* sysdeps/riscv/Makefile: Likewise.
* sysdeps/riscv/configure: Likewise.
* sysdeps/riscv/configure.ac: Likewise.
* sysdeps/riscv/nptl/Makefile: Likewise.
* sysdeps/riscv/preconfigure: Likewise.
* sysdeps/riscv/rv64/Implies-after: Likewise.
* sysdeps/riscv/rv64/rvd/Implies: Likewise.
* sysdeps/riscv/rv64/rvf/Implies: Likewise.
* sysdeps/unix/sysv/linux/riscv/Implies: Likewise.
* sysdeps/unix/sysv/linux/riscv/Makefile: Likewise.
* sysdeps/unix/sysv/linux/riscv/Versions: Likewise.
* sysdeps/unix/sysv/linux/riscv/configure: Likewise.
* sysdeps/unix/sysv/linux/riscv/configure.ac: Likewise.
* sysdeps/unix/sysv/linux/riscv/ldd-rewrite.sed: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/Implies: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/Makefile: Likewise.
* sysdeps/unix/sysv/linux/riscv/shlib-versions: Likewise.
I started with the aarch64 ABI lists and manually went through each
difference, ensuring that the missing entries had been deprecated along
the line. Darius generated the ulps files by running the test cases on QEMU.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/nofpu/libm-test-ulps: New file.
* sysdeps/riscv/nofpu/libm-test-ulps-name: Likewise.
* sysdeps/riscv/rv64/rvd/libm-test-ulps: Likewise.
* sysdeps/riscv/rv64/rvd/libm-test-ulps-name: Likewise.
* sysdeps/unix/sysv/linux/riscv/localplt.data: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/c++-types.data: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libanl.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libdl.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libnsl.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/librt.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libutil.abilist: Likewise.
This contains the Linux-specific code for loading programs on RISC-V.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/unix/sysv/linux/riscv/dl-static.c: New file.
* sysdeps/unix/sysv/linux/riscv/ldconfig.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/ldsodefs.h: Likewise.
Contains the Linux system call interface, as well as the definitions of
a handful of system calls.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/nptl/nptl-sysdep.S: New file.
* sysdeps/unix/sysv/linux/riscv/arch-fork.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/clone.S: Likewise.
* sysdeps/unix/sysv/linux/riscv/profil-counter.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/pt-vfork.S: Likewise.
* sysdeps/unix/sysv/linux/riscv/syscall.c: Likewise.
* sysdeps/unix/sysv/linux/riscv/sysdep.S: Likewise.
* sysdeps/unix/sysv/linux/riscv/sysdep.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/vfork.S: Likewise.
This patch implements various atomic and locking routines on RISC-V. We
mandate the A extension on Linux-capable RISC-V systems, so this can
rely on always having the various atomic instructions availiable.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/nptl/bits/pthreadtypes-arch.h: New file.
* sysdeps/riscv/nptl/bits/semaphore.h: Likewise.
* sysdeps/riscv/nptl/libc-lowlevellock.c: Likewise.
* sysdeps/unix/sysv/linux/riscv/atomic-machine.h: Likewise.
This patch contains the miscellaneous math routines and headers we have
implemented for RISC-V. This includes things from <math.h> that aren't
completely ISA-generic, floating-point bit manipulation, and soft-fp
hooks.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/bits/fenv.h: New file.
* sysdeps/riscv/e_sqrtl.c: Likewise.
* sysdeps/riscv/fpu_control.h: Likewise.
* sysdeps/riscv/math-tests.h: Likewise.
* sysdeps/riscv/nofpu/Implies: Likewise.
* sysdeps/riscv/sfp-machine.h: Likewise.
* sysdeps/riscv/tininess.h: Likewise.
This patch implements TLS support for RISC-V. We support all four
standard TLS addressing modes (LE, IE, LD, and GD) when running on
Linux via NPTL. There is a draft psABI document that defines our TLS
ABI here
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#thread-local-storage
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/dl-tls.h: New file.
* sysdeps/riscv/libc-tls.c: Likewise.
* sysdeps/riscv/nptl/tcb-offsets.sym: Likewise.
* sysdeps/riscv/nptl/tls.h: Likewise.
* sysdeps/riscv/stackinfo.h: Likewise.
This patch contains code that needs to directly know about the RISC-V
ABI, which is specified in a work-in-progress psABI document:
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md
This is meant to contain all the RISC-V code that needs to explicitly
name registers or manage in-memory structure layout. This does not
contain any of the Linux-specific code.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/riscv/__longjmp.S: New file.
* sysdeps/riscv/backtrace.c: Likewise.
* sysdeps/riscv/bits/endian.h: Likewise.
* sysdeps/riscv/bits/setjmp.h: Likewise.
* sysdeps/riscv/bits/wordsize.h: Likewise.
* sysdeps/riscv/bsd-_setjmp.c: Likewise.
* sysdeps/riscv/bsd-setjmp.c: Likewise.
* sysdeps/riscv/dl-trampoline.S: Likewise.
* sysdeps/riscv/gccframe.h: Likewise.
* sysdeps/riscv/jmpbuf-offsets.h: Likewise.
* sysdeps/riscv/jmpbuf-unwind.h: Likewise.
* sysdeps/riscv/machine-gmon.h: Likewise.
* sysdeps/riscv/memusage.h: Likewise.
* sysdeps/riscv/setjmp.S: Likewise.
* sysdeps/riscv/sys/asm.h: Likewise.
* sysdeps/riscv/tls-macros.h: Likewise.
The RISC-V port contains a crti.S that simply contains a link to
PREINIT_FUNCTION (when defined). As this should be entirely generic,
Joseph Myers suggested that we update the generic init_array version to
contain this. Since RISC-V is the only user of init_array this won't
break any existing ports.
2018-01-29 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/init_array/crti.S (.section .init_array): Add
PREINIT_FUNCTION when defined.
copy_file_range syscall was added for microblaze in 4.10.
This patch makes the MicroBlaze kernel-features.h undefine
__ASSUME_COPY_FILE_RANGE for toolchains built with kernel headers < 4.10.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
(__ASSUME_COPY_FILE_RANGE) [__LINUX_KERNEL_VERSION < 0x040A00]: Undef.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=7181e5590e5ba898804aef3ee6be7f27606e6f8b
Signed-off-by: Romain Naour <romain.naour@gmail.com>
* sysdeps/mach/hurd/net/ethernet.h: Include <stdint.h>.
* sysdeps/mach/hurd/net/if_arp.h: Include <stdint.h>.
* sysdeps/mach/hurd/net/if_ppp.h: Do not include non-existing
<net/ppp_defs.h>.
_POSIX_CHOWN_RESTRICTED and _POSIX_NO_TRUNC should be always defined.
* sysdeps/mach/hurd/bits/posix_opt.h (_POSIX_CHOWN_RESTRICTED,
_POSIX_NO_TRUNC): Define to 0.
400669754d ('hurd: Fix nscd build') had the side effect of making
libc's freeaddrinfo expose freeifaddrs through __check_pf. We can just
move the renames to gai.c itself, along others.
* sysdeps/mach/hurd/check_pf.c (__getifaddrs, __freeifaddrs): Do not
define macros.
* nscd/gai.c (__getifaddrs): Define macro to getifaddrs.
(__freeifaddrs): Define macro to freeifaddrs.
* hurd/hurd.h (__hurd_fail): Always declare function, and provide inline
version only if __USE_EXTERN_INLINES is defined.
* hurd/hurd/fd.h (_hurd_fd_error_signal, _hurd_fd_error, __hurd_dfail,
__hurd_sockfail): Likewise.
(_hurd_fd_get): Always declare functions, and provide inline versions
only if __USE_EXTERN_INLINES and _LIBC are defined and IS_IN(libc).
* hurd/hurd/port.h (_hurd_port_init, _hurd_port_locked_get,
_hurd_port_get, _hurd_port_free, _hurd_port_locked_set,
_hurd_port_set): Always declare functions, and provide inline versions
only if __USE_EXTERN_INLINES and _LIBC are defined and
IS_IN(libc).
* hurd/hurd/signal.h (_hurd_self_sigstate, _hurd_critical_section_lock,
_hurd_critical_section_unlock): Likewise.
* hurd/hurd/threadvar.h (__hurd_threadvar_location_from_sp,
* __hurd_threadvar_location): Likewise.
* hurd/hurd/userlink.h (_hurd_userlink_link, _hurd_userlink_unlink,
_hurd_userlink_clear): Likewise.
* mach/lock-intern.h (__spin_lock_init, __spin_lock, __mutex_lock,
__mutex_unlock, __mutex_trylock): Always declare functions, and provide
inline versions only if __USE_EXTERN_INLINES and _LIBC are defined.
* mach/mach/mig_support.h (__mig_strncpy): Likewise.
* sysdeps/generic/machine-lock.h (__spin_unlock, __spin_try_lock,
__spin_lock_locked): Likewise.
* sysdeps/mach/i386/machine-lock.h (__spin_unlock, __spin_try_lock,
__spin_lock_locked): Likewise.
* mach/spin-lock.c (__USE_EXTERN_INLINES): Define to 1.
* hurd/Versions (libc: GLIBC_2.27): Add _hurd_fd_error_signal,
_hurd_fd_error, __hurd_dfail, __hurd_sockfail, _hurd_port_locked_set,
__hurd_threadvar_location_from_sp, __hurd_threadvar_location,
_hurd_userlink_link, _hurd_userlink_unlink, _hurd_userlink_clear.
* sysdeps/mach/hurd/getresgid.c (__getresgid): Set result from
critical section to make code simpler and avoid warning.
* sysdeps/mach/hurd/getresuid.c (__getresuid): Set result from
critical section to make code simpler and avoid warning.
Making `special_profil_failure' both avoids warning "variable
'special_profil_failure' set but not used", and makes it easier to
access with gdb.
* sysdeps/mach/hurd/profil.c (special_profil_failure): Move variable
to global scope.
This was dropped from GNU Mach in 2006.
* mach/Machrules (MIGFLAGS): Do not set -DMACH_IPC_COMPAT=0.
* mach/mach/mach_traps.h: Drop comment about MACH_IPC_COMPAT.
* sysdeps/mach/hurd/fork.c (__fork): Drop special casing
MACH_IPC_COMPAT.
timer_ptr2id and timer_id2ptr are used to convert between
application-visible timer_t and struct timer_node *. timer_ptr2id was made
to use void * instead of timer_t in 49b650430e ('Update.') for no reason.
It happens that on Linux timer_t is void *, so both that change and this
commit are no-ops there, but not on systems where timer_t is not void *.
Using timer_ptr2id for filling sival_ptr also does not make sense since that
actually is a void *.
* sysdeps/pthread/posix-timer.h (timer_ptr2id): Cast to timer_t
instead of void *.
* sysdeps/pthread/timer_create.c (timer_create): Do not use
timer_ptr2id to cast struct timer_node * to void *.
In commit cba595c350 and commit
f81ddabffd, ABI compatibility with
applications was broken by increasing the size of the on-stack
allocated __pthread_unwind_buf_t beyond the oringal size.
Applications only have the origianl space available for
__pthread_unwind_register, and __pthread_unwind_next to use,
any increase in the size of __pthread_unwind_buf_t causes these
functions to write beyond the original structure into other
on-stack variables leading to segmentation faults in common
applications like vlc. The only workaround is to version those
functions which operate on the old sized objects, but this must
happen in glibc 2.28.
Thank you to Andrew Senkevich, H.J. Lu, and Aurelien Jarno, for
submitting reports and tracking the issue down.
The commit reverts the above mentioned commits and testing on
x86_64 shows that the ABI compatibility is restored. A tst-cleanup1
regression test linked with an older glibc now passes when run
with the newly built glibc. Previously a tst-cleanup1 linked with
an older glibc would segfault when run with an affected glibc build.
Tested on x86_64 with no regressions.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
The RISC-V port defines ELF flags that enforce compatibility between
various objects. This adds the shared support necessary for these
flags.
2018-01-25 Palmer Dabbelt <palmer@sifive.com>
* elf/cache.c (print_entry): Add FLAG_RISCV_FLOAT_ABI_SOFT and
FLAG_RISCV_FLOAT_ABI_DOUBLE.
* elf/elf.h (EF_RISCV_RVC): New define.
(EF_RISCV_FLOAT_ABI): Likewise.
(EF_RISCV_FLOAT_ABI_SOFT): Likewise.
(EF_RISCV_FLOAT_ABI_SINGLE): Likewise.
(EF_RISCV_FLOAT_ABI_DOUBLE): Likewise.
(EF_RISCV_FLOAT_ABI_QUAD): Likewise.
* sysdeps/generic/ldconfig.h (FLAG_RISCV_FLOAT_ABI_SOFT): New
define.
(FLAG_RISCV_FLOAT_ABI_DOUBLE): Likewise.
The arguments of the LIBC_SLIBDIR_RTLDDIR macro are used both in unquoted
and single quoted context, so that neither shell nor makefile variable
references work. Consistently put them in single quotes so that they can
refer to makefile variables.
The sole failure for ColdFire in the compilation part of the glibc
testsuite is the localplt test. This patch adds a localplt baseline
for ColdFire to eliminate that failure. The difference from the
existing m68k baseline is that no PLT entry for _Unwind_Find_FDE is
expected, because ColdFire does not set
libc_cv_gcc_unwind_find_fde=yes.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/m68k/localplt.data: Move to ....
* sysdeps/unix/sysv/linux/m68k/m680x0/localplt.data: ... here.
* sysdeps/unix/sysv/linux/m68k/coldfire/localplt.data: New file.
As with some other soft-float configurations, no-FPU ColdFire needs
various fenv.h functions and glibc-internal macros overridden in
math_private.h to avoid references to undefined FE_* macros when
building glibc. This patch adds a suitable math_private.h, based on
the MicroBlaze one (Nios II and Tile also have similar files).
There's a case for having such a file in sysdeps/ieee754/soft-fp so
this logic is applied more generally to configurations without
exceptions and rounding modes, even when the relevant macros are
defined in fenv.h - the only case where that might be inappropriate is
ARM soft-float (where the fenv.h functions might or might not work at
runtime, depending on whether the processor used at runtime supports
VFP). There's also a case that soft-float configurations (on
processors with both hard-float and soft-float) should more
consistently avoid defining FE_* macros in bits/fenv.h when not
actually supported. But both of those are separate potential
cleanups.
This allows the no-FPU ColdFire build to get further (another fix is
needed to allow the build to complete).
* sysdeps/m68k/coldfire/nofpu/math_private.h: New file. Based on
MicroBlaze file.
Continuing the fixes for ColdFire glibc build with
build-many-glibcs.py, given a GCC patch for the libgcc build failure,
this patch adds jmp_buf-macros.h for no-FPU ColdFire. This allows the
no-FPU build to progress further than without the patch (although
other fixes are still needed for the build to complete).
* sysdeps/unix/sysv/linux/m68k/coldfire/jmp_buf-macros.h: Move to
....
* sysdeps/unix/sysv/linux/m68k/coldfire/fpu/jmp_buf-macros.h:
... here.
* sysdeps/unix/sysv/linux/m68k/coldfire/nofpu/jmp_buf-macros.h:
New file.
This patch adds a jmp_buf-macros.h for ColdFire. In conjunction with
a GCC patch to fix the libgcc build failure for ColdFire
<https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02064.html> this
suffices to restore the build (tested with build-many-glibcs.py). A
further patch will be needed for soft-float ColdFire (while the
function-calling ABI is the same for hard-float and soft-float
ColdFire, it turns out the glibc ABI is not - so another ColdFire
variant will be needed in build-many-glibcs.py), but I'll deal with
that separately.
Tested with build-many-glibcs.py (m68k-linux-gnu and
m68k-linux-gnu-coldfire). (There's a localplt test failure for
coldfire; that's the only failure in the compilation part of the
testsuite.)
* sysdeps/unix/sysv/linux/m68k/jmp_buf-macros.h: Move to ....
* sysdeps/unix/sysv/linux/m68k/m680x0/jmp_buf-macros.h: ... here.
* sysdeps/unix/sysv/linux/m68k/coldfire/jmp_buf-macros.h: New
file.
The uc_mcontext.__reserved member of ucontext_t is a user visible API,
that should not be changed, because this is the only way to access cpu
states of various extensions of linux asm/sigcontext.h, it does not
violate namespace rules either, so revert this part of the commit
commit 4fa9b3bfe6
Commit: Joseph Myers <joseph@codesourcery.com>
Fix mcontext_t sigcontext namespace (bug 21457).
(In principle the user can type cast &uc_mcontext to struct sigcontext*
to use the linux sigcontext fields, but that's not the existing practice
since mcontext_t used to be a typedef of struct sigcontext.)
[BZ #22742]
* sysdeps/unix/sysv/linux/aarch64/sys/ucontext.h (__glibc_reserved1):
Rename to __reserved and add comment.
* sysdeps/unix/sysv/linux/aarch64/ucontext_i.sym (__glibc_reserved1):
Rename to __reserved.
The tunables framework needs to execute syscall early in process
initialization, before the TCB is available for consumption. This
behavior conflicts with powerpc{|64|64le}'s lock elision code, that
checks the TCB before trying to abort transactions immediately before
executing a syscall.
This patch adds a powerpc-specific implementation of __access_noerrno
that does not abort transactions before the executing syscall.
Tested on powerpc{|64|64le}.
[BZ #22685]
* sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION_IMPL): Renamed
from ABORT_TRANSACTION.
(ABORT_TRANSACTION): Redirect to ABORT_TRANSACTION_IMPL.
* sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION,
ABORT_TRANSACTION_IMPL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/not-errno.h: New file. Reuse
Linux code, but remove the code that aborts transactions.
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
The only architecture in glibc that uses the generic debug/backtrace.c
is hppa. The debug/tst-backtrace* tests fail for hppa, so in fact the
generic debug/backtrace.c is not functional anywhere. Instead, the
x86_64 version is a reasonably generic version that uses
_Unwind_Backtrace from libgcc to backtrace using unwind info, and is
used by several architectures. This patch adds hppa to the
architectures using it (leaving open the possibility of a subsequent
cleanup for 2.28 of moving the x86_64 version to debug/backtrace.c,
and removing all the frame.h files that are now unused).
Reported by Adhemerval in
<https://sourceware.org/ml/libc-alpha/2018-01/msg00564.html> that this
does fix the backtrace test failures for hppa.
[BZ #22719]
* sysdeps/hppa/backtrace.c: New file.
_dl_runtime_profile calls _dl_call_pltexit, passing a pointer to
La_x86_64_retval which is allocated on stack. The lrv_vector0
field in La_x86_64_retval must be aligned to size of vector register.
When allocating stack space for La_x86_64_retval, we need to make sure
that the address of La_x86_64_retval + RV_VECTOR0_OFFSET is aligned to
VEC_SIZE. This patch checks the alignment of the lrv_vector0 field
and pads the stack space if needed.
Tested with x32 and x86-64 on SSE4, AVX and AVX512 machines. It fixed
FAIL: elf/tst-audit10
FAIL: elf/tst-audit4
FAIL: elf/tst-audit5
FAIL: elf/tst-audit6
FAIL: elf/tst-audit7
on x32 AVX512 machine.
[BZ #22715]
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_profile): Properly
align La_x86_64_retval to VEC_SIZE.
The x86_64 backtrace implementation is used as a generic
implementation (unwinding via unwind info and _Unwind_Backtrace) by
various other architectures. This patch makes it more generic by
making it use LIBGCC_S_SO from gnu/lib-names.h instead of hardcoding
the libgcc_s.so.1 name, so that it can also be used on hppa which uses
libgcc_s.so.4.
Tested for x86_64.
* sysdeps/x86_64/backtrace.c: Include <gnu/lib-names.h>.
(init): Use LIBGCC_S_SO not hardcoded "libgcc_s.so.1".
Define new HWCAP bits and add their name to dl-procinfo.c following
the linux definitions. Synchronizing with v4.15-rc8 version of linux,
these are not expected to change before the 4.15 release.
* sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h (HWCAP_SHA3): Define.
(HWCAP_SM3, HWCAP_SM4, HWCAP_ASIMDDP, HWCAP_SHA512, HWCAP_SVE): Define.
* sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c
(_dl_aarch64_cap_flags): Update.
(_DL_HWCAP_COUNT): Update.
Remove unused _DL_HWCAP_LAST definition and move _DL_HWCAP_COUNT
where it is needed (dl-procinfo.h always includes dl-procinfo.c).
* sysdeps/unix/sysv/linux/aarch64/dl-procinfo.h
(_DL_HWCAP_LAST): Remove.
(_DL_HWCAP_COUNT): Move to ...
* sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c
(_DL_HWCAP_COUNT): ... here.
This issue is similar to BZ #19235, where spurious exceptions are
created from adding 0.5 then converting to an integer.
The solution is based on Joseph's fix for BZ #19235.
[BZ #22697]
* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S (__llround):
Do not add 0.5 to integer or out-of-range arguments.
In the static pie enabled libc, crt1.o uses the same position independent
code as rcrt1.o and crt1.o is used instead of Scrt1.o when -no-pie
executables are linked. When main is not defined in the executable, but
in a shared library crt1.o is currently broken, it assumes main is local.
(glibc has a test for this but i missed it in my previous testing.)
To make both rcrt1.o and crt1.o happy with the same code, a wrapper is
introduced around main: with this crt1.o works with extern main symbol
while rcrt1.o does not depend on GOT relocations. (The change only
affects static pie enabled libc. Further simplification of start.S is
possible in the future by using the same approach for Scrt1.o too.)
* aarch64/start.S (_start): Use __wrap_main.
(__wrap_main): New local symbol.
Currently getcwd(3) can succeed without returning an absolute path
because the underlying getcwd syscall, starting with linux commit
v2.6.36-rc1~96^2~2, may succeed without returning an absolute path.
This is a conformance issue because "The getcwd() function shall
place an absolute pathname of the current working directory
in the array pointed to by buf, and return buf".
This is also a security issue because a non-absolute path returned
by getcwd(3) causes a buffer underflow in realpath(3).
Fix this by checking the path returned by getcwd syscall and falling
back to generic_getcwd if the path is not absolute, effectively making
getcwd(3) fail with ENOENT. The error code is chosen for consistency
with the case when the current directory is unlinked.
[BZ #22679]
CVE-2018-1000001
* sysdeps/unix/sysv/linux/getcwd.c (__getcwd): Fall back to
generic_getcwd if the path returned by getcwd syscall is not absolute.
* io/tst-getcwd-abspath.c: New test.
* io/Makefile (tests): Add tst-getcwd-abspath.
My fix for bug 22702 introduced linknamespace test failures on
s390x-linux-gnu and s390-linux-gnu because it made remainder call
__feholdexcept, and the s390 __feholdexcept calls fegetenv, and
remainder is in Unix98 and XPG4.2 but fegetenv isn't. This patch
makes __feholdexcept call __fegetenv instead to avoid that namespace
issue.
Tested (compilation) with build-many-glibcs.py for s390x-linux-gnu,
where it resolves the test failures.
* sysdeps/s390/fpu/feholdexcpt.c (__feholdexcept): Call __fegetenv
instead of fegetenv.
For soft-float powerpc, the math/test-nearbyint-except-2 test fails
because nearbyintl traps when traps on "inexact" are enabled on entry
(and an "inexact" exception is generated internally, though cleared
for the final return).
The problem is the default implementation of
libc_feholdsetround_noex_ctx, which does not disable exception traps.
There is some ambiguity about whether the *noex* interfaces are
required to do so or only permitted to do so. But given that we
support fe* interfaces to enable and disable traps (on architectures
with that functionality), functions that must not raise an exception
(must not leave the flag set on exit if not set on entry) should also
not trap on it when traps on that exception are enabled. So it is
appropriate to define these interfaces to have the feholdexcept effect
of disabling exception traps; this patch updates the default
implementation and comments accordingly.
At least some architecture versions already disable traps; there are
few uses of the *noex* interfaces at all, and while it's possible
there are bugs on any architecture versions failing to disable traps
that appear in the exp2 and remainder implementations, there are
currently no tests, other than this one for nearbyintl (where only the
ldbl-128ibm implementation uses SET_RESTORE_ROUND_NOEX), that would
fail as a result of such a bug. (Hard-float powerpc does disable
traps here, hence the nearbyintl failure not appearing there.)
Tested for powerpc (soft-float). This brings that configuration to
clean math/ test results, provided you build with GCC 8 to get the fix
for GCC bug 64811.
[BZ #22702]
* sysdeps/generic/math_private.h (libc_feresetround_noex): Update
comment to say exceptions are discarded.
(libc_feholdsetround_noex_ctx): Use __feholdexcept instead of
__fegetenv.
(SET_RESTORE_ROUND_NOEX): Update comment to say non-stop mode must
be enabled.
The ldbl-128ibm implementation of log1pl does ordered comparisons on a
negative qNaN argument, so resulting in spurious "invalid" exceptions
(for soft-float powerpc; hard-float only avoids this because of GCC
bug 58684 meaning ordered comparison instructions never get
generated). This patch fixes this by arranging for the test for NaN
or infinity arguments to handle negative arguments as well.
Tested for powerpc (soft float).
[BZ #22693]
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Handle
negative arguments in test for NaN or infinity argument.
Disabling lazy binding reduces stack usage during unwinding.
Note that RTLD_NOW only makes a difference if libgcc.so has not
already been loaded, so this is only a partial fix.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* hurd/hurd/fd.h: Include <fcntl.h>
(__hurd_at_flags): New function.
* hurd/lookup-at.c (__file_name_lookup_at): Replace flag computation
with call to __hurd_at_flags.
* include/unistd.h (__faccessat, __faccessat_noerrno): Add declaration.
* sysdeps/mach/hurd/access.c (access_common): Move implementation to
__faccessat
(hurd_fail_seterrno, hurd_fail_noerrno): Move to sysdeps/mach/hurd/faccessat.c.
(__access_noerrno): Use __faccessat_common instead of access_common.
(__access): Likewise.
* sysdeps/mach/hurd/euidaccess.c (__euidaccess): Replace implementation
with a call to __faccessat.
* sysdeps/mach/hurd/faccessat.c (faccessat): Rename into...
(__faccessat_common): ... this. Move implementation of __access into it when
AT_FLAGS does not contain AT_EACCESS. Make it call __hurd_at_flags, add
reauthenticate_cwdir_at helper to implement AT mechanism.
(__faccessat_noerrno): New function, just calls __faccessat_common.
(__faccessat): New function, just calls __faccessat_common.
(faccessat): Define weak alias.
For soft-float powerpc, fmaxmagl and fminmagl generate spurious
"invalid" exceptions for quiet NaN arguments. This is another case of
the problems with fabsl inline expansion via comparisons, and so is
fixed by building those functions with -fno-builtin-fabsl.
Tested for powerpc (soft-float).
[BZ #22691]
* sysdeps/powerpc/nofpu/Makefile [$(subdir) = math]
(CFLAGS-s_fmaxmagl.c): New variable.
[$(subdir) = math] (CFLAGS-s_fminmagl.c: Likewise.
The ldbl-128ibm implementations of lrintl and lroundl are missing
"invalid" exceptions for certain overflow cases when compiled with GCC
8. The cause of this is after-the-fact integer overflow checks that
fail when the compiler optimizes on the basis of integer overflow
being undefined; GCC 8 must be able to detect new cases of
undefinedness here.
Failure: lrint (-0x80000001p0): Exception "Invalid operation" not set
Failure: lrint_downward (-0x80000001p0): Exception "Invalid operation" not set
Failure: lrint_towardzero (-0x80000001p0): Exception "Invalid operation" not set
Failure: lrint_upward (-0x80000001p0): Exception "Invalid operation" not set
Failure: lround (-0x80000001p0): Exception "Invalid operation" not set
Failure: lround_downward (-0x80000001p0): Exception "Invalid operation" not set
Failure: lround_towardzero (-0x80000001p0): Exception "Invalid operation" not set
Failure: lround_upward (-0x80000001p0): Exception "Invalid operation" not set
(Tested that these failures occur before the patch for powerpc
soft-float, but the issue applies in principle for hard-float as well,
whether or not the particular optimizations in fact occur there at
present.)
This patch fixes the bug by ensuring the additions / subtractions in
question cast arguments to unsigned long int, or use 1UL as a constant
argument, so that the arithmetic occurs in an unsigned type with the
result then converted back to a signed type.
Tested for powerpc (soft-float).
[BZ #22690]
* sysdeps/ieee754/ldbl-128ibm/s_lrintl.c (__lrintl): Use unsigned
long int for arguments of possibly overflowing addition or
subtraction.
* sysdeps/ieee754/ldbl-128ibm/s_lroundl.c (__lroundl): Likewise.
For soft-float powerpc, the remainderl function produces zero results
with the wrong sign for various inputs. This is another instance of
the problem with incorrect built-in fabsl expansion, so is fixed by
this patch using -fno-builtin-fabsl for this function.
Tested for powerpc (soft-float).
[BZ #22688]
* sysdeps/powerpc/nofpu/Makefile [$(subdir) = math]
(CFLAGS-e_remainderl.c): New variable.
For soft-float powerpc, various _Complex long double functions
generate spurious "invalid" exceptions, even with a compiler with GCC
bug 64811 fixed.
The problem is GCC's built-in fabsl expansion. Various files are
already built with -fno-builtin-fabsl because in this case (IBM long
double, for soft-float or e500v1) a fallback fabsl expansion based on
comparisons is used, which can produce the wrong sign of a zero
result. Those comparisons can also produce spurious exceptions for
NaN arguments. Furthermore, __builtin_fpclassify implemently uses
__builtin_fabsl, and is unaffected by -fno-builtin-fabsl, and the
fpclassify macro uses __builtin_fpclassify in the absence of
-fsignaling-nans. Thus, this patch arranges for the problem files
using fpclassify to be built with -fsignaling-nans in this case, to
avoid spurious exceptions from fpclassify.
Tested for powerpc (soft-float).
[BZ #22687]
* sysdeps/powerpc/nofpu/Makefile (CFLAGS-s_cacosl.c): New
variable.
(CFLAGS-s_cacoshl.c): Likewise.
(CFLAGS-s_casinhl.c): Likewise.
(CFLAGS-s_catanl.c): Likewise.
(CFLAGS-s_catanhl.c): Likewise.
(CFLAGS-s_cexpl.c): Likewise.
(CFLAGS-s_ccoshl.c): Add -fsignaling-nans.
(CFLAGS-s_csinhl.c): Likewise.
(CFLAGS-s_clogl.c): Likewise.
(CFLAGS-s_clog10l.c): Likewise.
(CFLAGS-s_csinl.c): Likewise.
(CFLAGS-s_csqrtl.c): Likewise.
From: Emilio Pozuelo Monfort <pochu27@gmail.com>
From: Svante Signell <svante.signell@gmail.com>
Pass the file paths of executable to the exec server, both relative and
absolute, which exec needs to properly execute and avertise #!-scripts.
Previously, the exec server tried to guess the name from argv[0] but argv[0]
only contains the executable name by convention.
* hurd/hurdexec.c (_hurd_exec): Deprecate function.
(_hurd_exec_paths): New function.
* hurd/hurd.h (_hurd_exec): Deprecate function.
(_hurd_exec_paths): Declare function.
* hurd/Versions: Export _hurd_exec_paths.
* sysdeps/mach/hurd/execve.c: Include <stdlib.h> and <stdio.h>
(__execve): Use __getcwd to build absolute path, and use
_hurd_exec_paths instead of _hurd_exec.
* sysdeps/mach/hurd/spawni.c: Likewise.
* sysdeps/mach/hurd/fexecve.c: Use _hurd_exec_paths instead of
_hurd_exec.
Since the x86-64 assembly version of sincosf is higly optimized with
vector instructions, there isn't much room for improvement. However
s_sincosf.c written in C with vector math and intrinsics can be
optimized by GCC with FMA.
On Skylake, bench-sincosf reports performance improvement:
Assembly FMA improvement
max 104.042 101.008 3%
min 9.426 8.586 10%
mean 20.6209 18.2238 13%
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
Add s_sincosf-sse2 and s_sincosf-fma.
(CFLAGS-s_sincosf-fma.c): New.
* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
* sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
* sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
__sincosf is defined.
Commit 24731685 ("prlimit: Translate old_rlimit from RLIM64_INFINITY to
RLIM_INFINITY") broken the getrlimit64 for 32-bit configurations which
do no need the 2GiB limited compat getrlimit (default version >= 2.2).
This patch fixes that by restoring the weak alias in that case.
Changelog:
* sysdeps/unix/sysv/linux/getrlimit64 (getrlimit64)
[!__RLIM_T_MATCHES_RLIM64_T]
[!SHLIB_COMPAT (libc, GLIBC_2_1, GLIBC_2_2)]: Define as weak alias of
__getrlimit64. Add libc_hidden_weak.
This follows c45d78aac ('posix: Fix generic p{read,write}v buffer allocation
(BZ#22457)'), which made pwritev to use __mmap instead of __posix_memalign,
but didn't pass PROT_READ to it, while the pwrite() call does need to
read the data we have just copied over.
* sysdeps/posix/pwritev_common.c: Add PROT_READ to __mmap prot.
The RISC-V Linux port defines VDSO symbols
2018-01-06 Palmer Dabbelt <palmer@sifive.com>
* sysdeps/unix/sysv/linux/dl-vdso.h (VDSO_NAME_LINUX_4_15): New
define.
(VDSO_HASH_LINUX_4_15): Likewise.
This follows ccf970c7a ('posix: Add compat glob symbol to not follow
dangling symbols') by adding to gnu/ the same compatibility as for Linux.
* sysdeps/gnu/glob64.c (__glob): Define macro instead of glob macro.
(__glob64): Define GLIBC_2_27 versioned symbol instead of glob64.
* sysdeps/gnu/glob-lstat-compat.c: New file.
* sysdeps/gnu/glob64-lstat-compat.c: New file.
The function _itoa_word() writes characters from the higher address to
the lower address, requiring the destination string to reserve that size
before calling it.
* sysdeps/powerpc/powerpc64/dl-machine.c (_dl_reloc_overflow):
Reserve 16 chars to reloc_addr before calling _itoa_word.
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Add a test to check that the getrlimit, setrlimit and prlimit functions
and their 64-bit equivalent behave correctly with RLIM_INFINITY and
RLIM64_INFINITY. For that it assumes that the prlimit64 function calls
the syscall directly without translating the value and that the kernel
uses the -1 value to represent infinity.
It first finds a resource with the hard limit set to infinity so the
soft limit can be manipulated easily and check for the consistency
between the value set or get by the prlimit64 and the other functions.
It is Linux specific add it uses the prlimit and prlimit64 functions.
Changelog:
* sysdeps/unix/sysv/linux/tst-rlimit-infinity.c: New file.
* sysdeps/unix/sysv/linux/Makefile (tests): Add tst-rlimit-infinity.
prlimit called without a new value fails on 32-bit machines if any of
the soft or hard limits are infinity. This is because prlimit does not
translate old_rlimit from RLIM64_INFINITY to RLIM_INFINITY, but checks
that the value returned by the prlimit64 syscall fits into a 32-bit
value, like it is done for example in getrlimit. Note that on the
other hand new_rlimit is correctly translated from RLIM_INFINITY to
RLIM64_INFINITY before calling the syscall.
This patch fixes that.
Changelog:
[BZ #22678]
* sysdeps/unix/sysv/linux/prlimit.c (prlimit): Translate
old_rlimit from RLIM64_INFINITY to RLIM_INFINITY.
Fix the RLIM_INFINITY and RLIM64_INFINITY constants on alpha to match
the kernel one and all other architectures. Change the getrlimit,
getrlimit64, setrlimit, setrlimit64 into old compat symbols, and provide
the Linux generic functions as GLIBC_2_27 version.
Changelog:
* sysdeps/unix/sysv/linux/getrlimit64.c [USE_VERSIONED_RLIMIT]: Do not
define getrlimit and getrlimit64 as weak aliases of __getrlimit64.
Define __GI_getrlimit64 as weak alias of __getrlimit64.
[__RLIM_T_MATCHES_RLIM64_T]: Do not redefine SHLIB_COMPAT, use #elif
instead.
* sysdeps/unix/sysv/linux/setrlimit64.c [USE_VERSIONED_RLIMIT]: Do not
define setrlimit and setrlimit64 as weak aliases of __setrlimit64.
* sysdeps/unix/sysv/linux/alpha/bits/resource.h (RLIM_INFINITY,
RLIM64_INFINITY): Fix values to match the kernel ones.
* sysdeps/unix/sysv/linux/alpha/getrlimit64.c: Define
USE_VERSIONED_RLIMIT. Rename __getrlimit64 into __old_getrlimit64 and
provide it as getrlimit@@GLIBC_2_0 and getrlimit64@@GLIBC_2_1. Add a
__getrlimit64 function and provide it as getrlimit@@GLIBC_2_27 and
getrlimit64@@GLIBC_2_27.
* sysdeps/unix/sysv/linux/alpha/setrlimit64.c: Ditto with setrlimit
and setrlimit64.
* sysdeps/unix/sysv/linux/alpha/libc.abilist (GLIBC_2.27): Add
getrlimit, setrlimit, getrlimit64 and setrlimit64.
* sysdeps/unix/sysv/linux/alpha/Versions (libc): Add getrlimit,
setrlimit, getrlimit64 and setrlimit64.
RLIM64_INFINITY was supposed to be a glibc convention rather than
anything seen by the kernel, but it ended being passed to the kernel
through the prlimit64 syscall.
* On the kernel side, the value is defined for the prlimit64 syscall for
all architectures in include/uapi/linux/resource.h:
#define RLIM64_INFINITY (~0ULL)
* On the kernel side, the value is defined for getrlimit and setrlimit
in arch/alpha/include/uapi/asm/resource.h
#define RLIM_INFINITY 0x7ffffffffffffffful
* On the GNU libc side, the value is defined in
sysdeps/unix/sysv/linux/alpha/bits/resource.h:
# define RLIM64_INFINITY 0x7fffffffffffffffLL
This was not an issue until the getrlimit and setrlimit glibc functions
have been changed in commit 045c13d185 ("Consolidate Linux setrlimit and
getrlimit implementation") to use the prlimit64 syscall instead of the
getrlimit and setrlimit ones.
This patch fixes that by adding a wrapper to fix the value passed to or
received from the kernel, before or after calling the prlimit64 syscall.
Changelog:
[BZ #22648]
* sysdeps/unix/sysv/linux/alpha/getrlimit64.c: New file.
* sysdeps/unix/sysv/linux/alpha/setrlimit64.c: Ditto.
As discussed in libc-alpha [1], alpha trunc{f} implementation uses
addt/suc and subt/suc and although the Alpha Architecture
Handbook version 3 states that that ADDx SUBx OUTPUT Exceptions
(B.3 Mapping to IEEE Standard) should not generate Inexact if INE
bit is set, the Alpha 21264 [2] chip manual (A.8 IEEE Floating-Point
Conformance) states that ADDx SUBx OUTPUT does generate inexact
exception for inexact result regardless.
As Joseph noted [3] to correctly fix it on alpha we need to either
avoid the instruction or avoid any inexact bit from it being set
on return from the function (while preserving the inexact bit that
might be set on the entry to the function). The later will result
mf_fpcr followed by a mt_fpcr to get and set the fpcr which will
defeat the optimization itself.
So the patch just remove the alpha optimized and rely on generic
implementation. It fixes the math/test-*-{trunc} on alpha.
[BZ #15479]
[BZ #22666]
* sysdeps/alpha/fpu/s_trunc.c: Remove file.
* sysdeps/alpha/fpu/s_truncf.c: Likewise.
[1] https://sourceware.org/ml/libc-alpha/2018-01/msg00114.html
[2] https://www.star.bnl.gov/public/daq/HARDWARE/21264_data_sheet.pdf
[3] https://sourceware.org/ml/libc-alpha/2018-01/msg00086.html
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
As discussed in libc-alpha [1], alpha ceil{f} and floor{f}
implementation uses cvttq/svm and although the Alpha Architecture
Handbook version 3 states that that CVTfi OUTPUT Exceptions
(B.3 Mapping to IEEE Standard) should not generate Inexact if INE
bit is set on fpcr, the Alpha 21264 [1] chip manual (A.8 IEEE
Floating-Point Conformance) states that CVTfi and CVTif OUTPUT
does generate inexact exception for inexact result regardless.
As Joseph noted [2] to correctly fix it on alpha we need to either
avoid the instruction or avoid any inexact bit from it being set
on return from the function (while preserving the inexact bit that
might be set on the entry to the function). The later will result
mf_fpcr followed by a mt_fpcr to get and set the fpcr which will
defeat the optimization itself.
So the patch just remove the alpha optimized and rely on generic
implementation. It fixes the math/test-*-{ceil,floor} on alpha.
[BZ #15479]
[BZ #22665]
* sysdeps/alpha/fpu/s_ceil.c: Remove file.
* sysdeps/alpha/fpu/s_ceilf.c: Likewise.
* sysdeps/alpha/fpu/s_floor.c: Likewise.
* sysdeps/alpha/fpu/s_floorf.c: Likewise.
[1] https://www.star.bnl.gov/public/daq/HARDWARE/21264_data_sheet.pdf
[2] https://sourceware.org/ml/libc-alpha/2018-01/msg00086.html
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Before this change, if glibc was compiled with SSE instructions and a
sufficiently recent GCC, an unaligned stack access in
__run_exit_handlers would cause stdlib/tst-makecontext to crash.
Changelog:
* sysdeps/unix/sysv/linux/alpha/getrlimit64.c (__old_getrlimit64):
Drop __RLIM_T_MATCHES_RLIM64_T conditional as __old_getrlimit64 is
never defined in that case.
Changelog:
* sysdeps/unix/sysv/linux/alpha/getrlimit64.c: Fix a typo in the
comment.
* sysdeps/unix/sysv/linux/alpha/setrlimit64.c: Fix a typo in the
comment.
(settrlimit): Rename into setrlimit.
(__sttrlimit): Rename into __setrlimit.
Various fmax and fmin function implementations mishandle sNaN
arguments:
(a) When both arguments are NaNs, the return value should be a qNaN,
but sometimes it is an sNaN if at least one argument is an sNaN.
(b) Under TS 18661-1 semantics, if either argument is an sNaN then the
result should be a qNaN (whereas if one argument is a qNaN and the
other is not a NaN, the result should be the non-NaN argument).
Various implementations treat sNaNs like qNaNs here.
One way to fix that is to detect the sNaN and add a special case. That
said there is no FPU instruction to do that, so it requires transfering
the FP value to an integer register and testing bits. This becomes quite
complicated so it's probably better to just use the generic versions of
these functions which just do that through issignaling.
Changelog:
[BZ #22660]
* sysdeps/alpha/fpu/s_fmax.S: Remove file.
* sysdeps/alpha/fpu/s_fmaxf.S: Likewise.
* sysdeps/alpha/fpu/s_fmin.S: Likewise.
* sysdeps/alpha/fpu/s_fminf.S: Likewise.
Move a shared part of sys/ptrace.h which is the same on all
architectures to a separate file.
* sysdeps/unix/sysv/linux/sys/ptrace.h: Include <bits/ptrace-shared.h>.
(__ptrace_setoptions, __ptrace_eventcodes, __ptrace_peeksiginfo_args,
__ptrace_peeksiginfo_flags, ptrace): Move to ...
* sysdeps/unix/sysv/linux/bits/ptrace-shared.h: ... new file.
* sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add
bits/ptrace-shared.h.
* sysdeps/unix/sysv/linux/aarch64/sys/ptrace.h: Include
<bits/ptrace-shared.h>.
(__ptrace_setoptions, __ptrace_eventcodes, __ptrace_peeksiginfo_args,
__ptrace_peeksiginfo_flags, ptrace): Remove.
* sysdeps/unix/sysv/linux/ia64/sys/ptrace.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/ptrace.h: Likewise.
* sysdeps/unix/sysv/linux/s390/sys/ptrace.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/sys/ptrace.h: Likewise.
* sysdeps/unix/sysv/linux/tile/sys/ptrace.h: Likewise.
is_path argument is no longer used and could be safely removed.
* elf/dl-dst.h (DL_DST_COUNT): Remove is_path argument, all callers
updated.
* elf/dl-load.c (is_dst, _dl_dst_count, _dl_dst_substitute,
expand_dynamic_string_token): Likewise.
* sysdeps/generic/ldsodefs.h (_dl_dst_count, _dl_dst_substitute): Remove
is_path argument.
libio.h was originally the header for a set of supported GNU
extensions, but they have not been maintained as such in many years,
they are now standing in the way of improvements to stdio, and we
don't think there are any remaining external users. _G_config.h was
never intended for public use, but predates the bits convention.
Move both of these headers into the bits directory and provide stubs
at top level which issue deprecation warnings.
The contents of (bits/)libio.h and (bits/)_G_config.h are still
exposed to external software via stdio.h; changing that requires more
complex surgery than I have time to attempt right now.
* libio/libio.h, libio/_G_config.h: New stub headers which issue a
deprecation warning and then include <bits/libio.h>, <bits/_G_config.h>
respectively.
* libio/libio.h: Rename the original version of this file to
libio/bits/libio.h. Error out if not included by stdio.h or the
stub libio.h.
* include/libio.h: Move to include/bits. Forward to libio/bits/libio.h.
* sysdeps/generic/_G_config.h: Move to top-level bits/. Error out
if not included by bits/libio.h or the stub _G_config.h.
* sysdeps/unix/sysv/linux/_G_config.h: Move to
sysdeps/unix/sysv/linux/bits. Error out if not included by
bits/libio.h or the stub _G_config.h.
* libio/stdio.h: Include bits/libio.h, not libio.h.
* libio/Makefile: Install bits/libio.h and bits/_G_config.h as
well as libio.h and _G_config.h.
* csu/init.c, libio/fmemopen.c, libio/iolibio.h, libio/oldfmemopen.c
* libio/strfile.h, stdio-common/vfscanf.c
* sysdeps/pthread/flockfile.c, sysdeps/pthread/funlockfile.c
Include stdio.h, not _G_config.h nor libio.h.
* libio/iofgetpos.c: Also rename fgetpos64 out of the way.
* libio/iofsetpos.c: Also rename fsetpos64 out of the way.
* scripts/check-installed-headers.sh: Skip libio.h and _G_config.h.
With tilepro removal, the uppercase instruction are not anymore
required to be defines as potentially macros. This is a
mechanical change done by the following shell script:
---
INSNS="LD LD4U ST ST4 BNEZ BEQZ BEQZT BGTZ CMPEQI CMPEQ CMOVEQZ CMOVNEZ"
FILES=$(find sysdeps/tile sysdeps/unix/sysv/linux/tile -iname *.S)
for insn in $INSNS; do
repl=$(echo $insn | tr '[:upper:]' '[:lower:]')
sed -i 's/\b'$insn'\b/'$repl'/g' $FILES
done
---
Checked with a build for tilegx-linux-gnu and tilegx-linux-gnu-32 with
and without the patch, there is no difference in generated binary with
a dissassemble.
* sysdeps/tile/__longjmp.S (__longjmp): Use lowercase instructions.
* sysdeps/tile/__tls_get_addr.S (__tls_get_addr): Likewise.
* sysdeps/tile/_mcount.S (__mcount): Likewise.
* sysdeps/tile/crti.S (_init, _fini): Likewise.
* sysdeps/tile/crtn.S: Likewise.
* sysdeps/tile/dl-start.S (_start): Likewise.
* sysdeps/tile/dl-trampoline.S: Likewise.
* sysdeps/tile/setjmp.S (__sigsetjmp): Likewise.
* sysdeps/tile/start.S (_start): Likewise.
* sysdeps/unix/sysv/linux/tile/clone.S (_clone): Likewise.
* sysdeps/unix/sysv/linux/tile/getcontext.S (__getcontext): Likewise.
* sysdeps/unix/sysv/linux/tile/ioctl.S (__ioctl): Likewise.
* sysdeps/unix/sysv/linux/tile/setcontext.S (__setcontext): Likewise.
* sysdeps/unix/sysv/linux/tile/swapcontext.S (__swapcontext): Likewise.
* sysdeps/unix/sysv/linux/tile/syscall.S (syscall): Likewise.
* sysdeps/unix/sysv/linux/tile/vfork.S (__vfork): Likewise.
With tilepro support removal we can now simplify internal tile support by
moving the directory structure to avoid the unnecessary directory levels
in tile/tilegx both on generic and linux folders.
Checked with a build for tilegx-linux-gnu and tilegx-linux-gnu-32 with
and without the patch, there is no difference in generated binary with
a dissassemble.
* stdlib/bug-getcontext.c (do_test): Remove tilepro mention in
comment.
* sysdeps/tile/preconfigure: Remove tilegx folder.
* sysdeps/tile/tilegx/Implies: Move definitions to ...
* sysdeps/tile/Implies: ... here.
* sysdeps/tile/tilegx/Makefile: Move rules to ...
* sysdeps/tile/Makefile: ... here.
* sysdeps/tile/tilegx/atomic-machine.h: Move definitions to ...
* sysdeps/tile/atomic-machine.h: ... here. Add include guards.
* sysdeps/tile/tilegx/bits/wordsize.h: Move to ...
* sysdeps/tile/bits/wordsize.h: ... here.
* sysdeps/tile/tilegx/*: Move to ...
* sysdeps/tile/*: ... here.
* sysdeps/tile/tilegx/tilegx32/Implies: Move to ...
* sysdeps/tile/tilegx32/Implies: ... here.
* sysdeps/tile/tilegx/tilegx64/Implies: Move to ...
* sysdeps/tile/tilegx64/Implies: ... here.
* sysdeps/unix/sysv/linux/tile/tilegx/Makefile: Move definitions
to ...
* sysdeps/unix/sysv/linux/tile/Makefile: ... here.
* sysdeps/unix/sysv/linux/tile/tilegx/*: Move to ...
* sysdeps/unix/sysv/linux/tile/*: ... here.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/*: Move to ...
* sysdeps/unix/sysv/linux/tile/tilegx32/*: ... here.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/*: Move to ...
* sysdeps/unix/sysv/linux/tile/tilegx64/*: ... here.
This patch consolidates the pthread_join and gnu extensions to avoid
code duplication. The function pthread_join, pthread_tryjoin_np, and
pthread_timedjoin_np are now based on pthread_timedjoin_ex.
It also fixes some inconsistencies on ESRCH, EINVAL, EDEADLK handling
(where each implementation differs from each other) and also on
clenup handler (which now always use a CAS).
Checked on i686-linux-gnu and x86_64-linux-gnu.
* nptl/pthreadP.h (__pthread_timedjoin_np): Define.
* nptl/pthread_join.c (pthread_join): Use __pthread_timedjoin_np.
* nptl/pthread_tryjoin.c (pthread_tryjoin): Likewise.
* nptl/pthread_timedjoin.c (cleanup): Use CAS on argument setting.
(pthread_timedjoin_np): Define internal symbol and common code from
pthread_join.
* sysdeps/unix/sysv/linux/i386/lowlevellock.h (__lll_timedwait_tid):
Remove superflous checks.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_timedwait_tid):
Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
All binaries use TLS and thus need a properly set up TCB, so we can
simply return its address directly, instead of forwarding to the
libpthread implementation from libc.
For versioned symbols, the dynamic linker checks that the soname matches
the name supplied by the link editor, so a compatibility symbol in
libpthread is needed.
To avoid linking against the libpthread function in all cases, we would
have to bump the symbol version of libpthread in libc.so and supply a
compat symbol. This commit does not do that because the function
implementation is so small, so the overhead by two active copies of the
same function might well be smaller than the increase in symbol table
size.
In C++ mode, __MATH_TG cannot be used for defining iseqsig, because
__MATH_TG relies on __builtin_types_compatible_p, which is a C-only
builtin. This is true when float128 is provided as an ABI-distinct type
from long double.
Moreover, the comparison macros from ISO C take two floating-point
arguments, which need not have the same type. Choosing what underlying
function to call requires evaluating the formats of the arguments, then
selecting which is wider. The macro __MATH_EVAL_FMT2 provides this
information, however, only the type of the macro expansion is relevant
(actually evaluating the expression would be incorrect).
This patch provides a C++ version of iseqsig, in which only the type of
__MATH_EVAL_FMT2 (__typeof or decltype) is used as a template parameter
for __iseqsig_type. This function calls the appropriate underlying
function.
Tested for powerpc64le and x86_64.
[BZ #22377]
* math/Makefile [C++] (tests): Add test for iseqsig.
* math/math.h [C++] (iseqsig): New implementation, which does
not rely on __MATH_TG/__builtin_types_compatible_p.
* math/test-math-iseqsig.cc: New file.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iseqsig.cc): New variable.
As noted in bug 21309, dbl-64/e_pow.c contains signed int shifts that,
although the shift count is in the range [0, 31], shift bits into and
beyond the sign bit and so are undefined in ISO C. Although this is
defined in GNU C, this patch from the bug cleans up the code to avoid
those shifts.
Tested for x86_64.
[BZ #21309]
* sysdeps/ieee754/dbl-64/e_pow.c (checkint): Make m and n
unsigned.
These changes will be active for all platforms that don't provide
their own exp() routines. They will also be active for ieee754
versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and
erf.
Typical performance gains is typically around 5x when measured on
Sparc s7 for common values between exp(1) and exp(40).
Using the glibc perf tests on sparc,
sparc (nsec) x86 (nsec)
old new old new
max 17629 395 5173 144
min 399 54 15 13
mean 5317 200 1349 23
The extreme max times for the old (ieee754) exp are due to the
multiprecision computation in the old algorithm when the true value is
very near 0.5 ulp away from an value representable in double
precision. The new algorithm does not take special measures for those
cases. The current glibc exp perf tests overrepresent those values.
Informal testing suggests approximately one in 200 cases might
invoke the high cost computation. The performance advantage of the new
algorithm for other values is still large but not as large as indicated
by the chart above.
Glibc correctness tests for exp() and expf() were run. Within the
test suite 3 input values were found to cause 1 bit differences (ulp)
when "FE_TONEAREST" rounding mode is set. No differences in exp() were
seen for the tested values for the other rounding modes.
Typical example:
exp(-0x1.760cd2p+0) (-1.46113312244415283203125)
new code: 2.31973271630014299393707e-01 0x1.db14cd799387ap-3
old code: 2.31973271630014271638132e-01 0x1.db14cd7993879p-3
exp = 2.31973271630014285508337 (high precision)
Old delta: off by 0.49 ulp
New delta: off by 0.51 ulp
In addition, because ieee754_exp() is used by other routines, cexp()
showed test results with very small imaginary input values where the
imaginary portion of the result was off by 3 ulp when in upward
rounding mode, but not in the other rounding modes. For x86, tgamma
showed a few values where the ulp increased to 6 (max ulp for tgamma
is 5). Sparc tgamma did not show these failures. I presume the tgamma
differences are due to compiler optimization differences within the
gamma function.The gamma function is known to be difficult to compute
accurately.
* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and
<errno.h>. Include "eexp.tbl".
(half): New constant.
(one): Likewise.
(__ieee754_exp): Rewrite.
(__slowexp): Remove prototype.
* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
* sysdeps/i386/fpu/slowexp.c: Likewise.
* sysdeps/ia64/fpu/slowexp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in
comment.
* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math]
(CPPFLAGS-slowexp.c): Remove variable.
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
(CFLAGS-slowexp-fma.c): Remove variable.
(CFLAGS-slowexp-fma4.c): Likewise.
(CFLAGS-slowexp-avx.c): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not
define as macro.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
* math/Makefile (type-double-routines): Remove slowexp.
* manual/probes.texi (slowexp_p6): Remove.
(slowexp_p32): Likewise.
Current optimized ia64 memchr uses a strategy to check for last address
by adding the input one with expected size. However it does not take
care for possible overflow.
It was triggered by 3038145ca2 where default rawmemchr now uses memchr
(p, c, (size_t)-1).
This patch fixes it by implement a satured addition where overflows
sets the maximum pointer size to UINTPTR_MAX.
Checked on ia64-linux-gnu where it fixes both stratcliff and
test-rawmemchr failures.
Adhemerval Zanella <adhemerval.zanella@linaro.org>
James Clarke <jrtc27@jrtc27.com>
[BZ #22603]
* sysdeps/ia64/memchr.S (__memchr): Avoid overflow in pointer
addition.
Since 3f823e87cc (Call exit directly in clone (BZ #21512)) SH clone
implementation fails to set the exit code resulting in the failures:
FAIL: nptl/tst-align-clone
FAIL: nptl/tst-getpid1
This patch fixes the both testcases.
[BZ #22605]
* sysdeps/unix/sysv/linux/sh/clone.S (__clone): Fix exit return
code.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On x86, padding in struct __jmp_buf_tag is used for shadow stack pointer
to support Shadow Stack in Intel Control-flow Enforcemen Technology.
cancel_jmp_buf has been updated to include saved_mask so that it is as
large as struct __jmp_buf_tag. We must suport the old cancel_jmp_buf
in existing binaries. Since symbol versioning doesn't work on
cancel_jmp_buf, feature_1 is added to tcbhead_t so that setjmp and
longjmp can check if shadow stack is enabled. NB: Shadow stack is
enabled only if all modules are shadow stack enabled.
[BZ #22563]
* sysdeps/i386/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New.
* sysdeps/i386/nptl/tls.h (tcbhead_t): Add feature_1.
* sysdeps/x86_64/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New.
* sysdeps/x86_64/nptl/tls.h (tcbhead_t): Rename __glibc_unused1
to feature_1.
On x86, padding in struct __jmp_buf_tag is used for shadow stack pointer
to support shadow stack in Intel Control-flow Enforcemen Technology.
Since the cancel_jmp_buf array is passed to setjmp and longjmp by
casting it to pointer to struct __jmp_buf_tag, it should be as large
as struct __jmp_buf_tag. Otherwise when shadow stack is enabled,
setjmp and longjmp will write and read beyond cancel_jmp_buf when saving
and restoring shadow stack pointer.
This patch adds bits/types/__cancel_jmp_buf_tag.h to define struct
__cancel_jmp_buf_tag so that Linux/x86 can add saved_mask to
cancel_jmp_buf.
Tested natively on i386, x86_64 and x32. Tested hppa-linux-gnu with
build-many-glibcs.py.
[BZ #22563]
* bits/types/__cancel_jmp_buf_tag.h: New file.
* sysdeps/unix/sysv/linux/x86/bits/types/__cancel_jmp_buf_tag.h
* sysdeps/unix/sysv/linux/x86/pthreaddef.h: Likewise.
* sysdeps/unix/sysv/linux/x86/nptl/pthreadP.h: Likewise.
* nptl/Makefile (headers): Add
bits/types/__cancel_jmp_buf_tag.h.
* nptl/descr.h [NEED_SAVED_MASK_IN_CANCEL_JMP_BUF]
(pthread_unwind_buf): Add saved_mask to cancel_jmp_buf.
* sysdeps/nptl/pthread.h: Include
<bits/types/__cancel_jmp_buf_tag.h>.
(__pthread_unwind_buf_t): Use struct __cancel_jmp_buf_tag with
__cancel_jmp_buf.
* sysdeps/unix/sysv/linux/hppa/pthread.h: Likewise.
m68k bits/mathinline.h declares various functions with const
attributes. These are inappropriate for functions that have results
depending on the rounding mode; the machine-independent
bits/mathcalls.h only uses const attributes for a very few functions
with no rounding mode dependence, and the m68k header should do
likewise. GCC uses pure for such functions with -frounding-math,
resulting in GCC mainline warning for conflicts with between the
header and the built-in attributes and glibc failing to build for m68k
with GCC mainline.
This patch fixes the attributes to avoid using const except when
bits/mathcalls.h does so. (There are a few functions where maybe
bits/mathcalls.h could do so but doesn't, but keeping the headers in
sync in this regard seems to be the safe approach.)
Tested compilation with build-many-glibcs.py with GCC mainline.
[BZ #22631]
* sysdeps/m68k/m680x0/fpu/bits/mathinline.h (__m81_defun): Add
argument for attrubutes. All callers changed.
(__inline_mathop1): Likewise. All callers changed.
(__inline_mathop): Likewise. All callers changed.
[__USE_MISC] (scalbn): Use __inline_forward instead of
__inline_forward_c.
[__USE_ISOC99] (scalbln): Likewise.
[__USE_ISOC99] (nearbyint): Likewise.
[__USE_ISOC99] (lrint): Likewise.
[__USE_MISC] (scalbnf): Likewise.
[__USE_ISOC99] (scalblnf): Likewise.
[__USE_ISOC99] (nearbyintf): Likewise.
[__USE_ISOC99] (lrintf): Likewise.
[__USE_MISC] (scalbnl): Likewise.
[__USE_ISOC99] (scalblnl): Likewise.
[__USE_ISOC99] (nearbyintl): Likewise.
[__USE_ISOC99] (lrintl): Likewise.
* sysdeps/m68k/m680x0/fpu/mathimpl.h: All callers of
__inline_mathop and __m81_defun changed.
GLRO (_rtld_global_ro) is read-only after initialization and can
therefore not be patched at run time, unlike the hook table addresses
and their contents, so this is a desirable hardening feature.
The hooks are only needed if ld.so has not been initialized, and this
happens only after static dlopen (dlmopen uses a single ld.so object
across all namespaces).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Similar to commit 1ab47db00dfbc0128119e3503d3ed640ffc4830b
("mips64: fix clobbering s0 in setjmp() [BZ #22624]")
as sysdeps/mips/setjmp_aux.c is almost an identical copy
of sysdeps/mips/mips64/setjmp_aux.c.
[BZ #22624]
* sysdeps/mips/setjmp_aux.c (__sigsetjmp_aux): Use
inhibit_stack_protector.
When configured as --enable-stack-protector=all glibc
inserts stack checking canary into every function
including __sigsetjmp_aux(). Stack checking code
ends up using s0 register to temporary hold address
of global canary value.
Unfortunately __sigsetjmp_aux assumes no caller' caller-save
registers should be clobbered as it stores them as-is.
The fix is to disable stack protection of __sigsetjmp_aux.
Tested on the following test:
#include <setjmp.h>
#include <stdio.h>
int main() {
jmp_buf jb;
volatile register long s0 asm ("$s0");
s0 = 1234;
if (setjmp(jb) == 0)
longjmp(jb, 1);
printf ("$s0 = %lu\n", s0);
}
Without the fix:
$ qemu-mipsn32 -L . ./mips-longjmp-bug
$s0 = 1082346228
With the fix:
$ qemu-mipsn32 -L . ./mips-longjmp-bug
$s0 = 1234
[BZ #22624]
* sysdeps/mips/mips64/setjmp_aux.c (__sigsetjmp_aux): Use
inhibit_stack_protector.
There are three flavors of the crt startup code:
1) crt1.o used for non-pie,
2) Scrt1.o used for dynamic linked pie (dynamic linker relocates),
3) rcrt1.o used for static linked pie (self relocation is needed)
In the --enable-static-pie case crt1.o is built with -DPIC and in case
of static linking it interposes _dl_relocate_static_pie in libc to
avoid self relocation.
Scrt1.o is built with -DPIC -DSHARED and it relies on GOT entries that
the static linker cannot relax and thus need relocation before the
start code is executed, so rcrt1.o needs separate implementation.
This implementation does not work for .text > 4G position independent
executables, which is fine since the toolchain does not support
-mcmodel=large with -fPIE.
Tests pass with ld/22269 and ld/22263 binutils bugs fixed.
* sysdeps/aarch64/start.S (_start): Handle PIC && !SHARED case.
Static PIE extends address space layout randomization to static
executables. It provides additional security hardening benefits at
the cost of some memory and performance.
Dynamic linker, ld.so, is a standalone program which can be loaded at
any address. This patch adds a configure option, --enable-static-pie,
to embed the part of ld.so in static executable to create static position
independent executable (static PIE). A static PIE is similar to static
executable, but can be loaded at any address without help from a dynamic
linker. When --enable-static-pie is used to configure glibc, libc.a is
built as PIE and all static executables, including tests, are built as
static PIE. The resulting libc.a can be used together with GCC 8 or
above to build static PIE with the compiler option, -static-pie. But
GCC 8 isn't required to build glibc with --enable-static-pie. Only GCC
with PIE support is needed. When an older GCC is used to build glibc
with --enable-static-pie, proper input files are passed to linker to
create static executables as static PIE, together with "-z text" to
prevent dynamic relocations in read-only segments, which are not allowed
in static PIE.
The following changes are made for static PIE:
1. Add a new function, _dl_relocate_static_pie, to:
a. Get the run-time load address.
b. Read the dynamic section.
c. Perform dynamic relocations.
Dynamic linker also performs these steps. But static PIE doesn't load
any shared objects.
2. Call _dl_relocate_static_pie at entrance of LIBC_START_MAIN in
libc.a. crt1.o, which is used to create dynamic and non-PIE static
executables, is updated to include a dummy _dl_relocate_static_pie.
rcrt1.o is added to create static PIE, which will link in the real
_dl_relocate_static_pie. grcrt1.o is also added to create static PIE
with -pg. GCC 8 has been updated to support rcrt1.o and grcrt1.o for
static PIE.
Static PIE can work on all architectures which support PIE, provided:
1. Target must support accessing of local functions without dynamic
relocations, which is needed in start.S to call __libc_start_main with
function addresses of __libc_csu_init, __libc_csu_fini and main. All
functions in static PIE are local functions. If PIE start.S can't reach
main () defined in a shared object, the code sequence:
pass address of local_main to __libc_start_main
...
local_main:
tail call to main via PLT
can be used.
2. start.S is updated to check PIC instead SHARED for PIC code path and
avoid dynamic relocation, when PIC is defined and SHARED isn't defined,
to support static PIE.
3. All assembly codes are updated check PIC instead SHARED for PIC code
path to avoid dynamic relocations in read-only sections.
4. All assembly codes are updated check SHARED instead PIC for static
symbol name.
5. elf_machine_load_address in dl-machine.h are updated to support static
PIE.
6. __brk works without TLS nor dynamic relocations in read-only section
so that it can be used by __libc_setup_tls to initializes TLS in static
PIE.
NB: When glibc is built with GCC defaulted to PIE, libc.a is compiled
with -fPIE, regardless if --enable-static-pie is used to configure glibc.
When glibc is configured with --enable-static-pie, libc.a is compiled
with -fPIE, regardless whether GCC defaults to PIE or not. The same
libc.a can be used to build both static executable and static PIE.
There is no need for separate PIE copy of libc.a.
On x86-64, the normal static sln:
text data bss dec hex filename
625425 8284 5456 639165 9c0bd elf/sln
the static PIE sln:
text data bss dec hex filename
657626 20636 5392 683654 a6e86 elf/sln
The code size is increased by 5% and the binary size is increased by 7%.
Linker requirements to build glibc with --enable-static-pie:
1. Linker supports --no-dynamic-linker to remove PT_INTERP segment from
static PIE.
2. Linker can create working static PIE. The x86-64 linker needs the
fix for
https://sourceware.org/bugzilla/show_bug.cgi?id=21782
The i386 linker needs to be able to convert "movl main@GOT(%ebx), %eax"
to "leal main@GOTOFF(%ebx), %eax" if main is defined locally.
Binutils 2.29 or above are OK for i686 and x86-64. But linker status for
other targets need to be verified.
3. Linker should resolve undefined weak symbols to 0 in static PIE:
https://sourceware.org/bugzilla/show_bug.cgi?id=22269
4. Many ELF backend linkers incorrectly check bfd_link_pic for TLS
relocations, which should check bfd_link_executable instead:
https://sourceware.org/bugzilla/show_bug.cgi?id=22263
Tested on aarch64, i686 and x86-64.
Using GCC 7 and binutils master branch, build-many-glibcs.py with
--enable-static-pie with all patches for static PIE applied have the
following build successes:
PASS: glibcs-aarch64_be-linux-gnu build
PASS: glibcs-aarch64-linux-gnu build
PASS: glibcs-armeb-linux-gnueabi-be8 build
PASS: glibcs-armeb-linux-gnueabi build
PASS: glibcs-armeb-linux-gnueabihf-be8 build
PASS: glibcs-armeb-linux-gnueabihf build
PASS: glibcs-arm-linux-gnueabi build
PASS: glibcs-arm-linux-gnueabihf build
PASS: glibcs-arm-linux-gnueabihf-v7a build
PASS: glibcs-arm-linux-gnueabihf-v7a-disable-multi-arch build
PASS: glibcs-m68k-linux-gnu build
PASS: glibcs-microblazeel-linux-gnu build
PASS: glibcs-microblaze-linux-gnu build
PASS: glibcs-mips64el-linux-gnu-n32 build
PASS: glibcs-mips64el-linux-gnu-n32-nan2008 build
PASS: glibcs-mips64el-linux-gnu-n32-nan2008-soft build
PASS: glibcs-mips64el-linux-gnu-n32-soft build
PASS: glibcs-mips64el-linux-gnu-n64 build
PASS: glibcs-mips64el-linux-gnu-n64-nan2008 build
PASS: glibcs-mips64el-linux-gnu-n64-nan2008-soft build
PASS: glibcs-mips64el-linux-gnu-n64-soft build
PASS: glibcs-mips64-linux-gnu-n32 build
PASS: glibcs-mips64-linux-gnu-n32-nan2008 build
PASS: glibcs-mips64-linux-gnu-n32-nan2008-soft build
PASS: glibcs-mips64-linux-gnu-n32-soft build
PASS: glibcs-mips64-linux-gnu-n64 build
PASS: glibcs-mips64-linux-gnu-n64-nan2008 build
PASS: glibcs-mips64-linux-gnu-n64-nan2008-soft build
PASS: glibcs-mips64-linux-gnu-n64-soft build
PASS: glibcs-mipsel-linux-gnu build
PASS: glibcs-mipsel-linux-gnu-nan2008 build
PASS: glibcs-mipsel-linux-gnu-nan2008-soft build
PASS: glibcs-mipsel-linux-gnu-soft build
PASS: glibcs-mips-linux-gnu build
PASS: glibcs-mips-linux-gnu-nan2008 build
PASS: glibcs-mips-linux-gnu-nan2008-soft build
PASS: glibcs-mips-linux-gnu-soft build
PASS: glibcs-nios2-linux-gnu build
PASS: glibcs-powerpc64le-linux-gnu build
PASS: glibcs-powerpc64-linux-gnu build
PASS: glibcs-tilegxbe-linux-gnu-32 build
PASS: glibcs-tilegxbe-linux-gnu build
PASS: glibcs-tilegx-linux-gnu-32 build
PASS: glibcs-tilegx-linux-gnu build
PASS: glibcs-tilepro-linux-gnu build
and the following build failures:
FAIL: glibcs-alpha-linux-gnu build
elf/sln is failed to link due to:
assertion fail bfd/elf64-alpha.c:4125
This is caused by linker bug and/or non-PIC code in PIE libc.a.
FAIL: glibcs-hppa-linux-gnu build
elf/sln is failed to link due to:
collect2: fatal error: ld terminated with signal 11 [Segmentation fault]
https://sourceware.org/bugzilla/show_bug.cgi?id=22537
FAIL: glibcs-ia64-linux-gnu build
elf/sln is failed to link due to:
collect2: fatal error: ld terminated with signal 11 [Segmentation fault]
FAIL: glibcs-powerpc-linux-gnu build
FAIL: glibcs-powerpc-linux-gnu-soft build
FAIL: glibcs-powerpc-linux-gnuspe build
FAIL: glibcs-powerpc-linux-gnuspe-e500v1 build
elf/sln is failed to link due to:
ld: read-only segment has dynamic relocations.
This is caused by linker bug and/or non-PIC code in PIE libc.a. See:
https://sourceware.org/bugzilla/show_bug.cgi?id=22264
FAIL: glibcs-powerpc-linux-gnu-power4 build
elf/sln is failed to link due to:
findlocale.c:96:(.text+0x22c): @local call to ifunc memchr
This is caused by linker bug and/or non-PIC code in PIE libc.a.
FAIL: glibcs-s390-linux-gnu build
elf/sln is failed to link due to:
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped
assertion fail bfd/elflink.c:14299
This is caused by linker bug and/or non-PIC code in PIE libc.a.
FAIL: glibcs-sh3eb-linux-gnu build
FAIL: glibcs-sh3-linux-gnu build
FAIL: glibcs-sh4eb-linux-gnu build
FAIL: glibcs-sh4eb-linux-gnu-soft build
FAIL: glibcs-sh4-linux-gnu build
FAIL: glibcs-sh4-linux-gnu-soft build
elf/sln is failed to link due to:
ld: read-only segment has dynamic relocations.
This is caused by linker bug and/or non-PIC code in PIE libc.a. See:
https://sourceware.org/bugzilla/show_bug.cgi?id=22263
Also TLS code sequence in SH assembly syscalls in glibc doesn't match TLS
code sequence expected by ld:
https://sourceware.org/bugzilla/show_bug.cgi?id=22270
FAIL: glibcs-sparc64-linux-gnu build
FAIL: glibcs-sparcv9-linux-gnu build
FAIL: glibcs-tilegxbe-linux-gnu build
FAIL: glibcs-tilegxbe-linux-gnu-32 build
FAIL: glibcs-tilegx-linux-gnu build
FAIL: glibcs-tilegx-linux-gnu-32 build
FAIL: glibcs-tilepro-linux-gnu build
elf/sln is failed to link due to:
ld: read-only segment has dynamic relocations.
This is caused by linker bug and/or non-PIC code in PIE libc.a. See:
https://sourceware.org/bugzilla/show_bug.cgi?id=22263
[BZ #19574]
* INSTALL: Regenerated.
* Makeconfig (real-static-start-installed-name): New.
(pic-default): Updated for --enable-static-pie.
(pie-default): New for --enable-static-pie.
(default-pie-ldflag): Likewise.
(+link-static-before-libc): Replace $(DEFAULT-LDFLAGS-$(@F))
with $(if $($(@F)-no-pie),$(no-pie-ldflag),$(default-pie-ldflag)).
Replace $(static-start-installed-name) with
$(real-static-start-installed-name).
(+prectorT): Updated for --enable-static-pie.
(+postctorT): Likewise.
(CFLAGS-.o): Add $(pie-default).
(CFLAGS-.op): Likewise.
* NEWS: Mention --enable-static-pie.
* config.h.in (ENABLE_STATIC_PIE): New.
* configure.ac (--enable-static-pie): New configure option.
(have-no-dynamic-linker): New LIBC_CONFIG_VAR.
(have-static-pie): Likewise.
Enable static PIE if linker supports --no-dynamic-linker.
(ENABLE_STATIC_PIE): New AC_DEFINE.
(enable-static-pie): New LIBC_CONFIG_VAR.
* configure: Regenerated.
* csu/Makefile (omit-deps): Add r$(start-installed-name) and
gr$(start-installed-name) for --enable-static-pie.
(extra-objs): Likewise.
(install-lib): Likewise.
(extra-objs): Add static-reloc.o and static-reloc.os
($(objpfx)$(start-installed-name)): Also depend on
$(objpfx)static-reloc.o.
($(objpfx)r$(start-installed-name)): New.
($(objpfx)g$(start-installed-name)): Also depend on
$(objpfx)static-reloc.os.
($(objpfx)gr$(start-installed-name)): New.
* csu/libc-start.c (LIBC_START_MAIN): Call _dl_relocate_static_pie
in libc.a.
* csu/libc-tls.c (__libc_setup_tls): Add main_map->l_addr to
initimage.
* csu/static-reloc.c: New file.
* elf/Makefile (routines): Add dl-reloc-static-pie.
(elide-routines.os): Likewise.
(DEFAULT-LDFLAGS-tst-tls1-static-non-pie): Removed.
(tst-tls1-static-non-pie-no-pie): New.
* elf/dl-reloc-static-pie.c: New file.
* elf/dl-support.c (_dl_get_dl_main_map): New function.
* elf/dynamic-link.h (ELF_DURING_STARTUP): Also check
STATIC_PIE_BOOTSTRAP.
* elf/get-dynamic-info.h (elf_get_dynamic_info): Likewise.
* gmon/Makefile (tests): Add tst-gmon-static-pie.
(tests-static): Likewise.
(DEFAULT-LDFLAGS-tst-gmon-static): Removed.
(tst-gmon-static-no-pie): New.
(CFLAGS-tst-gmon-static-pie.c): Likewise.
(CRT-tst-gmon-static-pie): Likewise.
(tst-gmon-static-pie-ENV): Likewise.
(tests-special): Likewise.
($(objpfx)tst-gmon-static-pie.out): Likewise.
(clean-tst-gmon-static-pie-data): Likewise.
($(objpfx)tst-gmon-static-pie-gprof.out): Likewise.
* gmon/tst-gmon-static-pie.c: New file.
* manual/install.texi: Document --enable-static-pie.
* sysdeps/generic/ldsodefs.h (_dl_relocate_static_pie): New.
(_dl_get_dl_main_map): Likewise.
* sysdeps/i386/configure.ac: Check if linker supports static PIE.
* sysdeps/x86_64/configure.ac: Likewise.
* sysdeps/i386/configure: Regenerated.
* sysdeps/x86_64/configure: Likewise.
* sysdeps/mips/Makefile (ASFLAGS-.o): Add $(pie-default).
(ASFLAGS-.op): Likewise.
While working on another patch I noticed that (a)
sysdeps/sparc/sparc32/Makefile is the only place with special
realclean settings, apart from po/, and (b) the generated files with a
rule in that Makefile to generate them (using m4) had been patched
manually so no longer corresponded with the output of the generator -
so if the timestamps were wrong, a build would result in changes to
the files in the source directory. (They also didn't correspond
because of changes in make 3.81 to how make handles whitespace at the
start of a line in a sequence of backslash-newline continuation lines
within a recipe.)
This patch fixes the generation and output files to match. The issue
with make and whitespace at start of continuation lines is fixed by
putting those newlines outside of arguments to echo, so the number of
spaces in the argument matches the number in the existing generated
files. Then divrem.m4 is changed to avoid generating whitespace-only
lines (my fix to the outputs from 2013; this fix to the generator also
changes the indentation of a label in the output files) and to
generate an alias in udiv.S (Adhemerval's fix from March).
build-many-glibcs.py doesn't have a non-v9 SPARC configuration,
because non-v9 32-bit SPARC didn't build when I set up
build-many-glibcs.py but sparcv9 did build. Whether or not non-v9
32-bit SPARC now builds (or indeed whether or not support for it is
obsolete), I tested by removing the sparcv8 and sparcv9 versions of
the four files in question, so forcing the generated files to be built
and used, and the compilation parts of the glibc testsuite passed.
* sysdeps/sparc/sparc32/Makefile
($(divrem:%=$(sysdep_dir)/sparc/sparc32/%.S)): Do not include
start-of-line whitespace in argument of echo.
* sysdeps/sparc/sparc32/divrem.m4: Avoid generating lines starting
with whitespace. Generate __wrap_.udiv alias.
* sysdeps/sparc/sparc32/rem.S: Regenerated.
* sysdeps/sparc/sparc32/sdiv.S: Likewise.
* sysdeps/sparc/sparc32/udiv.S: Likewise.
* sysdeps/sparc/sparc32/urem.S: Likewise.
This patch makes use of vectors for aligned inputs. Improvements
upto 30% seen for larger aligned inputs.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Performance tests run on Sparc S7 using new code and old niagara4 code.
Optimizations for memset also apply to bzero as they share code.
For memset/bzero, performance comparison with niagara4 code:
For memset nonzero data,
256-1023 bytes - 60-90% gain (in cache); 5% gain (out of cache)
1K+ bytes - 80-260% gain (in cache); 40-80% gain (out of cache)
For memset zero data (and bzero),
256-1023 bytes - 80-120% gain (in cache), 0% gain (out of cache)
1024+ bytes - 2-4x gain (in cache), 10-35% gain (out of cache)
Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.
Patrick McGehearty <patrick.mcgehearty@oracle.com>
Adhemerval Zanella <adhemerval.zanella@linaro.org>
* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
(sysdeps_routines): Add memset-niagara7.
* sysdeps/sparc/sparc64/multiarch/Makefile (sysdes_rotuines):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memset-niagara7.S: New
file.
* sysdeps/sparc/sparc64/multiarch/memset-niagara7.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __bzero_niagara7 and __memset_niagara7.
* sysdeps/sparc/sparc64/multiarch/ifunc-memset.h (IFUNC_SELECTOR):
Add niagara7 option.
* NEWS: Mention sparc m7 optimized memcpy, mempcpy, memmove, and
memset.
Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Performance tests run on Sparc S7 using new code and old niagara4 code.
Optimizations for memcpy also apply to mempcpy and memmove
where they share code. Optimizations for memset also apply
to bzero as they share code.
For memcpy/mempcpy/memmove, performance comparison with niagara4 code:
Long word aligned data
0-127 bytes - minimal changes
128-1023 bytes - 7-30% gain
1024+ bytes - 1-7% gain (in cache); 30-100% gain (out of cache)
Word aligned data
0-127 bytes - 50%+ gain
128-1023 bytes - 10-200% gain
1024+ bytes - 0-15% gain (in cache); 5-50% gain (out of cache)
Unaligned data
0-127 bytes - 0-70%+ gain
128-447 bytes - 40-80%+ gain
448-511 bytes - 1-3% loss
512-4096 bytes - 2-3% gain (in cache); 0-20% gain (out of cache)
4096+ bytes - ± 3% (in cache); 20-50% gain (out of cache)
Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.
Patrick McGehearty <patrick.mcgehearty@oracle.com>
Adhemerval Zanella <adhemerval.zanella@linaro.org>
* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
(sysdeps_routines): Add memcpy-memmove-niagara7 and memmove-ultra1.
* sysdeps/sparc/sparc64/multiarch/Makefile (sysdeps_routines):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy-memmove-niagara7.S:
New file.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memmove-ultra1.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/rtld-memmove.c: Likewise.
* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __memcpy_niagara7, __mempcpy_niagara7,
and __memmove_niagara7.
* sysdeps/sparc/sparc64/multiarch/ifunc-memcpy.h (IFUNC_SELECTOR):
Add niagara7 option.
* sysdeps/sparc/sparc64/multiarch/memmove.c: New file.
* sysdeps/sparc/sparc64/multiarch/ifunc-memmove.h: Likewise.
* sysdeps/sparc/sparc64/multiarch/memcpy-memmove-niagara7.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/memmove-ultra1.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/rtld-memmove.c: Likewise.
Tested in sparcv9-*-* and sparc64-*-* targets in both non-multi-arch and
multi-arch configurations.
* sysdeps/sparc/sparc32/sparcv9/memmove.S: New file.
* sysdeps/sparc/sparc32/sparcv9/rtld-memmove.c: Likewise.
* sysdeps/sparc/sparc64/memmove.S: Likewise.
* sysdeps/sparc/sparc64/rtld-memmove.c: Likewise.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch adds support for the ADP (also known as adi) hardware
capability, as reported by the kernel sparc port when running on M7
machines.
Tested in both sparcv9-*-* and sparc64-*-* targets.
* sysdeps/sparc/bits/hwcap.h (HWCAP_SPARC_ADP): Defined.
* sysdeps/sparc/dl-procinfo.c: Added "adp" to the
_dl_sparc_cap_flags array.
* sysdeps/sparc/dl-procinfo.h (_DL_HWCAP_COUNT): Increment.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Replace the simple byte-wise compare in the misaligned case with a
dword compare with page boundary checks in place. For simplicity I've
chosen a 4K page boundary so that we don't have to query the actual
page size on the system.
This results in up to 3x improvement in performance in the unaligned
case on falkor and about 2.5x improvement on mustang as measured using
bench-strcmp.
* sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a
time whenever possible.
The default sysdeps/ieee754 fma implementations rely on exceptions and
rounding modes to achieve correct results through internal use of
round-to-odd. Thus, glibc configurations without support for
exceptions and rounding modes instead need to use implementations of
fma based on soft-fp.
At present, this is achieved via having implementation files in
soft-fp/ that are #included by sysdeps files for each glibc
configuration that needs them. In general this means such a
configuration has its own s_fma.c and s_fmaf.c.
TS 18661-1 adds functions that do an operation (+ - * / sqrt fma) on
arguments wider than the return type, with a single rounding of the
infinite-precision result to that return type. These are also
naturally implemented using round-to-odd on platforms with hardware
support for rounding modes and exceptions but lacking hardware support
for these narrowing operations themselves. (Platforms that have
direct hardware support for such narrowing operations include at least
ia64, and Power ISA 2.07 or later, which I think means POWER8 or
later.)
So adding the remaining TS 18661-1 functions would mean at least six
narrowing function implementations (fadd fsub fmul fdiv ffma fsqrt),
with aliases for other types and further implementations in some
configurations, that need to be overridden for configurations lacking
hardware exceptions and rounding modes. Requiring all such
configurations (currently seven of them) to have their own source
files for all those functions seems undesirable.
Thus, this patch adds a directory sysdeps/ieee754/soft-fp to contain
libm function implementations based on soft-fp. This directory is
then used via Implies from all the configurations that need it, so no
more files need adding to every such configuration when adding more
functions with soft-fp implementations. A configuration can still
selectively #include a particular file from this directory if desired;
thus, the MIPS #include of the fmal implementation is retained, since
that's appropriate even for hard float (because long double is always
implementated in software for MIPS64, so the soft-fp implementation of
fmal is better than the ldbl-128 one).
This also provides additional motivation for my recent patch removing
--with-fp / --without-fp: previously there was no need for correct use
of --without-fp for no-FPU ARM or SH3, and now we have autodetection
nofpu/ sysdeps directories can be used by this patch for those
configurations without imposing any new requirements on how glibc is
configured.
(The mips64/*/fpu/s_fma.c files added by this patch are needed to keep
the dbl-64 version of fma for double, rather than the ldbl-128 one,
used in that case.)
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
* soft-fp/fmadf4.c: Move to ....
* sysdeps/ieee754/soft-fp/s_fma.c: ... here.
* soft-fp/fmasf4.c: Move to ....
* sysdeps/ieee754/soft-fp/s_fmaf.c: ... here.
* soft-fp/fmatf4.c: Move to ....
* sysdeps/ieee754/soft-fp/s_fmal.c: ... here.
* sysdeps/ieee754/soft-fp/Makefile: New file.
* sysdeps/arm/preconfigure.ac: Define with_fp_cond.
* sysdeps/arm/preconfigure: Regenerated.
* sysdeps/arm/nofpu/Implies: New file.
* sysdeps/arm/s_fma.c: Remove file.
* sysdeps/arm/s_fmaf.c: Likewise.
* sysdeps/m68k/coldfire/nofpu/Implies: New file.
* sysdeps/m68k/coldfire/nofpu/s_fma.c: Remove file.
* sysdeps/m68k/coldfire/nofpu/s_fmaf.c: Likewise.
* sysdeps/microblaze/Implies: Add ieee754/soft-fp.
* sysdeps/microblaze/s_fma.c: Remove file.
* sysdeps/microblaze/s_fmaf.c: Likewise.
* sysdeps/mips/mips32/nofpu/Implies: New file.
* sysdeps/mips/mips64/n32/fpu/s_fma.c: Likewise.
* sysdeps/mips/mips64/n32/nofpu/Implies: Likewise.
* sysdeps/mips/mips64/n64/fpu/s_fma.c: Likewise.
* sysdeps/mips/mips64/n64/nofpu/Implies: Likewise.
* sysdeps/mips/ieee754/s_fma.c: Remove file.
* sysdeps/mips/ieee754/s_fmaf.c: Likewise.
* sysdeps/mips/ieee754/s_fmal.c: Update include for move of fmal
implementation.
* sysdeps/nios2/Implies: Add ieee754/soft-fp.
* sysdeps/nios2/s_fma.c: Remove file.
* sysdeps/nios2/s_fmaf.c: Likewise.
* sysdeps/sh/nofpu/Implies: New file.
* sysdeps/sh/s_fma.c: Remove file.
* sysdeps/sh/s_fmaf.c: Likewise.
* sysdeps/tile/Implies: Add ieee754/soft-fp.
* sysdeps/tile/s_fma.c: Remove file.
* sysdeps/tile/s_fmaf.c: Likewise.