Some headers did not include all of their prerequisite headers.
* rpcsvc/nislib.h: Include rpcsvc/nis.h.
* sysdeps/unix/sysv/linux/netrose/rose.h:
Include sys/socket.h and netax25/ax25.h.
<endian.h> only defines BYTE_ORDER, BIG_ENDIAN, LITTLE_ENDIAN,
etc. under __USE_MISC; glibc's headers should use __BYTE_ORDER,
__BIG_ENDIAN, __LITTLE_ENDIAN, etc. instead.
* inet/netinet/icmp6.h, inet/netinet/ip6.h
* resolv/arpa/nameser_compat.h:
Use __BYTE_ORDER etc. instead of BYTE_ORDER etc.
sys/types.h only conditionally defines caddr_t and clockid_t.
* sysdeps/unix/sysv/linux/sys/quota.h:
Use __caddr_t instead of caddr_t.
* sysdeps/unix/sysv/linux/sys/timerfd.h:
Use __clockid_t instead of clockid_t.
Remove a #warning that was the sole actual problem with using sys/ipc.h
without _GNU_SOURCE/_XOPEN_SOURCE.
* sysvipc/sys/ipc.h: Remove unnecessary #warning.
_LIBC, __USE_XOPEN2K8, and __STDC_VERSION__ are not always defined.
It seems to me that _LIBC should not appear in installed headers, but
avoiding that for argp specifically would require more surgery than
feels appropriate for this patch set. It's possible that
"#ifdef _LIBC" would be sufficient, but I wanted to be conservative.
All three versions of bits/socket.h want to know whether __flexarr
will produce a real flexible array member -- specifically, one that
doesn't alter sizeof(the structure containing it). They were testing
for this with a complicated #if condition that did not agree with
sys/cdefs.h and that tripped -Wundef warnings under -std=c90.
I added a new macro to sys/cdefs.h, __glibc_c99_flexarr_available,
which reveals exactly what these headers want to know. I also took
the opportunity to flatten the rather messy conditional nest defining
__flexarr.
* argp/argp.h: Check whether _LIBC is defined before expanding it.
* posix/glob.h: Check whether __USE_XOPEN2K8 is defined instead
of expanding it.
* misc/sys/cdefs.h: Tidy up conditional nest defining __flexarr.
Define __glibc_c99_flexarr_available to 1 when the compiler
supports C99-compatible flexible array members, 0 otherwise.
* sysdeps/unix/sysv/linux/bits/socket.h
* sysdeps/mach/hurd/bits/socket.h
* bits/socket.h: Use __glibc_c99_flexarr_available in
definitions of struct cmsghdr and CMSG_DATA.
This is the hurd-specific follow-up for
29d794863c : hurdmalloc also needs the
same fix
* hurd/hurdmalloc.c (malloc_fork_prepare): Rename to
_hurd_malloc_fork_prepare.
(malloc_fork_parent): Rename to _hurd_malloc_fork_parent.
(malloc_fork_child): Rename to _hurd_malloc_fork_child.
(_hurd_fork_prepare_hook): Drop malloc_fork_prepare.
(_hurd_fork_parent_hook): Drop malloc_fork_parent.
(_hurd_fork_child_hook): Drop malloc_fork_child.
* hurd/hurdmalloc.h (_hurd_malloc_fork_prepare,
_hurd_malloc_fork_parent, _hurd_malloc_fork_child): Add declarations.
* sysdeps/mach/hurd/fork.c (__fork): Call __malloc_fork_lock_parent
after locking locks (notably hurd_dtable_lock). Call
_hurd_malloc_fork_prepare after that. Call _hurd_malloc_fork_parent
before __malloc_fork_unlock_parent and _hurd_malloc_fork_child before
__malloc_fork_unlock_child.
TS 18661-1 defines macros for the width of integer types, intended for
use with the fromfp functions to convert from floating-point types to
integer types of any width, in any rounding mode and with control over
whether "inexact" is raised. Such macros are, of course, more
generally useful than just with those functions.
Those macros are added to <limits.h> and <stdint.h>. Having
previously added the <limits.h> macros, this patch adds the <stdint.h>
ones. I've also added these macros to GCC's headers for GCC 7, but
for glibc systems, the definitions in GCC's <stdint.h> will only be
used with -ffreestanding.
Tested for x86_64 and x86.
* sysdeps/generic/stdint.h: Define
__GLIBC_INTERNAL_STARTING_HEADER_IMPLEMENTATION and include
<bits/libc-header-start.h> instead of including <features.h>.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT8_WIDTH): New macro.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT8_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT16_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT16_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT32_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT32_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT64_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT64_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_LEAST8_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_LEAST8_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_LEAST16_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_LEAST16_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_LEAST32_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_LEAST32_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_LEAST64_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_LEAST64_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_FAST8_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_FAST8_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_FAST16_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_FAST16_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_FAST32_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_FAST32_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INT_FAST64_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINT_FAST64_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INTPTR_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINTPTR_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (INTMAX_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (UINTMAX_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (PTRDIFF_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (SIG_ATOMIC_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (SIZE_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (WCHAR_WIDTH): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (WINT_WIDTH): Likewise.
* manual/arith.texi (Integers): Document these macros for types
specified by width properties.
* manual/lang.texi (Width of Type): Document these macros for
other standard typedefs.
* stdlib/tst-width-stdint.c: New file.
* stdlib/Makefile (tests): Add tst-width-stdint.
This patch correctly block and unblocks all signals when executing
Linux posix_spawn by using the __libc_signal_{un}block_all functions
instead of default sigprocmask. The latter might remove both
SIGCANCEL and SIGSETXID from the blocked signal list.
Checked on x86_64, i686, powerpc64le, and aarch64.
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Correctly block and unblock
all signals when executing the clone vfork child.
(SIGALL_SET): Remove macro.
This patch correctly enable and disable asynchronous cancellation on
Linux posix_spawn. Current code invert the logic by enabling and
disabling instead. It also adds a new test to check if posix_spawn
is not a cancellation entrypoint.
Checked on x86_64, i686, powerpc64le, and aarch64.
* nptl/Makefile (tests): Add tst-exec5.
* nptl/tst-exec5.c: New file.
* sysdeps/unix/sysv/linux/spawni.c (__spawni): Correctly enable and disable
asynchronous cancellation.
This requires adding a macro to synthesize the call
to __strto*_nan. Since this is likely to be the only
usage ever for strto* functions in generated libm
calls, a dedicated macro is defined for it.
Use the GCC builtin instead. With the exception of the
files built from a template, they are unused. This
is preparation for making the s_nanF objects generated.
This one is a little more tricky since it is built both for
libm and libc, and exports multiple aliases.
To simplify aliasing, a new macro is introduced which handles
aliasing to two symbols. By default, it just applies
declare_mgen_alias to both target symbols.
Likewise, the makefile is tweaked a little to generate
templates for shared files too, and a new rule is added
to build m_*.c objects from the objpfx directory.
Verified there are no symbol or code changes using a script
to diff the *_ldexp* object files on s390x, aarch64, arm,
x86_64, and ppc64.
This was used by --enable-omitfp, and the bulk of it was removed in this
commit:
commit bdeba1354b
Author: Ulrich Drepper <drepper@gmail.com>
Date: Sat Jan 7 11:29:31 2012 -0500
Remove --enable-omitfp support
Current sparc32 sem_init and default one only differ on sem.newsem.pad
initialization. This patch removes sparc32 and sparc32v9 sem_init arch
specific implementation and set sparc32 to use nptl default one.
The default implementation sets the required sem.newsem.pad to 0 (which
is ununsed in other architectures).
I checked on i686 and a sparc32v9 build.
* nptl/sem_init.c (sem_init): Init pad value to 0.
* sysdeps/sparc/sparc32/sem_init.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_init.c: Likewise.
This patch changes shm_open to not act as a cancellation point.
Cancellation is disable at start and reenable in function exit.
It fixes BZ #18243.
Tested on x86_64 and i686.
[BZ #18243]
* rt/Makefile (test): Add tst-shm-cancel.
* rt/tst-shm-cancel.c: New file.
* sysdeps/posix/shm_open.c: Disable asynchronous cancellation.
This patch removes the sparc32 sem_wait.c implementation since it is
identical to default nptl one. The sparcv9 is no longer required with
the removal.
Checked with a sparcv9 build.
* sysdeps/sparc/sparc32/sem_wait.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_wait.c: Likewise.
Current sparc32 sem_open and default one only differ on:
1. Default one contains a 'futex_supports_pshared' check.
2. sem.newsem.pad is initialized to zero.
This patch removes sparc32 and sparc32v9 sem_open arch specific
implementation and instead set sparc32 to use nptl default one.
Using 1. is fine since it should always evaluate 0 for Linux
(an optimized away by the compiler). Adding 2. to default
implementation should be ok since 'pad' field is used mainly
on sparc32 code.
I checked on i686 and checked a sparc32v9 build.
* nptl/sem_open.c (sem_open): Init pad value to 0.
* sysdeps/sparc/sparc32/sem_open.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_open.c: Likewise.
Nothing depends on the PTW macro anymore, so the mechanism to define
PTW for recompliations of libc routines is no longer needed. The
source files are still recompiled for the nptl directory, just without
the “ptw-” prefix.
(Reducing the number of pattern rules in sysd-rules is critical for
improving make performance.)
This runs the attached sed script against these files using
a regex which aggressively matches long double literals
when not obviously part of a comment.
Likewise, 5 digit or less integral constants are replaced
with integer constants, excepting the two cases of 0 used
in large tables, which are also the only integral values
of the form x.0*E0L encountered within these converted
files.
Likewise, -L(x) is transformed into L(-x).
Naturally, the script has a few minor hiccups which are
more clearly remedied via the attached fixup patch. Such
hiccups include, context-sensitive promotion to a real
type, and munging constants inside harder to detect
comment blocks.
When I added fetestexceptflag, I missed that e500 was another case
that needed its own version because saved exceptions were not directly
stored in a form that could be ANDed with exception bits (they were
stored with exceptions in SPE form, but the FE_* macros always use the
classic hard-float form). This patch adds an e500 version with the
required call to __fexcepts_from_spe to convert from one form to the
other.
Tested for e500.
* sysdeps/powerpc/powerpc32/e500/nofpu/fetestexceptflag.c: New
file.
This patch adds SPARC versions of fegetmode and fesetmode. Untested.
* sysdeps/sparc/fpu/fegetmode.c: New file.
* sysdeps/sparc/fpu/fesetmode.c: Likewise.
This patch adds SH versions of fegetmode and fesetmode. Untested.
* sysdeps/sh/sh4/fpu/fegetmode.c: New file.
* sysdeps/sh/sh4/fpu/fesetmode.c: Likewise.
This patch adds S/390 versions of fegetmode and fesetmode. Untested.
* sysdeps/s390/fpu/fegetmode.c: New file.
* sysdeps/s390/fpu/fesetmode.c: Likewise.
This patch adds M68K versions of fegetmode and fesetmode. Untested.
* sysdeps/m68k/fpu/fegetmode.c: New file.
* sysdeps/m69k/fpu/fesetmode.c: Likewise.
This patch adds IA64 versions of fegetmode and fesetmode. Untested.
* sysdeps/ia64/fpu/fegetmode.c: New file.
* sysdeps/ia64/fpu/fesetmode.c: Likewise.
This patch adds HPPA versions of fegetmode and fesetmode. Untested.
* sysdeps/hppa/fpu/fegetmode.c: New file.
* sysdeps/hppa/fpu/fesetmode.c: Likewise.
This patch adds Alpha versions of fegetmode and fesetmode. Untested.
* sysdeps/alpha/fpu/fegetmode.c: New file.
* sysdeps/alpha/fpu/fesetmode.c: Likewise.
This patch adds AArch64 versions of fegetmode and fesetmode.
Untested.
* sysdeps/aarch64/fpu/fegetmode.c: New file.
* sysdeps/aarch64/fpu/fesetmode.c: Likewise.
TS 18661-1 defines a type femode_t to represent the set of dynamic
floating-point control modes (such as the rounding mode and trap
enablement modes), and functions fegetmode and fesetmode to manipulate
those modes (without affecting other state such as the raised
exception flags) and a corresponding macro FE_DFL_MODE.
This patch series implements those interfaces for glibc. This first
patch adds the architecture-independent pieces, the x86 and x86_64
implementations, and the <bits/fenv.h> and ABI baseline updates for
all architectures so glibc keeps building and passing the ABI tests on
all architectures. Subsequent patches add the fegetmode and fesetmode
implementations for other architectures.
femode_t is generally an integer type - the same type as fenv_t, or as
the single element of fenv_t where fenv_t is a structure containing a
single integer (or the single relevant element, where it has elements
for both status and control registers) - except where architecture
properties or consistency with the fenv_t implementation indicate
otherwise. FE_DFL_MODE follows FE_DFL_ENV in whether it's a magic
pointer value (-1 cast to const femode_t *), a value that can be
distinguished from valid pointers by its high bits but otherwise
contains a representation of the desired register contents, or a
pointer to a constant variable (the powerpc case; __fe_dfl_mode is
added as an exported constant object, an alias to __fe_dfl_env).
Note that where architectures (that share a register between control
and status bits) gain definitions of new floating-point control or
status bits in future, the implementations of fesetmode for those
architectures may need updating (depending on whether the new bits are
control or status bits and what the implementation does with
previously unknown bits), just like existing implementations of
<fenv.h> functions that take care not to touch reserved bits may need
updating when the set of reserved bits changes. (As any new bits are
outside the scope of ISO C, that's just a quality-of-implementation
issue for supporting them, not a conformance issue.)
As with fenv_t, femode_t should properly include any software DFP
rounding mode (and for both fenv_t and femode_t I'd consider that
fragment of DFP support appropriate for inclusion in glibc even in the
absence of the rest of libdfp; hardware DFP rounding modes should
already be included if the definitions of which bits are status /
control bits are correct).
Tested for x86_64, x86, mips64 (hard float, and soft float to test the
fallback version), arm (hard float) and powerpc (hard float, soft
float and e500). Other architecture versions are untested.
* math/fegetmode.c: New file.
* math/fesetmode.c: Likewise.
* sysdeps/i386/fpu/fegetmode.c: Likewise.
* sysdeps/i386/fpu/fesetmode.c: Likewise.
* sysdeps/x86_64/fpu/fegetmode.c: Likewise.
* sysdeps/x86_64/fpu/fesetmode.c: Likewise.
* math/fenv.h: Update comment on inclusion of <bits/fenv.h>.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (fegetmode): New function
declaration.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetmode): Likewise.
* bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New
typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/aarch64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/alpha/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/arm/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/hppa/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/ia64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/m68k/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/microblaze/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/mips/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/nios2/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/powerpc/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (__fe_dfl_mode): New variable
declaration.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/s390/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/sh/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/sparc/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/tile/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* sysdeps/x86/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(femode_t): New typedef.
[__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro.
* manual/arith.texi (FE_DFL_MODE): Document macro.
(fegetmode): Document function.
(fesetmode): Likewise.
* math/Versions (fegetmode): New libm symbol at version
GLIBC_2.25.
(fesetmode): Likewise.
* math/Makefile (libm-support): Add fegetmode and fesetmode.
(tests): Add test-femode and test-femode-traps.
* math/test-femode-traps.c: New file.
* math/test-femode.c: Likewise.
* sysdeps/powerpc/fpu/fenv_const.c (__fe_dfl_mode): Declare as
alias for __fe_dfl_env.
* sysdeps/powerpc/nofpu/fenv_const.c (__fe_dfl_mode): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/fenv_const.c
(__fe_dfl_mode): Likewise.
* sysdeps/powerpc/Versions (__fe_dfl_mode): New libm symbol at
version GLIBC_2.25.
* sysdeps/nacl/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
There is transition penalty when SSE instructions are mixed with 256-bit
AVX or 512-bit AVX512 load instructions. Since _dl_runtime_resolve_avx
and _dl_runtime_profile_avx512 save/restore 256-bit YMM/512-bit ZMM
registers, there is transition penalty when SSE instructions are used
with lazy binding on AVX and AVX512 processors.
To avoid SSE transition penalty, if only the lower 128 bits of the first
8 vector registers are non-zero, we can preserve %xmm0 - %xmm7 registers
with the zero upper bits.
For AVX and AVX512 processors which support XGETBV with ECX == 1, we can
use XGETBV with ECX == 1 to check if the upper 128 bits of YMM registers
or the upper 256 bits of ZMM registers are zero. We can restore only the
non-zero portion of vector registers with AVX/AVX512 load instructions
which will zero-extend upper bits of vector registers.
This patch adds _dl_runtime_resolve_sse_vex which saves and restores
XMM registers with 128-bit AVX store/load instructions. It is used to
preserve YMM/ZMM registers when only the lower 128 bits are non-zero.
_dl_runtime_resolve_avx_opt and _dl_runtime_resolve_avx512_opt are added
and used on AVX/AVX512 processors supporting XGETBV with ECX == 1 so
that we store and load only the non-zero portion of vector registers.
This avoids SSE transition penalty caused by _dl_runtime_resolve_avx and
_dl_runtime_profile_avx512 when only the lower 128 bits of vector
registers are used.
_dl_runtime_resolve_avx_slow is added and used for AVX processors which
don't support XGETBV with ECX == 1. Since there is no SSE transition
penalty on AVX512 processors which don't support XGETBV with ECX == 1,
_dl_runtime_resolve_avx512_slow isn't provided.
[BZ #20495]
[BZ #20508]
* sysdeps/x86/cpu-features.c (init_cpu_features): For Intel
processors, set Use_dl_runtime_resolve_slow and set
Use_dl_runtime_resolve_opt if XGETBV suports ECX == 1.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
New.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_Use_dl_runtime_resolve_opt): Likewise.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Use
_dl_runtime_resolve_avx512_opt and _dl_runtime_resolve_avx_opt
if Use_dl_runtime_resolve_opt is set. Use
_dl_runtime_resolve_slow if Use_dl_runtime_resolve_slow is set.
* sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>.
(_dl_runtime_resolve_opt): New. Defined for AVX and AVX512.
(_dl_runtime_resolve): Add one for _dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx_slow):
New.
(_dl_runtime_resolve_opt): Likewise.
(_dl_runtime_profile): Define only if _dl_runtime_profile is
defined.
on s390x the test elf/check-localplt is failing after recent commits:
"elf: Do not use memalign for TCB/TLS blocks allocation [BZ #17730]"
"elf: Avoid using memalign for TLS allocations [BZ #17730]"
"elf: dl-minimal malloc needs to respect fundamental alignment"
due to "Missing required PLT reference: ld.so: __libc_memalign".
After the commits __libc_memalign is only called in elf/dl-minimal.c in
malloc() function in ld.so and gcc -O2/-O3 leads to R_390_GLOB_DAT
instead of R_390_JMP_SLOT. __libc_memalign is called via
function-pointer loaded from GOT instead of calling via a plt-stub. In
this case there is the R_390_GLOB_DAT relocation in section .rela.dyn
instead of R_390_JMP_SLOT in .rela.plt.
This patch marks ld.so: __libc_memalign with R_390_GLOB_DAT in
localplt.data to allow both relocations.
If build with -fno-optimize-sibling-calls or on s390(31bit) a
R_390_JMP_SLOT is generated.
ChangeLog:
* sysdeps/unix/sysv/linux/s390/localplt.data: Mark
ld.so: __libc_memalign with "+ RELA R_390_GLOB_DAT".
The support functions for sin and cos have a lot of identical
functionality, so inlining them gives a pretty decent jump in
functionality: ~19% in the sincos function. On SPEC2006 this
translates to about 2.1% in the tonto test.
* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Mark as inline.
(do_cos_slow): Likewise.
(do_sin): Likewise.
(do_sin_slow): Likewise.
(slow): Likewise.
(slow1): Likewise.
(slow2): Likewise.
(sloww): Likewise.
(sloww1): Likewise.
(sloww2): Likewise.
(bsloww): Likewise.
(bsloww1): Likewise.
(bsloww2): Likewise.
(cslow2): Likewise.
The only code looks slightly different from do_sin but on closer
examination, should give exactly the same result. Drop it in favour
of the do_sin function call.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Use do_sin.
All calls to do_cos are preceded by code that partitions x into a
larger double that gives an offset into the sincos table and a smaller
double that is used in a polynomial computation. Consolidate all of
them into do_cos and do_sin to reduce code duplication.
* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Accept X and DX as input
arguments. Consolidate input partitioning from callers here.
(do_cos_slow): Likewise.
(do_sin): Likewise.
(do_sin_slow): Likewise.
(do_sincos_1): Remove the no longer necessary input partitioning.
(do_sincos_2): Likewise.
(__sin): Likewise.
(__cos): Likewise.
(slow1): Likewise.
(slow2): Likewise.
(sloww1): Likewise.
(sloww2): Likewise.
(bsloww1): Likewise.
(bsloww2): Likewise.
(cslow2): Likewise.
This is only used for the float and double variants.
Instead, just add it to the type specific list of files,
and remove all stubs, and remove the declaration from
math_private.h.
I verified x86_64, i486, ia64, m68k, and ppc64 build.
With the exception of those machines using the ldbl-opt in
an Implies file, this is a trivial transformation.
nextdownl is not subject to the non-trivial versioning rules
of the other generated functions, so to keep things simple,
it is handled as a one-off case in ldbl-opt to preserve the
existing behavior.
The only difference is the usage of math_narrow_eval when
building s_fdiml.c. This should be harmless for long double,
but I did observe some code generation changes on m68k, but
lack the resources to test it.
Likewise, to more easily support overriding symbol generation,
the aliasing macros are always conditionally defined on their
absence to reduce boilerplate.
I also ran builds for i486, ppc64, sparcv9, aarch64,
s390x and observed no changes to s_fdim* objects.
Macros which are also defined in <linux/quota.h> are removed, and
<linux/quota.h> is included instead.
This commit cleans up the definition of fs_to_dq_blocks and struct
dqblock and struct dqinfo, too.
Add a layer of macro indirection for long double files
which need to be built using another typename. Likewise,
add the L(num) macro used in a later patch to override
real constants.
These macros are only defined through the ldbl-128
math_ldbl.h header, thereby implicitly restricting
these macros to machines which back long double
with an IEEE binary128 format.
Likewise, appropriate changes are made for the few
files which indirectly include such ldbl-128 files.
These changes produce identical binaries for s390x,
aarch64, and ppc64.
On s390 feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) sets FE_INEXACT, too.
This patch uses z196 zarch load rounded instruction which can suppress
FE_INEXACT exception if gcc has z196 support in used configuration.
Otherwise FE_INEXACT flag is set as before. The gcc support is tested
in a new configure-check.
A comment in fsetexcptflg.c is corrected as new exceptions are not
executed with the next floating-point instruction if fpc is set with
_FPU_SETCW macro. It seems the comment was copied e.g. from
sysdeps/x86_64/fpu/fsetexcptflg.c file.
ChangeLog:
* config.h.in (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT):
New undefine.
* sysdeps/s390/configure.ac: Add test for z196 zarch support.
* sysdeps/s390/configure: Regenerated.
* sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra
instruction for raising over-/underflow if z196 zarch is supported
by default.
* sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag):
Correct comment.