The routine being assembled here is strcasecmp_l, so ask for that via
__STRCMP and STRCMP defines. That change means tweaking the power7
override. Needed for later powerpc64le changes where we want the base
symbols, not just a power7 variant.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S:
(__STRCMP, STRCMP, __strcasecmp_l): Define.
(__strcasecmp): Don't define.
These functions aren't used in ld.so at the moment since we don't have
strcmp or strncmp ifuncs for them there. Remove the ld.so bloat.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S: Wrap in
IS_IN (libc).
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
USE_AS_STPNCPY is defined by sysdeps/powerpc/powerpc64/power8/stpncpy.S,
included by this file.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Don't define
USE_AS_STPNCPY.
It seems to me that libc.a should not contain any of the __GI_
symbols, and certainly --enable-multi-arch ought to not add to the
list. At the end of this patch series we have the following in both
--enable-multi-arch and --disable-multi-arch libc.a:
0000000000000000 T __GI___readdir64
0000000000000000 T __GI___fxstatat64
0000000000000000 T __GI_getrlimit
0000000000000000 T __GI___getrlimit
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S (hidden_def):
Redefine only when SHARED.
POWER9 DD2.1 and earlier has an issue where some cache inhibited
vector load traps to the kernel, causing a performance degradation. To
handle this in memcpy and memmove, lvx/stvx is used for aligned
addresses instead of lxvd2x/stxvd2x.
Reference: https://patchwork.ozlabs.org/patch/814059/
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Replace
lxvd2x/stxvd2x with lvx/stvx.
* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
cfi info for stack adjust needs to be on the insn doing the adjust.
cfi describing register saves can be anywhere after the save insn but
before the reg is altered. Fewer locations with cfi result in smaller
cfi programs and possibly slightly faster exception handling. Thus
the LR cfi_offset move.
The idea behind ajusting sp after restoring regs is to break a
register dependency chain, in this case not be using r1 immediately
after it is modified.
The missing LR cfi_restore meant that code after the blr,
unaligned_lt_16 and other labels, would have cfi that said LR was at
cfa+16, but that code is reached without LR being saved.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Move LR cfi.
Adjust stack after restoring regs. Add missing LR cfi_restore.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
This patch moves the frame setup and teardown to immediately around
the single memset call, as has been done for power8. I've also
decreased FRAMESIZE to that needed to save the two callee-saved
registers used. Plus added cfi.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Decrease FRAMESIZE.
Move LR save and frame setup/teardown and LR restore to
immediately around memset call. Provide cfi.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
The bits/floatn.h header currently only has defines relating to
_Float128. This patch adds defines relating to other _FloatN /
_FloatNx types.
The approach taken is to add defines for all _FloatN / _FloatNx types
known to GCC, and to put them in a common bits/floatn-common.h header
included at the end of all the individual bits/floatn.h headers. If
in future some defines become different for different glibc
configurations, they will move out into the separate bits/floatn.h
headers.
Some defines are expected always to be the same across glibc ports.
Corresponding defines are nevertheless put in this header. The intent
is that where there are conditionals (in headers or in non-installed
files) that can just repeat the same or nearly the same logic for each
floating-point type, they should do so, even if in fact the cases for
some types could be unconditionally present or absent because the same
conditionals are true or false for all glibc configurations. This
should make the glibc code with such conditionals easier to read,
because the reader can just see that the same conditionals are
repeated for each type, rather than seeing different conditionals for
different types and needing to reason, at each location with such
differences, why those differences are indeed correct there. (Cases
involving per-format rather than per-type logic are more likely still
to need differences in how they handle different types.)
Having such defines and conditionals also helps in incremental
preparation for adding _Float32 / _Float64 / _Float32x / _Float64x
function aliases. I intend subsequent patches to add such
conditionals corresponding to those already present for _Float128, as
well as making more architecture-specific function implementations use
common macros to define aliases in preparation for adding such _FloatN
/ _FloatNx aliases.
Tested for x86_64.
* bits/floatn-common.h: New file.
* math/Makefile (headers): Add bits/floatn-common.h.
* bits/floatn.h: Include <bits/floatn-common.h>.
* sysdeps/ia64/bits/floatn.h: Likewise.
* sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise.
* sysdeps/mips/ieee754/bits/floatn.h: Likewise.
* sysdeps/powerpc/bits/floatn.h: Likewise.
* sysdeps/x86/bits/floatn.h: Likewise.
A performance regression was introduced by commit
84d74e427a "powerpc: Cleanup fenv_private.h".
In the powerpc implementation of SET_RESTORE_ROUND, there is the
following code in the "SET" function (slightly simplified):
--
old.fenv = fegetenv_register ();
new.l = (old.l & _FPU_MASK_TRAPS_RN) | r; (1)
if (new.l != old.l) (2)
{
if ((old.l & _FPU_ALL_TRAPS) != 0)
(void) __fe_mask_env ();
fesetenv_register (new.fenv); (3)
--
Line (1) sets the value of "new" to the current value of FPSCR,
but masks off summary bits, exceptions, non-IEEE mode, and
rounding mode, then ORs in the new rounding mode.
Line (2) compares this new value to the current value in order to
avoid setting a new value in the FPSCR (line (3)) unless something
significant has changed (exception enables or rounding mode).
The summary bits are not germane to the comparison, but are cleared
in "new" and preserved in "old", resulting in false negative
comparisons, and unnecessarily setting the FPSCR in those cases
with associated negative performance impacts.
The solution is to treat the summaries identically for "new" and "old":
- save them in SET
- leave them alone otherwise
- restore the saved values in RESTORE
Also minor changes:
- expand _FPU_MASK_RN to 64bit hex, to match other MASKs
- treat bit 52 (left-to-right) as reserved (since it is)
* sysdeps/powerpc/fpu/fenv_private.h (_FPU_MASK_TRAPS_RN):
(_FPU_MASK_FRAC_INEX_RET_CC): Fix masks to more properly handle
summary bits.
(_FPU_MASK_RN): Expand _FPU_MASK_RN to 64bit hex.
(_FPU_MASK_NOT_RN_NI): Treat bit 52 (left-to-right) as reserved.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Normally, TLS relocations against local symbols are optimised by the linker
to be absolute. However, gold does not do this, and so it is possible to
end up with, for example, R_SPARC_TLS_DTPMOD64 referring to a local symbol.
Since sym_map is left as null in elf_machine_rela for the special local
symbol case, the relocation handling thinks it has nothing to do, and so
the module gets left as 0. Havoc then ensues when the variable in question
is accessed.
Before this fix, the main_local_gold program would receive a SIGBUS on
sparc64, and SIGSEGV on powerpc32. With this fix applied, that test now
passes like the rest of them.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela):
Assign sym_map to be map for local symbols, as TLS relocations
use sym_map to determine whether the symbol is defined and to
extract the TLS information.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise.
Fix the ifdef clause that was being used in the opposite way, setting
a wrong value of the carry bit.
This is also correcting 2 memory accesses that were mistakenly referring
to r0 while they were supposed to mean the immediate value 0.
[BZ #22142]
* stdio-common/tst-printf.c (fp_test): Add tests for DBL_MAX and
-DBL_MAX.
(do_test): Likewise.
* stdio-common/tst-printf.sh: Likewise.
* sysdeps/powerpc/powerpc64/power7/add_n.S: Invert the initial
ifdef clause in order to set the carry bit right. Replace r0 by
0 without changing the behavior.
Recent commit 59ba2d2b54 missed to add __memrchr_power8 in
ifunc list. Also handled discarding unwanted bytes for
unaligned inputs in power8 optimization.
2017-10-05 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* sysdeps/powerpc/powerpc64/multiarch/memrchr-ppc64.c: Revert
back to powerpc32 file.
* sysdeps/powerpc/powerpc64/multiarch/memrchr.c
(memrchr): Add __memrchr_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/power8/memrchr.S: Mask
extra bytes for unaligned inputs.
This patch makes dbl-64 modf use libm_alias_double. Both the dbl-64
and dbl-64/wordsize-64 versions are changed, and the ldbl-opt version
is changed to define the libc compat symbol only. Because of
multiarch wrappers, the changed implementations are made not to define
aliases at all if __modf is defined as a macro, as with other
functions, so avoiding duplicate compat symbols while allowing those
wrappers to be simplified.
Tested for x86_64, and verified with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/dbl-64/s_modf.c: Include <libm-alias-double.h>.
(modf): Define using libm_alias_double, only if [!__modf].
* sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c: Include
<libm-alias-double.h>.
(modf): Define using libm_alias_double, only if [!__modf].
* sysdeps/ieee754/ldbl-opt/s_modf.c (modfl): Only define libc
compat symbol here.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-ppc32.c
(weak_alias): Do not undefine and redefine.
(strong_alias): Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c
(weak_alias): Likewise.
(strong_alias): Likewise.
This patch makes dbl-64 logb use libm_alias_double. Both the dbl-64
and dbl-64/wordsize-64 versions are changed, and the ldbl-opt version
is removed. Because of multiarch wrappers, the changed
implementations are made not to define aliases at all if __logb is
defined as a macro, as with other functions, so avoiding duplicate
compat symbols while allowing those wrappers to be simplified.
Tested for x86_64, and verified with build-many-glibcs.py that
installed stripped shared libraries are unchanged (except on alpha
where changes from using the wordsize-64 version are expected).
* sysdeps/ieee754/dbl-64/s_logb.c: Include <libm-alias-double.h>.
(logb): Define using libm_alias_double, only if [!__logb].
* sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Include
<libm-alias-double.h>.
(logb): Define using libm_alias_double, only if [!__logb].
* sysdeps/ieee754/ldbl-opt/s_logb.c: Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-ppc32.c
(weak_alias): Do not undefine and redefine.
(strong_alias): Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c
(weak_alias): Likewise.
(strong_alias): Likewise.
All representations of floating-point numbers in types with IEC 60559
binary exchange format are canonical. On the other hand, types with IEC
60559 extended formats, such as those implemented under ldbl-96 and
ldbl-128ibm, contain representations that are not canonical.
TS 18661-1 introduced the type-generic macro iscanonical, which returns
whether a floating-point value is canonical or not. In Glibc, this
type-generic macro is implemented using the macro __MATH_TG, which, when
support for float128 is enabled, relies on __builtin_types_compatible_p
to select between floating-point types. However, this use of
iscanonical breaks C++ applications, because the builtin is only
available in C mode.
This patch provides a C++ implementation of iscanonical that relies on
function overloading, rather than builtins, to select between
floating-point types.
Unlike the C++ implementations for iszero and issignaling, this
implementation ignores __NO_LONG_DOUBLE_MATH. The double type always
matches IEC 60559 double format, which is always canonical. Thus, when
double and long double are the same (__NO_LONG_DOUBLE_MATH), iscanonical
always returns 1 and is not implemented with __MATH_TG.
Tested for powerpc64, powerpc64le and x86_64.
[BZ #22235]
* math/math.h: Trivial fix for unbalanced parentheses in comment.
* math/Makefile [CXX] (tests): Add test-math-iscanonical.cc.
(CFLAGS-test-math-iscanonical.cc): New variable.
* math/test-math-iscanonical.cc: New file.
* sysdeps/ieee754/ldbl-96/bits/iscanonical.h (iscanonical):
Provide a C++ implementation based on function overloading,
rather than using __MATH_TG, which uses C-only builtins.
* sysdeps/ieee754/ldbl-128ibm/bits/iscanonical.h (iscanonical):
Likewise.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iscanonical.cc): New variable.
The new generic expf and exp2f code don't need wrappers any more, they
set errno inline, so only use the wrappers on targets that need it.
(If the wrapper is needed, then the top level wrapper code is included,
otherwise empty w_exp*f.c is used to suppress the wrapper.)
A powerpc64 expf implementation includes the expf c code directly which
needed some changes.
* sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper.
* sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise
* sysdeps/ieee754/flt-32/w_exp2f.c: New file.
* sysdeps/ieee754/flt-32/w_expf.c: New file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for
the new expf code.
* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file.
* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file.
* sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file.
* sysdeps/m68k/m680x0/fpu/w_expf.c: New file.
* sysdeps/i386/fpu/w_exp2f.c: New file.
* sysdeps/i386/fpu/w_expf.c: New file.
* sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file.
* sysdeps/x86_64/fpu/w_expf.c: New file.
Vectorized loops are used for sizes greater than 32B to improve
performance over power7 optimization. This shows as an average
of 25% improvement depending on the position of search
character. The performance is same for shorter strings.
A few math functions still use __fabs(f/l) rather than fabs, which
means they won't be inlined. Rename them so they are inlined.
Also add -fno-builtin-fabsl to nofpu powerpc makefile to work around
BZ #29253.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c
(__ieee754_lgamma_r): Use fabs rather than __fabs.
* sysdeps/ieee754/dbl-64/e_log10.c (__ieee754_log10): Likewise.
* sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Likewise.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c
(__ieee754_lgammaf_r): Use fabsf rather than __fabsf.
* sysdeps/ieee754/flt-32/e_log10f.c (__ieee754_log10f): Likewise.
* sysdeps/ieee754/flt-32/e_log2f.c (__ieee754_log2f): Likewise.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c
(__ieee754_lgammal_r): Use fabsl rather than __fabsl.
* sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c
(__ieee754_lgammal_r): Use fabsl rather than __fabsl.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l): Likewise.
* sysdeps/powerpc/nofpu/Makefile: Add -fno-builtin-fabsl for BZ #29253.
Simplify the C99 isgreater macros. Although some support was added
in GCC 2.97, not all targets added support until GCC 3.1. Therefore
only use the builtins in math.h from GCC 3.1 onwards, and defer to
generic macros otherwise. Improve the generic isunordered macro
to use compares rather than call fpclassify twice - this is not only
faster but also correct for signaling NaNs.
* math/math.h: Improve handling of C99 isgreater macros.
* sysdeps/alpha/fpu/bits/mathinline.h: Remove isgreater macros.
* sysdeps/m68k/m680x0/fpu/bits/mathinline.h: Likewise.
* sysdeps/powerpc/bits/mathinline.h: Likewise.
* sysdeps/sparc/fpu/bits/mathinline.h: Likewise.
* sysdeps/x86/fpu/bits/mathinline.h: Likewise.
On powerpc64le, compiler support for float128 is not enabled by default
on gcc. To enable it, the flag -mfloat128 must be passed as a command
line option to the compiler. This means that only the few files that
actively have -mfloat128 passed as an argument get compiler support for
float128, whereas all other files don't.
When -mfloat128 becomes enabled by default on powerpc [1], all the files
that do not currently have compiler support for float128 enabled during
their compilation, will start to have it. This will lead to build
errors in s_finite.c, s_isinf.c, and s_isnan.c.
The errors are due to the unintended macro expansion of __finitef128 to
__redirect_finitef128 in math/bits/mathcalls-helper-functions.h. In
that header, __MATHDECL_1 takes '__finite' and 'f128' as arguments and
concatenates them. However, since '__finite' has been redefined in
s_finite.c, the function declaration becomes __redirect_finitef128:
extern int __redirect___finitef128 (_Float128 __value) __attribute__ ((__nothrow__ )) __attribute__ ((__const__));
This declaration itself is OK. The problem arises when include/math.h
creates the hidden prototype ('hidden_proto (__finitef128)'), which
expands to:
extern __typeof (__finitef128) __finitef128 __attribute__ ((visibility ("hidden")));
Since __finitef128 is not declared, __typeof fails. This effect was
already true for the 'float' and 'long double' versions and is now true
for float128. Likewise for isinsff128 and isnanf128.
This patch defines __finitef128 as __redirect___finitef128 in
sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c, similarly to what's
done for the float and long double versions of these functions, to get
rid of the build error. Likewise for isinff128 and isnanf128.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
Tested for powerpc64 and powerpc64le.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c
(__finitef128): Define to __redirect___finitef128.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c
(__isinff128): Define to __redirect___isinff128.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c
(__isnanf128): Define to __redirect___isnanf128.
On powerpc64le, not all files can have the flag -mfloat128 passed as an
option on the compile command, since that could conflict with other
flags, such as -mno-vsx. Each file that needs the flag, gets it through
a CFLAGS-filename variable on sysdeps/powerpc/powerpc64le/Makefile.
The test cases tst-strtod-nan-locale and tst-wcstod-nan-locale are
missing this flag.
Tested for powerpc64le.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-tst-strtod-nan-locale.c): New variable.
(CFLAGS-tst-wcstod-nan-locale.c): New variable.
As per the section "3.1.4.2 Alignment Interrupts" of the "POWER8 Processor
User's Manual for the Single-Chip Module", alignment interrupt is reported
for misaligned stores in Caching-inhibited storage. As memset is used in
some drivers for DMA (like xorg), this patch avoids misaligned stores for
sizes less than 8 in memset.
Some math functions have to be distributed in libc because they're
required by printf.
libc and libm require their own builds of these functions, e.g. libc
functions have to call __stack_chk_fail_local in order to bypass the
PLT, while libm functions have to call __stack_chk_fail.
While math/Makefile treat the generic cases, i.e. s_isinff, the
multiarch Makefile has to treat its own files, i.e. s_isinff-ppc64.
[BZ #21745]
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:
[$(subdir) = math] (sysdep_calls): New variable. Has the
previous contents of sysdep_routines, but re-sorted..
[$(subdir) = math] (sysdep_routines): Re-use the contents from
sysdep_calls.
[$(subdir) = math] (libm-sysdep_routines): Remove the functions
defined in sysdep_calls and replace by the respective m_* names.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S:
(compat_symbol): Undefine to avoid duplicated compat symbols in
libc.
This patch obsoletes the pow10, pow10f and pow10l functions (makes
them into compat symbols, not available for new ports or static
linking). The exp10 names for these functions are standardized (in TS
18661-4) and were added in the same glibc version (2.1) as pow10 so
source code can change to use them without any loss of portability.
Since pow10 is deliberately not provided for _Float128, only exp10,
this slightly simplifies moving to the new wrapper templates in the
!LIBM_SVID_COMPAT case, by avoiding needing to arrange for pow10,
pow10f and pow10l to be defined by those templates.
Tested for x86_64, and with build-many-glibcs.py.
* manual/math.texi (pow10): Do not document.
(pow10f): Likewise.
(pow10l): Likewise.
* math/bits/mathcalls.h [__USE_GNU] (pow10): Do not declare.
* math/bits/math-finite.h [__USE_GNU] (pow10): Likewise.
* math/libm-test-exp10.inc (pow10_test): Remove.
(do_test): Do not call pow10.
* math/w_exp10_compat.c (pow10): Make into compat symbol.
[NO_LONG_DOUBLE] (pow10l): Likewise.
* math/w_exp10f_compat.c (pow10f): Likewise.
* math/w_exp10l_compat.c (pow10l): Likewise.
* sysdeps/ia64/fpu/e_exp10.S: Include <shlib-compat.h>.
(pow10): Make into compat symbol.
* sysdeps/ia64/fpu/e_exp10f.S: Include <shlib-compat.h>.
(pow10f): Make into compat symbol.
* sysdeps/ia64/fpu/e_exp10l.S: Include <shlib-compat.h>.
(pow10l): Make into compat symbol.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove
pow10.
(CFLAGS-nldbl-pow10.c): Remove variable..
* sysdeps/ieee754/ldbl-opt/nldbl-pow10.c: Remove file.
* sysdeps/ieee754/ldbl-opt/w_exp10_compat.c (pow10l): Condition on
[SHLIB_COMPAT (libm, GLIBC_2_1, GLIBC_2_27)].
* sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (compat_symbol):
Undefine and redefine.
(pow10l): Make into compat symbol.
* sysdeps/aarch64/libm-test-ulps: Remove pow10 ulps.
* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
* sysdeps/arm/libm-test-ulps: Likewise.
* sysdeps/hppa/fpu/libm-test-ulps: Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Likewise.
* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
* sysdeps/microblaze/libm-test-ulps: Likewise.
* sysdeps/mips/mips32/libm-test-ulps: Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Likewise.
* sysdeps/nios2/libm-test-ulps: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
* sysdeps/s390/fpu/libm-test-ulps: Likewise.
* sysdeps/sh/libm-test-ulps: Likewise.
* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
* sysdeps/tile/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
When signaling nans are enabled (with -fsignaling-nans), the C++ version
of iszero uses the fpclassify macro, which is defined with __MATH_TG.
However, when support for float128 is available, __MATH_TG uses the
builtin __builtin_types_compatible_p, which is only available in C mode.
This patch refactors the C++ version of iszero so that it uses function
overloading to select between the floating-point types, instead of
relying on fpclassify and __MATH_TG.
Tested for powerpc64le, s390x, x86_64, and with build-many-glibcs.py.
[BZ #21930]
* math/math.h [defined __cplusplus && defined __SUPPORT_SNAN__]
(iszero): New C++ implementation that does not use
fpclassify/__MATH_TG/__builtin_types_compatible_p, when
signaling nans are enabled, since __builtin_types_compatible_p
is a C-only feature.
* math/test-math-iszero.cc: When __HAVE_DISTINCT_FLOAT128 is
defined, include ieee754_float128.h for access to the union and
member ieee854_float128.ieee.
[__HAVE_DISTINCT_FLOAT128] (do_test): Call check_float128.
[__HAVE_DISTINCT_FLOAT128] (check_float128): New function.
* sysdeps/powerpc/powerpc64le/Makefile [subdir == math]
(CXXFLAGS-test-math-iszero.cc): Add -mfloat128 to the build
options of test-math-zero on powerpc64le.
This patch removes the powerpc32-specific wrappers for sqrt and sqrtf.
These wrappers, by adding architecture-specific uses of _LIB_VERSION
and __kernel_standard, unnecessarily complicate cleanups of libm error
handling. They also do not serve a useful optimization purpose. GCC
knows about sqrt as a built-in function, and can generate direct calls
to a hardware square root instruction, either on its own, in the
-fno-math-errno case, or together with an inline check for the
argument being negative and a call to the out-of-line sqrt function
for error handling only in that case (and has been able to do so for a
long time). Thus in practice the wrapper will only be called only in
the case of negative arguments, which is not a case it is useful to
optimize for.
Tested with build-many-glibcs.py for powerpc-linux-gnu-power4.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/w_sqrt_compat-power5.S:
Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/w_sqrt_compat-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/w_sqrt_compat.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/w_sqrtf_compat-power5.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/w_sqrtf_compat-ppc32.S:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/w_sqrtf_compat.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(libm-sysdep-routines): Remove w_sqrt_compat-power5,
w_sqrt_compat-ppc32, w_sqrtf_compat-power5 and
w_sqrtf_compat-ppc32.
During the development of float128 on powerpc64le, the ulps have been
generated since early versions of the patches. On those versions, the
functions gamma and pow10 were mistakenly thought to be part of the API.
After review, the functions were removed, however the ulps for them were
carried in the ulps file.
This patch removes such entries from the ulps file, as well as it
shrinks the ulps for the cpow and lgamma functions.
Tested for powerpc64le.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
The macro __MATH_TG contains the logic to select between long double and
_Float128, when these types are ABI-distinct. This logic relies on
__builtin_types_compatible_p, which is not available in C++ mode.
On the other hand, C++ function overloading provides the means to
distinguish between the floating-point types. The overloading
resolution will match the correct parameter regardless of type
qualifiers, i.e.: const and volatile.
Tested for powerpc64le, s390x, and x86_64.
* math/math.h [defined __cplusplus] (issignaling): Provide a C++
definition for issignaling that does not rely on __MATH_TG,
since __MATH_TG uses __builtin_types_compatible_p, which is only
available in C mode.
(CFLAGS-test-math-issignaling.cc): New variable.
* math/Makefile [CXX] (tests): Add test-math-issignaling.
* math/test-math-issignaling.cc: New test for C++ implementation
of type-generic issignaling.
* sysdeps/powerpc/powerpc64le/Makefile [subdir == math]
(CXXFLAGS-test-math-issignaling.cc): Add -mfloat128 to the build
options of test-math-issignaling on powerpc64le.
This patch obsoletes support for SVID libm error handling (the system
where a user-defined function matherr is called on a libm function
error; only enabled if you also set _LIB_VERSION = _SVID_ or
_LIB_VERSION = _XOPEN_) and the use of the _LIB_VERSION global
variable to control libm error handling. matherr and _LIB_VERSION are
made into compat symbols, not supported for new ports or for static
linking. The libieee.a object file (which sets _LIB_VERSION = _IEEE_,
so disabling errno setting for some functions) is also removed, and
all the related definitions are removed from math.h.
The manual already recommends against using matherr, and it's already
not supported for _Float128 functions (those use new wrappers that
don't support matherr, only errno) - this patch means that it becomes
possible to e.g. add sinf32 as an alias to sinf without that resulting
in undesired matherr support in sinf32 for existing glibc ports.
matherr support is not part of any standard supported by glibc (it was
removed in XPG4).
Because matherr is a function to be defined by the user, of course
user programs defining such a function will still continue to link; it
just quietly won't be used. If they try to write to the library's
copy of _LIB_VERSION to enable SVID error handling, however, they will
get a link error (but if they define their own _LIB_VERSION variable,
they won't).
I expect the most likely case of build failures from this patch to be
programs with unconditional cargo-culted uses of -lieee (based on a
notion of "I want IEEE floating point", not any actual requirement for
that library).
Ideally, the new-port-or-static-linking case would use the new
wrappers used for _Float128. This is not implemented in this patch,
because of the complication of architecture-specific (powerpc32 and
sparc) sqrt wrappers that use _LIB_VERSION and __kernel_standard
directly. Thus, the old wrappers and __kernel_standard are still
built unconditionally, and _LIB_VERSION still exists in static libm.
But when the old wrappers and __kernel_standard are built in the
non-compat case, _LIB_VERSION and matherr are defined as macros so
code to support those features isn't actually built into static libm
or new ports' shared libm after this patch.
I intend to move to the new wrappers for static libm and new ports in
followup patches. I believe the sqrt wrappers for powerpc32 and sparc
can reasonably be removed. GCC already optimizes the normal case of
sqrt by generating code that uses a hardware instruction and only
calls the sqrt function if the argument was negative (if
-fno-math-errno, of course, it just uses the hardware instruction
without any check for negative argument being needed). Thus those
wrappers will only actually get called in the case of negative
arguments, which is not a case it makes sense to optimize for. But
even without removing the powerpc32 and sparc wrappers it should still
be possible to move to the new wrappers for static libm and new ports,
just without having those dubious architecture-specific optimizations
in static libm.
Everything said about matherr equally applies to matherrf and matherrl
(IA64-specific, undocumented), except that the structure of IA64 libm
means it won't be converted to using the new wrappers (it doesn't use
the old ones either, but its own error-handling code instead).
As with other tests of compat symbols, I expect test-matherr and
test-matherr-2 to need to become appropriately conditional once we
have a system for disabling such tests for ports too new to have the
relevant symbols.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* math/math.h [__USE_MISC] (_LIB_VERSION_TYPE): Remove.
[__USE_MISC] (_LIB_VERSION): Likewise.
[__USE_MISC] (struct exception): Likewise.
[__USE_MISC] (matherr): Likewise.
[__USE_MISC] (DOMAIN): Likewise.
[__USE_MISC] (SING): Likewise.
[__USE_MISC] (OVERFLOW): Likewise.
[__USE_MISC] (UNDERFLOW): Likewise.
[__USE_MISC] (TLOSS): Likewise.
[__USE_MISC] (PLOSS): Likewise.
[__USE_MISC] (HUGE): Likewise.
[__USE_XOPEN] (MAXFLOAT): Define even if [__USE_MISC].
* math/math-svid-compat.h: New file.
* conform/linknamespace.pl (@whitelist): Remove matherr, matherrf
and matherrl.
* include/math.h [!_ISOMAC] (__matherr): Remove.
* manual/arith.texi (FP Exceptions): Do not document matherr.
* math/Makefile (tests): Change test-matherr to test-matherr-3.
(tests-internal): New variable.
(install-lib): Do not add libieee.a.
(non-lib.a): Likewise.
(extra-objs): Do not add libieee.a and ieee-math.o.
(CPPFLAGS-s_lib_version.c): Remove variable.
($(objpfx)libieee.a): Remove rule.
($(addprefix $(objpfx), $(tests-internal)): Depend on $(libm).
* math/ieee-math.c: Remove.
* math/libm-test-support.c (matherr): Remove.
* math/test-matherr.c: Use <support/test-driver.c>. Add copyright
and license notices. Include <math-svid-compat.h> and
<shlib-compat.h>.
(matherr): Undefine as macro. Use compat_symbol_reference.
(_LIB_VERSION): Likewise.
* math/test-matherr-2.c: New file.
* math/test-matherr-3.c: Likewise.
* sysdeps/generic/math_private.h (__kernel_standard): Remove
declaration.
(__kernel_standard_f): Likewise.
(__kernel_standard_l): Likewise.
* sysdeps/ieee754/s_lib_version.c: Do not include <math.h> or
<math_private.h>. Include <math-svid-compat.h>.
(_LIB_VERSION): Undefine as macro.
(_LIB_VERSION_INTERNAL): Always initialize to _POSIX_. Define
only if [LIBM_SVID_COMPAT || !defined SHARED]. If
[LIBM_SVID_COMPAT], use compat_symbol.
* sysdeps/ieee754/s_matherr.c: Do not include <math.h> or
<math_private.h>. Include <math-svid-compat.h>.
(matherr): Undefine as macro.
(__matherr): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* sysdeps/ia64/fpu/libm_error.c: Include <math-svid-compat.h>.
[_LIBC && LIBM_SVID_COMPAT] (matherrf): Use
compat_symbol_reference.
[_LIBC && LIBM_SVID_COMPAT] (matherrl): Likewise.
[_LIBC && !LIBM_SVID_COMPAT] (matherrf): Define as macro.
[_LIBC && !LIBM_SVID_COMPAT] (matherrl): Likewise.
* sysdeps/ia64/fpu/libm_support.h: Include <math-svid-compat.h>.
(MATHERR_D): Remove declaration.
[!_LIBC] (_LIB_VERSION_TYPE): Likewise
[!LIBM_BUILD] (_LIB_VERSIONIMF): Likewise.
[LIBM_BUILD] (pmatherrf): Likewise.
[LIBM_BUILD] (pmatherr): Likewise.
[LIBM_BUILD] (pmatherrl): Likewise.
(DOMAIN): Likewise.
(SING): Likewise.
(OVERFLOW): Likewise.
(UNDERFLOW): Likewise.
(TLOSS): Likewise.
(PLOSS): Likewise.
* sysdeps/ia64/fpu/s_matherrf.c: Include <math-svid-compat.h>.
(__matherrf): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* sysdeps/ia64/fpu/s_matherrl.c: Include <math-svid-compat.h>.
(__matherrl): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* math/lgamma-compat.h: Include <math-svid-compat.h>.
* math/w_acos_compat.c: Likewise.
* math/w_acosf_compat.c: Likewise.
* math/w_acosh_compat.c: Likewise.
* math/w_acoshf_compat.c: Likewise.
* math/w_acoshl_compat.c: Likewise.
* math/w_acosl_compat.c: Likewise.
* math/w_asin_compat.c: Likewise.
* math/w_asinf_compat.c: Likewise.
* math/w_asinl_compat.c: Likewise.
* math/w_atan2_compat.c: Likewise.
* math/w_atan2f_compat.c: Likewise.
* math/w_atan2l_compat.c: Likewise.
* math/w_atanh_compat.c: Likewise.
* math/w_atanhf_compat.c: Likewise.
* math/w_atanhl_compat.c: Likewise.
* math/w_cosh_compat.c: Likewise.
* math/w_coshf_compat.c: Likewise.
* math/w_coshl_compat.c: Likewise.
* math/w_exp10_compat.c: Likewise.
* math/w_exp10f_compat.c: Likewise.
* math/w_exp10l_compat.c: Likewise.
* math/w_exp2_compat.c: Likewise.
* math/w_exp2f_compat.c: Likewise.
* math/w_exp2l_compat.c: Likewise.
* math/w_fmod_compat.c: Likewise.
* math/w_fmodf_compat.c: Likewise.
* math/w_fmodl_compat.c: Likewise.
* math/w_hypot_compat.c: Likewise.
* math/w_hypotf_compat.c: Likewise.
* math/w_hypotl_compat.c: Likewise.
* math/w_j0_compat.c: Likewise.
* math/w_j0f_compat.c: Likewise.
* math/w_j0l_compat.c: Likewise.
* math/w_j1_compat.c: Likewise.
* math/w_j1f_compat.c: Likewise.
* math/w_j1l_compat.c: Likewise.
* math/w_jn_compat.c: Likewise.
* math/w_jnf_compat.c: Likewise.
* math/w_jnl_compat.c: Likewise.
* math/w_lgamma_main.c: Likewise.
* math/w_lgamma_r_compat.c: Likewise.
* math/w_lgammaf_main.c: Likewise.
* math/w_lgammaf_r_compat.c: Likewise.
* math/w_lgammal_main.c: Likewise.
* math/w_lgammal_r_compat.c: Likewise.
* math/w_log10_compat.c: Likewise.
* math/w_log10f_compat.c: Likewise.
* math/w_log10l_compat.c: Likewise.
* math/w_log2_compat.c: Likewise.
* math/w_log2f_compat.c: Likewise.
* math/w_log2l_compat.c: Likewise.
* math/w_log_compat.c: Likewise.
* math/w_logf_compat.c: Likewise.
* math/w_logl_compat.c: Likewise.
* math/w_pow_compat.c: Likewise.
* math/w_powf_compat.c: Likewise.
* math/w_powl_compat.c: Likewise.
* math/w_remainder_compat.c: Likewise.
* math/w_remainderf_compat.c: Likewise.
* math/w_remainderl_compat.c: Likewise.
* math/w_scalb_compat.c: Likewise.
* math/w_scalbf_compat.c: Likewise.
* math/w_scalbl_compat.c: Likewise.
* math/w_sinh_compat.c: Likewise.
* math/w_sinhf_compat.c: Likewise.
* math/w_sinhl_compat.c: Likewise.
* math/w_sqrt_compat.c: Likewise.
* math/w_sqrtf_compat.c: Likewise.
* math/w_sqrtl_compat.c: Likewise.
* math/w_tgamma_compat.c: Likewise.
* math/w_tgammaf_compat.c: Likewise.
* math/w_tgammal_compat.c: Likewise.
* sysdeps/ieee754/dbl-64/w_exp_compat.c: Likewise.
* sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise.
* sysdeps/ieee754/k_standard.c: Likewise.
* sysdeps/ieee754/k_standardf.c: Likewise.
* sysdeps/ieee754/k_standardl.c: Likewise.
* sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise.
* sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
POWER ISA 3.0 introduces the xssqrtqp instructions, which expects
operands to be in Vector Registers (Altivec/VMX), even though this
instruction belongs to the Vector-Scalar Instruction Set.
In GCC's Extended Assembly for POWER, the 'wq' register constraint is
provided for use with IEEE 754 128-bit floating-point values. However,
this constraint does not limit the register allocation to Vector
Registers (Altivec/VMX) and could assign a Vector-Scalar Register (VSX)
to the operands of the instruction.
This patch changes the register constraint used in sqrtf128 from 'wq' to
'v', in order to request a Vector Register (Altivec/VMX) for use with
the xssqrtqp instruction.
Tested for powerpc64le and --with-cpu=power9.
[BZ #21941]
* sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrtf128): Since
xssqrtqp requires operands to be in Vector Registers
(Altivec/VMX), replace the register constraint 'wq' with 'v'.
* sysdeps/powerpc/powerpc64le/power9/fpu/e_sqrtf128.c
(__ieee754_sqrtf128): Likewise.
This makes the __tls_get_addr_opt test run as a shared library, and so
actually test that DTPMOD64/DTPREL64 pairs are processed by ld.so to
support the __tls_get_adfr_opt call stub fast return. After a
2017-01-24 patch (binutils f0158f4416) ld.bfd no longer emitted
unnecessary dynamic relocations against local thread variables,
instead setting up the __tls_index GOT entries for the call stub fast
return. This meant tst-tlsopt-powerpc passed but did not check ld.so
relocation support. After a 2017-07-16 patch (binutils 676ee2b5fa)
ld.bfd no longer set up the __tls_index GOT entries for the call stub
fast return, and tst-tlsopt-powerpc failed.
Compiling mod-tlsopt-powerpc.c with -DSHARED exposed a bug in
powerpc64/tls-macros.h, which defines a __TLS_GET_ADDR macro that
clashes with one defined in dl-tls.h. The tls-macros.h version is
only used in that file, so delete it and expand.
* sysdeps/powerpc/mod-tlsopt-powerpc.c: Extract from
tst-tlsopt-powerpc.c with function name change and no test harness.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: Remove body of test.
Call tls_get_addr_opt_test.
* sysdeps/powerpc/Makefile (LDFLAGS-tst-tlsopt-powerpc): Don't define.
(modules-names): Add mod-tlsopt-powerpc.
(mod-tlsopt-powerpc.so-no-z-defs): Define.
(tst-tlsopt-powerpc): Depend on .so.
* sysdeps/powerpc/powerpc64/tls-macros.h (__TLS_GET_ADDR): Don't
define. Expand use in TLS_GD and TLS_LD.
The patch proposed by Peter Bergner [1] to libgcc in order to fix
[BZ #21707] adds a dependency on a symbol provided by the loader,
forcing the loader to be linked to tests after libgcc was linked.
It also requires to read the thread pointer during IRELA relocations.
Tested on powerpc, powerpc64, powerpc64le, s390x and x86_64.
[1] https://sourceware.org/ml/libc-alpha/2017-06/msg01383.html
[BZ #21707]
* csu/libc-start.c (LIBC_START_MAIN): Perform IREL{,A}
relocations before or after initializing the TCB on statically
linked executables. That's a per-architecture definition.
* elf/rtld.c (dl_main): Add a comment about thread-local
variables initialization.
* sysdeps/generic/libc-start.h: New file. Define
ARCH_APPLY_IREL and ARCH_SETUP_IREL.
* sysdeps/powerpc/Makefile:
[$(subdir) = elf && $(multi-arch) != no] (tests-static-internal): Add tst-tlsifunc-static.
[$(subdir) = elf && $(multi-arch) != no && $(build-shared) == yes]
(tests-internal): Add tst-tlsifunc.
* sysdeps/powerpc/tst-tlsifunc.c: New file.
* sysdeps/powerpc/tst-tlsifunc-static.c: Likewise.
* sysdeps/powerpc/powerpc64le/Makefile (f128-loader-link): New
variable.
[$(subdir) = math] (test-float128% test-ifloat128%): Force
linking to the loader after linking to libgcc.
[$(subdir) = wcsmbs || $(subdir) = stdlib] (bug-strtod bug-strtod2)
(bug-strtod2 tst-strtod-round tst-wcstod-round tst-strtod6 tst-strrom)
(tst-strfrom-locale strfrom-skeleton): Likewise.
* sysdeps/unix/sysv/linux/powerpc/libc-start.h: New file. Define
ARCH_APPLY_IREL and ARCH_SETUP_IREL.
On powerpc64le, the compilation of the files related to float128 support
requires the option -mfloat128 to be passed to gcc. However, not all
possible object suffixes were covered in the Makefile. This patch uses
$(all-object-suffixes) in all remaining rules.
Tested for powerpc64le.
* sysdeps/powerpc/powerpc64le/Makefile: Use $(all-object-suffixes)
to iterate over all possible object suffixes. Add a comment
explaining the use of sysdep-CFLAGS instead of CFLAGS.
In math/math.h, __MATH_TG will expand signbit to __builtin_signbit*,
e.g.: __builtin_signbitf128, before GCC 6. However, there has never
been a __builtin_signbitf128 in GCC and the type-generic builtin is
only available since GCC 6. For older GCC, this patch defines
__builtin_signbitf128 to __signbitf128, so that the internal function
is used instead of the non-existent builtin.
This patch also changes the implementation of __signbitf128, because
it was reusing the implementation of __signbitl from ldbl-128, which
calls __builtin_signbitl. Using the long double version of the
builtin is not correct on machines where _Float128 is ABI-distinct
from long double (i.e.: ia64, powerpc64le, x86, x86_84). The new
implementation does not rely on builtins when being built with GCC
versions older than 6.0.
The new code does not currently affect powerpc64le builds, because
only GCC 6.2 fulfills the requirements from configure. It might
affect powerpc64le builds if those requirements are backported to
older versions of the compiler. The new code affects x86_64 builds,
since glibc is supposed to build correctly with older versions of GCC.
Tested for powerpc64le and x86_64.
* include/math.h (__signbitf128): Define as hidden.
* sysdeps/ieee754/float128/s_signbitf128.c (__signbitf128):
Reimplement without builtins.
* sysdeps/ia64/bits/floatn.h [!__GNUC_PREREQ (6, 0)]
(__builtin_signbitf128): Define to __signbitf128.
* sysdeps/powerpc/bits/floatn.h: Likewise.
* sysdeps/x86/bits/floatn.h: Likewise.
The ucontext_t type has a tag struct ucontext. As with previous such
issues for siginfo_t and stack_t, this tag is not permitted by POSIX
(is not in a reserved namespace), and so namespace conformance means
breaking C++ name mangling for this type.
In this case, the type does need to have some tag rather than just a
typedef name, because it includes a pointer to itself. This patch
uses struct ucontext_t as the new tag, so the type is mangled as
ucontext_t (the POSIX *_t reservation applies in all namespaces, not
just the namespace of ordinary identifiers). Another reserved name
such as struct __ucontext could of course be used.
Because of other namespace issues, this patch does not by itself fix
bug 21457 or allow any XFAILs to be removed.
Tested for x86_64, and with build-many-glibcs.py.
[BZ #21457]
* sysdeps/arm/sys/ucontext.h (struct ucontext): Rename to struct
ucontext_t.
* sysdeps/generic/sys/ucontext.h (struct ucontext): Likewise.
* sysdeps/i386/sys/ucontext.h (struct ucontext): Likewise.
* sysdeps/m68k/sys/ucontext.h (struct ucontext): Likewise.
* sysdeps/mips/sys/ucontext.h (struct ucontext): Likewise.
* sysdeps/unix/sysv/linux/aarch64/sys/ucontext.h (struct
ucontext): Likewise.
* sysdeps/unix/sysv/linux/alpha/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/arm/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/hppa/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/ia64/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/mips/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (struct
ucontext): Likewise.
* sysdeps/unix/sysv/linux/s390/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/sh/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/sparc/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/tile/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/unix/sysv/linux/x86/sys/ucontext.h (struct ucontext):
Likewise.
* sysdeps/powerpc/powerpc32/backtrace.c (struct
rt_signal_frame_32): Likewise.
* sysdeps/powerpc/powerpc64/backtrace.c (struct signal_frame_64):
Likewise.
* sysdeps/unix/sysv/linux/aarch64/kernel_rt_sigframe.h (struct
kernel_rt_sigframe): Likewise.
* sysdeps/unix/sysv/linux/aarch64/sigcontextinfo.h (SIGCONTEXT):
Likewise.
* sysdeps/unix/sysv/linux/arm/register-dump.h (register_dump):
Likewise.
* sysdeps/unix/sysv/linux/arm/sigcontextinfo.h (SIGCONTEXT):
Likewise.
* sysdeps/unix/sysv/linux/hppa/profil-counter.h
(__profil_counter): Likewise.
* sysdeps/unix/sysv/linux/microblaze/sigcontextinfo.h
(SIGCONTEXT): Likewise.
* sysdeps/unix/sysv/linux/mips/kernel_rt_sigframe.h (struct
kernel_rt_sigframe): Likewise.
* sysdeps/unix/sysv/linux/nios2/kernel_rt_sigframe.h (struct
kernel_rt_sigframe): Likewise.
* sysdeps/unix/sysv/linux/nios2/sigcontextinfo.h (SIGCONTEXT):
Likewise.
* sysdeps/unix/sysv/linux/sh/makecontext.S (__makecontext):
Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/makecontext.c
(__start_context): Likewise.
* sysdeps/unix/sysv/linux/tile/sigcontextinfo.h (SIGCONTEXT):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/register-dump.h (register_dump):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/sigcontextinfo.h (SIGCONTEXT):
Likewise.
This patch adds ULPs for the float128 type, updates the abilist for libc
and libm, and adds the files bits/floatn.h and float128-abi.h, in order to
enable the new type for powerpc64le.
This patch also adds the implementation of sqrtf128 for powerpc64le, since
it is not implemented in libgcc. The sfp-machine.h header is taken from
libgcc.
Tested for powerpc64le (GCC 6.2 and GCC 7.1), powerpc64 and s390x.
* manual/math.texi (Mathematics): Mention the enabling of float128
for powerpc64le.
* sysdeps/powerpc/bits/floatn.h: New file.
* sysdeps/powerpc/fpu/libm-test-ulps: Regenerated.
* sysdeps/powerpc/fpu/math_private.h:
(__ieee754_sqrtf128): New inline override.
* sysdeps/powerpc/powerpc64le/Implies-before: New file.
* sysdeps/powerpc/powerpc64le/Makefile: New file.
* sysdeps/powerpc/powerpc64le/fpu/e_sqrtf128.c: New file.
* sysdeps/powerpc/powerpc64le/fpu/sfp-machine.h: New file.
* sysdeps/powerpc/powerpc64le/power9/fpu/e_sqrtf128.c: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist:
Updated.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64le/float128-abi.h:
New file.
Support for powerpc64le requires POWER8 or newer processors. Builds for
older processors are not tested. Require at least POWER8 to avoid
unintentional builds.
* sysdeps/powerpc/powerpc64le/configure.ac: Check for POWER8.
* sysdeps/powerpc/powerpc64le/configure: Update.
On powerpc64le, support for float128 will be enabled, which requires some
compiler features to be present. This patch adds a configure test to check
for such features, which are provided for powerpc64le since GCC 6.2.
Tested for powerpc64 and powerpc64le.
* INSTALL: Regenerate.
* manual/install.texi (Recommended Tools for Compilation): Mention
the powerpc64le-specific requirement in the manual.
* sysdeps/powerpc/powerpc64le/configure.ac: New file with checks
for the compiler features required for building float128.
* sysdeps/powerpc/powerpc64le/configure: New, auto-generated file.
sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-ppc64.c should fall back to
sysdeps/powerpc/fpu/s_sinf.c not to sysdeps/ieee754/flt-32/s_sinf.c.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-ppc64.c: Change
s_sinf.c from sysdeps/ieee754/flt-32/ to sysdeps/powerpc/fpu/.
Linux commit ID a4700a26107241cc7b9ac8528b2c6714ff99983d reserved 2 more
bits for the instructions darn (Deliver a Random Number) and scv (System
Call Vectored).
Linux commit ID 6997e57d693b07289694239e52a10d2f02c3a46f reserved
another bit for internal usage.
* sysdeps/powerpc/bits/hwcap.h: Add PPC_FEATURE2_DARN and
PPC_FEATURE2_SCV.
* sysdeps/powerpc/dl-procinfo.c (_dl_powerpc_cap_flags): Add scv
and darn.
<locale.h> is specified to define locale_t in POSIX.1-2008, and so are
all of the headers that define functions that take locale_t arguments.
Under _GNU_SOURCE, the additional headers that define such functions
have also always defined locale_t. Therefore, there is no need to use
__locale_t in public function prototypes, nor in any internal code.
* ctype/ctype-c99_l.c, ctype/ctype.h, ctype/ctype_l.c
* include/monetary.h, include/stdlib.h, include/time.h
* include/wchar.h, locale/duplocale.c, locale/freelocale.c
* locale/global-locale.c, locale/langinfo.h, locale/locale.h
* locale/localeinfo.h, locale/newlocale.c
* locale/nl_langinfo_l.c, locale/uselocale.c
* localedata/bug-usesetlocale.c, localedata/tst-xlocale2.c
* stdio-common/vfscanf.c, stdlib/monetary.h, stdlib/stdlib.h
* stdlib/strfmon_l.c, stdlib/strtod_l.c, stdlib/strtof_l.c
* stdlib/strtol.c, stdlib/strtol_l.c, stdlib/strtold_l.c
* stdlib/strtoll_l.c, stdlib/strtoul_l.c, stdlib/strtoull_l.c
* string/strcasecmp.c, string/strcoll_l.c, string/string.h
* string/strings.h, string/strncase.c, string/strxfrm_l.c
* sysdeps/ieee754/float128/strtof128_l.c
* sysdeps/ieee754/float128/wcstof128.c
* sysdeps/ieee754/float128/wcstof128_l.c
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c
* sysdeps/ieee754/ldbl-64-128/strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
* sysdeps/ieee754/ldbl-opt/nldbl-strfmon_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-wcstold_l.c
* sysdeps/powerpc/powerpc32/power7/strcasecmp.S
* sysdeps/powerpc/powerpc64/power7/strcasecmp.S
* sysdeps/x86_64/strcasecmp_l-nonascii.c
* sysdeps/x86_64/strncase_l-nonascii.c, time/strftime_l.c
* time/strptime_l.c, time/time.h, wcsmbs/mbsrtowcs_l.c
* wcsmbs/wchar.h, wcsmbs/wcscasecmp.c, wcsmbs/wcsncase.c
* wcsmbs/wcstod.c, wcsmbs/wcstod_l.c, wcsmbs/wcstof.c
* wcsmbs/wcstof_l.c, wcsmbs/wcstol_l.c, wcsmbs/wcstold.c
* wcsmbs/wcstold_l.c, wcsmbs/wcstoll_l.c, wcsmbs/wcstoul_l.c
* wcsmbs/wcstoull_l.c, wctype/iswctype_l.c
* wctype/towctrans_l.c, wctype/wcfuncs_l.c
* wctype/wctrans_l.c, wctype/wctype.h, wctype/wctype_l.c:
Change all uses of __locale_t to locale_t.
These machine-dependent inline string functions have never been on by
default, and even if they were a good idea at the time they were
introduced, they haven't really been touched in ten to fifteen years
and probably aren't a good idea on current-gen processors. Current
thinking is that this class of optimization is best left to the
compiler.
* bits/string.h, string/bits/string.h
* sysdeps/aarch64/bits/string.h
* sysdeps/m68k/m680x0/m68020/bits/string.h
* sysdeps/s390/bits/string.h, sysdeps/sparc/bits/string.h
* sysdeps/x86/bits/string.h: Delete file.
* string/string.h: Don't include bits/string.h.
* string/bits/string3.h: Rename to bits/string_fortified.h.
No need to undef various symbols that the removed headers
might have defined as macros.
* string/Makefile (headers): Remove bits/string.h, change
bits/string3.h to bits/string_fortified.h.
* string/string-inlines.c: Update commentary. Remove definitions
of various macros that nothing looks at anymore. Don't directly
include bits/string.h. Set _STRING_INLINE_unaligned here, based on
compiler-predefined macros.
* string/strncat.c: If STRNCAT is not defined, or STRNCAT_PRIMARY
_is_ defined, provide internal hidden alias __strncat.
* include/string.h: Declare internal hidden alias __strncat.
Only forward __stpcpy to __builtin_stpcpy if __NO_STRING_INLINES is
not defined.
* include/bits/string3.h: Rename to bits/string_fortified.h,
update to match above.
* sysdeps/i386/string-inlines.c: Define compat symbols for
everything formerly defined by sysdeps/x86/bits/string.h.
Make existing definitions into compat symbols as well.
Remove some no-longer-necessary messing around with macros.
* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
* sysdeps/s390/multiarch/mempcpy.c
No need to define _HAVE_STRING_ARCH_mempcpy.
Do define __NO_STRING_INLINES and NO_MEMPCPY_STPCPY_REDIRECT.
* sysdeps/i386/i686/multiarch/strncat-c.c
* sysdeps/s390/multiarch/strncat-c.c
* sysdeps/x86_64/multiarch/strncat-c.c
Define STRNCAT_PRIMARY. Don't change definition of libc_hidden_def.
ELFv2 functions with localentry:0 are those with a single entry point,
ie. global entry == local entry, that have no requirement on r2 or
r12 and guarantee r2 is unchanged on return. Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions.
This patch implements the ld.so changes necessary for this
optimization. ld.so needs to check that an optimized plt call
sequence is in fact calling a function implemented with localentry:0,
end emit a fatal error otherwise.
The elf/testobj6.c change is to stop "error while loading shared
libraries: expected localentry:0 `preload'" when running
elf/preloadtest, which we'd get otherwise.
* elf/elf.h (PPC64_OPT_LOCALENTRY): Define.
* sysdeps/alpha/dl-machine.h (elf_machine_fixup_plt): Add
refsym and sym parameters. Adjust callers.
* sysdeps/aarch64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/arm/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/generic/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/hppa/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/i386/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/ia64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/m68k/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/microblaze/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/mips/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/nios2/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_fixup_plt):
Likewise.
* sysdeps/s390/s390-32/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/s390/s390-64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/sh/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/tile/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/powerpc/powerpc64/dl-machine.c (_dl_error_localentry): New.
(_dl_reloc_overflow): Increase buffser size. Formatting.
* sysdeps/powerpc/powerpc64/dl-machine.h (ppc64_local_entry_offset):
Delete reloc param, add refsym and sym. Check optimized plt
call stubs for localentry:0 functions. Adjust callers.
(elf_machine_fixup_plt, elf_machine_plt_conflict): Add refsym
and sym parameters. Adjust callers.
(_dl_reloc_overflow): Move attribute.
(_dl_error_localentry): Declare.
* elf/dl-runtime.c (_dl_fixup): Save original sym. Pass
refsym and sym to elf_machine_fixup_plt.
* elf/testobj6.c (preload): Call printf.
Makes __stpncpy_power8 call __memset_power8 directly rather than via an
IFUNC. Fixes a missing _mcount, and removes some redundant NOPS. The
*_is_local defines are also used in a followup patch.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Define
MEMSET_is_local.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
Define MEMSET.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define
STRLEN_is_local, STRNLEN_is_local, and STRCHR_is_local.
* sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise. Don't add
nop after local calls.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Define MEMSET_is_local.
Don't add nop after local call.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise. Add missing
CALL_MCOUNT.
.align on some targets takes a byte alignment, on others like powerpc,
log2 of the byte alignment. It's a good idea to avoid .align,
particularly since x86 and powerpc are different. This patch fixes
the occurrences of .align in powerpc64/sysdep.h, renames DOT_LABEL
since the macro doesn't have anything to do with adding dots, removes
extraneous semicolons, and fixes some formatting.
* sysdeps/powerpc/powerpc64/sysdep.h: Formatting.
(FUNC_LABEL): Rename from DOT_LABEL.
(ENTRY_1): Use FUNC_LABEL and remove leading space from label.
Use .p2align rather than .align.
(TRACEBACK, TRACEBACK_MASK): Use .p2align rather than .align.
(ABORT_TRANSACTION): Likewise.
(ENTRY_1, ENTRY_2, END_2, LOCALENTRY): Remove unnecessary semicolons,
particularly at end. Add semicolon at invocation as necessary.
(TRACEBACK, TRACEBACK_MASK, PSEUDO, PSEUDO_NOERRNO): Likewise.
(PSEUDO_ERRVAL, PPC64_LOAD_FUNCPTR, OPD_ENT): Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S (ENTRY,
END): Adjust to suit.
I think FRAME_PARM[1-9]_SAVE confuse the code, particularly
FRAME_PARM9_SAVE. There are only 8 parameter save slots!
* sysdeps/powerpc/powerpc64/sysdep.h: (FRAME_BACKCHAIN,
FRAME_CR_SAVE, FRAME_LR_SAVE): Move out of conditional.
(FRAME_PARM1_SAVE, FRAME_PARM2_SAVE, FRAME_PARM3_SAVE,
FRAME_PARM4_SAVE, FRAME_PARM5_SAVE, FRAME_PARM6_SAVE,
FRAME_PARM7_SAVE, FRAME_PARM8_SAVE, FRAME_PARM9_SAVE): Delete.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/makecontext.S: Replace
uses of FRAME_PARM[1-9]_SAVE with FRAME_PARM_SAVE plus offset.
The macros used in assembly were broken on powerpc64 ELFv1.
* sysdeps/powerpc/powerpc64/sysdep.h: (call_mcount_parm_offset): Delete.
(SAVE_ARG, REST_ARG, CFI_SAVE_ARG): Correct.
This patch optimizes the generic spinlock code.
The type pthread_spinlock_t is a typedef to volatile int on all archs.
Passing a volatile pointer to the atomic macros which are not mapped to the
C11 atomic builtins can lead to extra stores and loads to stack if such
a macro creates a temporary variable by using "__typeof (*(mem)) tmp;".
Thus, those macros which are used by spinlock code - atomic_exchange_acquire,
atomic_load_relaxed, atomic_compare_exchange_weak - have to be adjusted.
According to the comment from Szabolcs Nagy, the type of a cast expression is
unqualified (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_423.htm):
__typeof ((__typeof (*(mem)) *(mem)) tmp;
Thus from spinlock perspective the variable tmp is of type int instead of
type volatile int. This patch adjusts those macros in include/atomic.h.
With this construct GCC >= 5 omits the extra stores and loads.
The atomic macros are replaced by the C11 like atomic macros and thus
the code is aligned to it. The pthread_spin_unlock implementation is now
using release memory order instead of sequentially consistent memory order.
The issue with passed volatile int pointers applies to the C11 like atomic
macros as well as the ones used before.
I've added a glibc_likely hint to the first atomic exchange in
pthread_spin_lock in order to return immediately to the caller if the lock is
free. Without the hint, there is an additional jump if the lock is free.
I've added the atomic_spin_nop macro within the loop of plain reads.
The plain reads are also realized by C11 like atomic_load_relaxed macro.
The new define ATOMIC_EXCHANGE_USES_CAS determines if the first try to acquire
the spinlock in pthread_spin_lock or pthread_spin_trylock is an exchange
or a CAS. This is defined in atomic-machine.h for all architectures.
The define SPIN_LOCK_READS_BETWEEN_CMPXCHG is now removed.
There is no technical reason for throwing in a CAS every now and then,
and so far we have no evidence that it can improve performance.
If that would be the case, we have to adjust other spin-waiting loops
elsewhere, too! Using a CAS loop without plain reads is not a good idea
on many targets and wasn't used by one. Thus there is now no option to
do so.
Architectures are now using the generic spinlock automatically if they
do not provide an own implementation. Thus the pthread_spin_lock.c files
in sysdeps folder are deleted.
ChangeLog:
* NEWS: Mention new spinlock implementation.
* include/atomic.h:
(__atomic_val_bysize): Cast type to omit volatile qualifier.
(atomic_exchange_acq): Likewise.
(atomic_load_relaxed): Likewise.
(ATOMIC_EXCHANGE_USES_CAS): Check definition.
* nptl/pthread_spin_init.c (pthread_spin_init):
Use atomic_store_relaxed.
* nptl/pthread_spin_lock.c (pthread_spin_lock):
Use C11-like atomic macros.
* nptl/pthread_spin_trylock.c (pthread_spin_trylock):
Likewise.
* nptl/pthread_spin_unlock.c (pthread_spin_unlock):
Use atomic_store_release.
* sysdeps/aarch64/nptl/pthread_spin_lock.c: Delete File.
* sysdeps/arm/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/hppa/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/m68k/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/microblaze/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/mips/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/nios2/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/aarch64/atomic-machine.h (ATOMIC_EXCHANGE_USES_CAS): Define.
* sysdeps/alpha/atomic-machine.h: Likewise.
* sysdeps/arm/atomic-machine.h: Likewise.
* sysdeps/i386/atomic-machine.h: Likewise.
* sysdeps/ia64/atomic-machine.h: Likewise.
* sysdeps/m68k/coldfire/atomic-machine.h: Likewise.
* sysdeps/m68k/m680x0/m68020/atomic-machine.h: Likewise.
* sysdeps/microblaze/atomic-machine.h: Likewise.
* sysdeps/mips/atomic-machine.h: Likewise.
* sysdeps/powerpc/powerpc32/atomic-machine.h: Likewise.
* sysdeps/powerpc/powerpc64/atomic-machine.h: Likewise.
* sysdeps/s390/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc32/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc64/atomic-machine.h: Likewise.
* sysdeps/tile/tilegx/atomic-machine.h: Likewise.
* sysdeps/tile/tilepro/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/sh/atomic-machine.h: Likewise.
* sysdeps/x86_64/atomic-machine.h: Likewise.
The types affected are __sig_atomic_t, sig_atomic_t, __sigset_t,
sigset_t, sigval_t, sigevent_t, and siginfo_t. __sig_atomic_t is a
scalar, so it's now directly available from bits/types.h. The others
get bits/types/ headers.
Side effects include: There have been small changes to which
non-signal headers expose which subset of the signal-related types.
A couple of architectures' nested siginfo_t fields had to be renamed
to prevent undesired macro expansion. Internal code that wants to
manipulate signal masks must now include <sigsetops.h> (which is not
installed) and should be aware that __sigaddset, __sigandset,
__sigdelset, __sigemptyset, and __sigorset no longer return a value
(unlike the public API). Relatedly, the public signal.h no longer
declares any of those functions. The obsolete sigmask() macro no
longer has a system-specific definition -- in the cases where it
matters, it didn't work anyway.
New Linux architectures should create bits/siginfo-arch.h and/or
bits/siginfo-consts-arch.h to customize their siginfo_t, rather than
duplicating everything in bits/siginfo.h (which no longer exists).
Add new __SI_* macros if necessary. Ports to other operating systems
are strongly encouraged to generalize this scheme further.
* bits/sigevent-consts.h
* bits/siginfo-consts.h
* bits/types/__sigset_t.h
* bits/types/sigevent_t.h
* bits/types/siginfo_t.h
* sysdeps/unix/sysv/linux/bits/sigevent-consts.h
* sysdeps/unix/sysv/linux/bits/siginfo-consts.h
* sysdeps/unix/sysv/linux/bits/types/__sigset_t.h
* sysdeps/unix/sysv/linux/bits/types/sigevent_t.h
* sysdeps/unix/sysv/linux/bits/types/siginfo_t.h:
New system-dependent bits headers.
* sysdeps/unix/sysv/linux/bits/siginfo-arch.h
* sysdeps/unix/sysv/linux/bits/siginfo-consts-arch.h
* sysdeps/unix/sysv/linux/ia64/bits/siginfo-arch.h
* sysdeps/unix/sysv/linux/ia64/bits/siginfo-consts-arch.h
* sysdeps/unix/sysv/linux/mips/bits/siginfo-arch.h
* sysdeps/unix/sysv/linux/sparc/bits/siginfo-arch.h
* sysdeps/unix/sysv/linux/tile/bits/siginfo-arch.h
* sysdeps/unix/sysv/linux/tile/bits/siginfo-consts-arch.h
* sysdeps/unix/sysv/linux/x86/bits/siginfo-arch.h:
New Linux-only system-dependent bits headers.
* signal/bits/types/sig_atomic_t.h
* signal/bits/types/sigset_t.h
* signal/bits/types/sigval_t.h:
New non-system-dependent bits headers.
* sysdeps/generic/sigsetops.h
* sysdeps/unix/sysv/linux/sigsetops.h:
New internal headers.
* include/bits/types/sig_atomic_t.h
* include/bits/types/sigset_t.h
* include/bits/types/sigval_t.h:
New wrappers.
* signal/sigsetops.h
* bits/siginfo.h
* bits/sigset.h
* sysdeps/unix/sysv/linux/bits/siginfo.h
* sysdeps/unix/sysv/linux/bits/sigset.h
* sysdeps/unix/sysv/linux/ia64/bits/siginfo.h
* sysdeps/unix/sysv/linux/mips/bits/siginfo.h
* sysdeps/unix/sysv/linux/s390/bits/siginfo.h
* sysdeps/unix/sysv/linux/sparc/bits/siginfo.h
* sysdeps/unix/sysv/linux/tile/bits/siginfo.h
* sysdeps/unix/sysv/linux/x86/bits/siginfo.h:
Deleted.
* signal/Makefile, sysdeps/unix/sysv/linux/Makefile:
Update lists of installed headers.
* posix/bits/types.h: Define __sig_atomic_t here.
* signal/signal.h: Use the new bits headers; no need to handle
__need_sig_atomic_t nor __need_sigset_t. Don't use __sigmask
to define sigmask.
* include/signal.h: No need to handle __need_sig_atomic_t
nor __need_sigset_t. Don't define __sigemptyset.
* io/sys/poll.h, setjmp/setjmp.h
* sysdeps/arm/sys/ucontext.h, sysdeps/generic/sys/ucontext.h
* sysdeps/i386/sys/ucontext.h, sysdeps/m68k/sys/ucontext.h
* sysdeps/mach/hurd/i386/bits/sigcontext.h
* sysdeps/mips/sys/ucontext.h, sysdeps/powerpc/novmxsetjmp.h
* sysdeps/pthread/bits/sigthread.h
* sysdeps/unix/sysv/linux/hppa/sys/ucontext.h
* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h
* sysdeps/unix/sysv/linux/mips/sys/ucontext.h
* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h
* sysdeps/unix/sysv/linux/s390/sys/ucontext.h
* sysdeps/unix/sysv/linux/sh/sys/ucontext.h
* sysdeps/unix/sysv/linux/sparc/sys/ucontext.h
* sysdeps/unix/sysv/linux/tile/sys/ucontext.h
* sysdeps/unix/sysv/linux/x86/sys/ucontext.h:
Use bits/types/__sigset_t.h.
* misc/sys/select.h, posix/spawn.h
* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h
* sysdeps/unix/sysv/linux/sys/epoll.h
* sysdeps/unix/sysv/linux/sys/signalfd.h:
Use bits/types/sigset_t.h.
* resolv/netdb.h, rt/mqueue.h: Use bits/types/sigevent_t.h.
* rt/aio.h: Use bits/types/sigevent_t.h and bits/sigevent-consts.h.
* socket/sys/socket.h: Don't include bits/sigset.h.
* login/utmp_file.c, shadow/lckpwdf.c, signal/sigandset.c
* signal/sigisempty.c, stdlib/abort.c, sysdeps/posix/profil.c
* sysdeps/posix/sigignore.c, sysdeps/posix/sigintr.c
* sysdeps/posix/signal.c, sysdeps/posix/sigset.c
* sysdeps/posix/sprofil.c, sysdeps/posix/sysv_signal.c
* sysdeps/unix/sysv/linux/nptl-signals.h:
Include sigsetops.h.
* signal/sigaddset.c, signal/sigandset.c, signal/sigdelset.c
* signal/sigorset.c, stdlib/abort.c, sysdeps/posix/sigignore.c
* sysdeps/posix/signal.c, sysdeps/posix/sigset.c:
__sigaddset, __sigandset, __sigdelset, __sigemptyset, __sigorset
now return no value.
* signal/sigaddset.c, signal/sigdelset.c, signal/sigismem.c
Include <errno.h>, <signal.h>, and <sigsetops.h> instead of
"sigsetops.h".
* signal/sigsetops.c: Explicitly define __sigismember,
__sigaddset, and __sigdelset as compatibility symbols.
* signal/Versions: Correct commentary on __sigpause,
__sigaddset, __sigdelset, __sigismember.
* inet/rcmd.c: Include sigsetops.h. Convert old code using
__sigblock/__sigsetmask to use __sigprocmask and friends.
This implementation is based on the one already used at
sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-power8.S.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
[$(subdir) = math] (libm-sysdep_routines): Add s_cosf-power8 and
s_cosf-ppc64.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-power8.S: New file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf.c: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Likewise.
This patch adds a new build module called 'testsuite'.
IS_IN (testsuite) implies _ISOMAC, as do IS_IN_build and __cplusplus
(which means several ad-hoc tests for __cplusplus can go away).
libc-symbols.h now suppresses almost all of *itself* when _ISOMAC is
defined; in particular, _ISOMAC mode does not get config.h
automatically anymore.
There are still quite a few tests that need to see internal gunk of
one variety or another. For them, we now have 'tests-internal' and
'test-internal-extras'; files in this category will still be compiled
with MODULE_NAME=nonlib, and everything proceeds as it always has.
The bulk of this patch is moving tests from 'tests' to
'tests-internal'. There is also 'tests-static-internal', which has
the same effect on files in 'tests-static', and 'modules-names-tests',
which has the *inverse* effect on files in 'modules-names' (it's
inverted because most of the things in modules-names are *not* tests).
For both of these, the file must appear in *both* the new variable and
the old one.
There is also now a special case for when libc-symbols.h is included
without MODULE_NAME being defined at all. (This happens during the
creation of libc-modules.h, and also when preprocessing Versions
files.) When this happens, IS_IN is set to be always false and
_ISOMAC is *not* defined, which was the status quo, but now it's
explicit.
The remaining changes to C source files in this patch seemed likely to
cause problems in the absence of the main change. They should be
relatively self-explanatory. In a few cases I duplicated a definition
from an internal header rather than move the test to tests-internal;
this was a judgement call each time and I'm happy to change those
however reviewers feel is more appropriate.
* Makerules: New subdir configuration variables 'tests-internal'
and 'test-internal-extras'. Test files in these categories will
still be compiled with MODULE_NAME=nonlib. Test files in the
existing categories (tests, xtests, test-srcs, test-extras) are
now compiled with MODULE_NAME=testsuite.
New subdir configuration variable 'modules-names-tests'. Files
which are in both 'modules-names' and 'modules-names-tests' will
be compiled with MODULE_NAME=testsuite instead of
MODULE_NAME=extramodules.
(gen-as-const-headers): Move to tests-internal.
(do-tests-clean, common-mostlyclean): Support tests-internal.
* Makeconfig (built-modules): Add testsuite.
* Makefile: Change libof-check-installed-headers-c and
libof-check-installed-headers-cxx to 'testsuite'.
* Rules: Likewise. Support tests-internal.
* benchtests/strcoll-inputs/filelist#en_US.UTF-8:
Remove extra-modules.mk.
* config.h.in: Don't check for __OPTIMIZE__ or __FAST_MATH__ here.
* include/libc-symbols.h: Move definitions of _GNU_SOURCE,
PASTE_NAME, PASTE_NAME1, IN_MODULE, IS_IN, and IS_IN_LIB to the
very top of the file and rationalize their order.
If MODULE_NAME is not defined at all, define IS_IN to always be
false, and don't define _ISOMAC.
If any of IS_IN (testsuite), IS_IN_build, or __cplusplus are
true, define _ISOMAC and suppress everything else in this file,
starting with the inclusion of config.h.
Do check for inappropriate definitions of __OPTIMIZE__ and
__FAST_MATH__ here, but only if _ISOMAC is not defined.
Correct some out-of-date commentary.
* include/math.h: If _ISOMAC is defined, undefine NO_LONG_DOUBLE
and _Mlong_double_ before including math.h.
* include/string.h: If _ISOMAC is defined, don't expose
_STRING_ARCH_unaligned. Move a comment to a more appropriate
location.
* include/errno.h, include/stdio.h, include/stdlib.h, include/string.h
* include/time.h, include/unistd.h, include/wchar.h: No need to
check __cplusplus nor use __BEGIN_DECLS/__END_DECLS.
* misc/sys/cdefs.h (__NTHNL): New macro.
* sysdeps/m68k/m680x0/fpu/bits/mathinline.h
(__m81_defun): Use __NTHNL to avoid errors with GCC 6.
* elf/tst-env-setuid-tunables.c: Include config.h with _LIBC
defined, for HAVE_TUNABLES.
* inet/tst-checks-posix.c: No need to define _ISOMAC.
* intl/tst-gettext2.c: Provide own definition of N_.
* math/test-signgam-finite-c99.c: No need to define _ISOMAC.
* math/test-signgam-main.c: No need to define _ISOMAC.
* stdlib/tst-strtod.c: Convert to test-driver. Split locale_test to...
* stdlib/tst-strtod1i.c: ...this new file.
* stdlib/tst-strtod5.c: Convert to test-driver and add copyright notice.
Split tests of __strtod_internal to...
* stdlib/tst-strtod5i.c: ...this new file.
* string/test-string.h: Include stdint.h. Duplicate definition of
inhibit_loop_to_libcall here (from libc-symbols.h).
* string/test-strstr.c: Provide dummy definition of
libc_hidden_builtin_def when including strstr.c.
* sysdeps/ia64/fpu/libm-symbols.h: Suppress entire file in _ISOMAC
mode; no need to test __STRICT_ANSI__ nor __cplusplus as well.
* sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h.
Don't include init-arch.h.
* sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h.
Don't include init-arch.h.
* elf/Makefile: Move tst-ptrguard1-static, tst-stackguard1-static,
tst-tls1-static, tst-tls2-static, tst-tls3-static, loadtest,
unload, unload2, circleload1, neededtest, neededtest2,
neededtest3, neededtest4, tst-tls1, tst-tls2, tst-tls3,
tst-tls6, tst-tls7, tst-tls8, tst-dlmopen2, tst-ptrguard1,
tst-stackguard1, tst-_dl_addr_inside_object, and all of the
ifunc tests to tests-internal.
Don't add $(modules-names) to test-extras.
* inet/Makefile: Move tst-inet6_scopeid_pton to tests-internal.
Add tst-deadline to tests-static-internal.
* malloc/Makefile: Move tst-mallocstate and tst-scratch_buffer to
tests-internal.
* misc/Makefile: Move tst-atomic and tst-atomic-long to tests-internal.
* nptl/Makefile: Move tst-typesizes, tst-rwlock19, tst-sem11,
tst-sem12, tst-sem13, tst-barrier5, tst-signal7, tst-tls3,
tst-tls3-malloc, tst-tls5, tst-stackguard1, tst-sem11-static,
tst-sem12-static, and tst-stackguard1-static to tests-internal.
Link tests-internal with libpthread also.
Don't add $(modules-names) to test-extras.
* nss/Makefile: Move tst-field to tests-internal.
* posix/Makefile: Move bug-regex5, bug-regex20, bug-regex33,
tst-rfc3484, tst-rfc3484-2, and tst-rfc3484-3 to tests-internal.
* stdlib/Makefile: Move tst-strtod1i, tst-strtod3, tst-strtod4,
tst-strtod5i, tst-tls-atexit, and tst-tls-atexit-nodelete to
tests-internal.
* sunrpc/Makefile: Move tst-svc_register to tests-internal.
* sysdeps/powerpc/Makefile: Move test-get_hwcap and
test-get_hwcap-static to tests-internal.
* sysdeps/unix/sysv/linux/Makefile: Move tst-setgetname to
tests-internal.
* sysdeps/x86_64/fpu/Makefile: Add all libmvec test modules to
modules-names-tests.
Now with read consolidation which uses SYSCALL_CANCEL macro, a frame
pointer is created in the syscall code and this makes the powerpc
backtrace obtain a bogus entry for the signal handling patch.
It is because it does not setup the correct frame pointer register
(r1) based on the saved value from the kernel sigreturn. It was not
failing because the syscall frame pointer register was the same one
for the next frame (the function that actually called the syscall).
This patch fixes it by setup the next stack frame using the saved
one by the kernel sigreturn. It fixes tst-backtrace{5,6} from
the read consolidation patch.
Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.
* sysdeps/powerpc/powerpc32/backtrace.c (is_sigtramp_address): Use
void* for argument type and use VDSO_SYMBOL macro.
(is_sigtramp_address_rt): Likewise.
(__backtrace): Setup expected frame pointer address for signal
handling.
* sysdeps/powerpc/powerpc64/backtrace.c (is_sigtramp_address): Use
void* for argumetn type and use VSDO_SYMBOL macro.
(__backtrace): Setup expected frame pointer address for signal
handling.
This patch removes all the replicated pthread definition accross the
architectures and consolidates it on shared headers. The new
organization is as follow:
* Architecture specific definition (such as pthread types sizes) are
place in the new pthreadtypes-arch.h header in arch specific path.
* All shared structure definition are moved to a common NPTL header
at sysdeps/nptl/bits/pthreadtypes.h (with now includes the arch
specific one for internal definitions).
* Also, for C11 future thread support, both mutex and condition
definition are placed in a common header at
sysdeps/nptl/bits/thread-shared-types.h.
It is also a refactor patch without expected functional changes.
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu,
m68k-linux-gnu, microblaze-linux-gnu, mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
tile{pro,gx}-linux-gnu, and x86_64-linux-gnu).
* posix/Makefile (headers): Add pthreadtypes-arch.h and
thread-shared-types.h.
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h: New file: arch
specific thread definition.
* sysdeps/alpha/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/hppa/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/ia64/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/s390/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/sparc/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/tile/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/x86/nptl/bits/pthreadtypes-arch.h: Likewise.
* sysdeps/nptl/bits/thread-shared-types.h: New file: shared
thread definition between POSIX and C11.
* sysdeps/aarch64/nptl/bits/pthreadtypes.h.: Remove file.
* sysdeps/alpha/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/arm/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/hppa/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/m68k/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/microblaze/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/mips/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/nios2/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/ia64/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/powerpc/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/s390/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/sh/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/sparc/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/tile/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/x86/nptl/bits/pthreadtypes.h: Likewise.
* sysdeps/nptl/bits/pthreadtypes.h: New file: common thread
definitions shared across all architectures.
1. Fix the results for negative subnormals by ignoring the signal when
normalizing the value.
2. Fix the output when the high part is a power of 2 and the low part
is a nonzero number with opposite sign. This fix is based on commit
380bd0fd24.
After applying this patch, logbl() tests pass cleanly on POWER >= 7.
Tested on powerpc, powerpc64 and powerpc64le
[BZ #21280]
* sysdeps/powerpc/power7/fpu/s_logbl.c (__logbl): Ignore the
signal of subnormals and adjust the exponent of power of 2 down
when low part has opposite sign.
float128 on powerpc64le requires the addition of the ieee754/float128
sysdep, whereas powerpc64 doesn't. This requires creating a bunch of
submachine and cpu directories and Implies files which just point
towards their powerpc64 equivalent.
Tested on P7, P8, and generic powerpc64le targets with and without
multiarch.
* sysdeps/powerpc/powerpc64le/Implies: New file.
* sysdeps/powerpc/powerpc64le/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64le/fpu/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/power7/Implies: New file.
* sysdeps/powerpc/powerpc64le/power7/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64le/power7/fpu/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/power7/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/power8/Implies: New file.
* sysdeps/powerpc/powerpc64le/power8/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64le/power8/fpu/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/power8/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/power9/Implies: New file.
* sysdeps/powerpc/powerpc64le/power9/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64le/power9/fpu/multiarch/Implies: New file.
* sysdeps/powerpc/powerpc64le/power9/multiarch/Implies: New file.
* sysdeps/powerpc/preconfigure: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64le/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64le/fpu/Implies: New file.
P7 code is used for <=32B strings and for > 32B vectorized loops are used.
This shows as an average 25% improvement depending on the position of search
character. The performance is same for shorter strings.
Tested on ppc64 and ppc64le.
With new optimized strnlen for POWER8 [1], this patch adds
strncat for power8 to make use of optimized strlen and strnlen.
This is faster than POWER7 current implementation for larger strings.
Tested on powerpc64 and powerpc64le.
[1] https://sourceware.org/ml/libc-alpha/2017-03/msg00491.html
* sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add
strncat-power8.
* sysdeps/powerpc/powerpc64/multiarch/strncat.c (strncat): Add
__strncat_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
(strncat): Add __strncat_power8 to list of strncat functions.
* sysdeps/powerpc/powerpc64/multiarch/strncat-power8.c: New file.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power4.S: Define the
implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-a2.S: Define the
implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-cell.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/mempcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/a2/memcpy.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/cell/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memchr-power7.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/memrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memchr.S: Set a default
function name if not defined and pass as parameter to macros
accordingly.
* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/memset.S: Set a default function name if
not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Define the
strcasestr implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define
strstr implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/power7/strstr.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strchr.S: Set a default
function name if not defined and pass as parameter to macros
accordingly.
* sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise.
* sysdeps/powerpc/powerpc64/strchr.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Define
the strlen implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Define
the strnlen implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/power7/strlen.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise.
* sysdeps/powerpc/powerpc64/strlen.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Define
the implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/strncmp.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Set a default
function name if not defined.
* sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
This patch moves all arch specific pthreadtypes.h to a similar path
for all architectures (sysdeps/unix/sysv/<arch>/bits). No functional
or build change is expected. The idea is mainly to organize the
header placement for all architectures.
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu,
m68k-linux-gnu, microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
tile{pro,gx}-linux-gnu, and x86_64-linux-gnu).
* sysdeps/unix/sysv/linux/x86/Implies: New file.
* sysdeps/unix/sysv/linux/alpha/bits/pthreadtypes.h: Move to ...
* sysdeps/alpha/nptl/bits/pthreadtypes.h: ... here.
* sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h: Move to ...
* sysdeps/powerpc/nptl/bits/pthreadtypes.h: ... here.
* sysdeps/x86/bits/pthreadtypes.h: Move to ...
* sysdeps/x86/nptl/bits/pthreadtypes.h: ... here.
As noted in [1], divdi3 object is only exported in a handful ABIs
(i386, m68k, powerpc32, s390-32, and ia64), however it is built
for all current architectures regardless.
This patch refact the make rules for this object to so only the
aforementioned architectures that actually require it builds it.
Also, to avoid internal PLT calls to the exported symbol from the
module, glibc uses an internal header (symbol-hacks.h) which is
unrequired (and in fact breaks the build for architectures that
intend to get symbol definitions from libgcc.a). The patch also
changes it to create its own header (divdi3-symbol-hacks.h) and
adjust the architectures that require it accordingly.
I checked the build/check (with run-built-tests=no) on the
following architectures (which I think must cover all supported
ABI/builds) using GCC 6.3:
aarch64-linux-gnu
alpha-linux-gnu
arm-linux-gnueabihf
hppa-linux-gnu
ia64-linux-gnu
m68k-linux-gnu
microblaze-linux-gnu
mips64-n32-linux-gnu
mips-linux-gnu
mips64-linux-gnu
nios2-linux-gnu
powerpc-linux-gnu
powerpc-linux-gnu-power4
powerpc64-linux-gnu
powerpc64le-linux-gnu
s390x-linux-gnu
s390-linux-gnu
sh4-linux-gnu
sh4-linux-gnu-soft
sparc64-linux-gnu
sparcv9-linux-gnu
tilegx-linux-gnu
tilegx-linux-gnu-32
tilepro-linux-gnu
x86_64-linux-gnu
x86_64-linux-gnu-x32
i686-linux-gnu
I only saw one regression on sparcv9-linux-gnu (extra PLT call to
.udiv) which I address in next patch in the set. It also correctly
build SH with GCC 7.0.1 (without any regression from c89721e25d).
[1] https://sourceware.org/ml/libc-alpha/2017-03/msg00243.html
* sysdeps/i386/symbol-hacks.h: New file.
* sysdeps/m68k/symbol-hacks.h: New file.
* sysdeps/powerpc/powerpc32/symbol-hacks.h: New file.
* sysdeps/s390/s390-32/symbol-hacks.h: New file.
* sysdeps/unix/sysv/linux/i386/Makefile
[$(subdir) = csu] (sysdep_routines): New rule: divdi3 object.
[$(subdir) = csu] (sysdep-only-routines): Likewise.
[$(subdir) = csu] (CFLAGS-divdi3.c): Likewise.
* sysdeps/unix/sysv/linux/m68k/Makefile
[$(subdir) = csu] (sysdep_routines): Likewise.
[$(subdir) = csu] (sysdep-only-routines): Likewise.
[$(subdir) = csu] (CFLAGS-divdi3.c): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/Makefile
[$(subdir) = csu] (sysdep_routines): Likewise.
[$(subdir) = csu] (sysdep-only-routines): Likewise.
[$(subdir) = csu] (CFLAGS-divdi3.c): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/Makefile
[$(subdir) = csu] (sysdep_routines): Likewise.
[$(subdir) = csu] (sysdep-only-routines): Likewise.
[$(subdir) = csu] (CFLAGS-divdi3.c): Likewise.
* sysdeps/wordsize-32/Makefile: Remove file.
* sysdeps/wordsize-32/symbol-hacks.h: Definitions move to ...
* sysdeps/wordsize-32/divdi3-symbol-hacks.h: ... here.
Added strnlen POWER8 otimized for long strings. It delivers
same performance as POWER7 implementation for short strings.
This takes advantage of reasonably performing unaligned loads
and bit permutes to check the first 1-16 bytes until
quadword aligned, then checks in 64 bytes strides until unsafe,
then 16 bytes, truncating the count if need be.
Likewise, the POWER7 code is recycled for less than 32 bytes strings.
Tested on ppc64 and ppc64le.
* sysdeps/powerpc/powerpc64/multiarch/Makefile
(sysdep_routines): Add strnlen-power8.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
(strnlen): Add __strnlen_power8 to list of strnlen functions.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S:
New file.
* sysdeps/powerpc/powerpc64/multiarch/strnlen.c
(__strnlen): Add __strnlen_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/power8/strnlen.S: New file.
These are a grab bag of changes where the testsuite was using internal
symbols of some variety, but this was straightforward to fix, and the
fixed code should work with or without the change to compile the
testsuite under _ISOMAC.
Four of these are just more #include adjustments, but I want to highlight
sysdeps/powerpc/fpu/tst-setcontext-fpscr.c, which appears to have been
written before the advent of sys/auxv.h. I think a big chunk of this file
could be replaced by a simple call to getauxval, but I'll let someone who
actually has a powerpc machine to test on do that.
dlfcn/tst-dladdr.c was including ldsodefs.h just so it could use
DL_LOOKUP_ADDRESS to print an additional diagnostic; as requested by Carlos,
I have removed this.
math/test-misc.c was using #ifndef NO_LONG_DOUBLE, which is an internal
configuration macro, to decide whether to do certain tests involving
'long double'. I changed the test to #if LDBL_MANT_DIG > DBL_MANT_DIG
instead, which uses only public float.h macros and is equivalent on
all supported platforms. (Note that NO_LONG_DOUBLE doesn't mean 'the
compiler doesn't support long double', it means 'long double is the
same as double'.)
tst-writev.c has a configuration macro 'ARTIFICIAL_LIMIT' that the
Makefiles are expected to define, and sysdeps/unix/sysv/linux/Makefile
was using the internal __getpagesize in the definition; changed to
sysconf(_SC_PAGESIZE) which is the POSIX equivalent.
ia64-linux doesn't supply 'clone', only '__clone2', which is not
defined in the public headers(!) All the other clone tests have local
extern declarations of __clone2, but tst-clone.c doesn't; it was
getting away with this because include/sched.h does declare __clone2.
* nss/tst-cancel-getpwuid_r.c: Include nss.h.
* string/strcasestr.c: No need to include config.h.
* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c: Include
sys/auxv.h. Don't include sysdep.h.
* sysdeps/powerpc/tst-set_ppr.c: Don't include dl-procinfo.h.
* dlfcn/tst-dladdr.c: Don't include ldsodefs.h. Don't use
DL_LOOKUP_ADDRESS.
* math/test-misc.c: Instead of testing NO_LONG_DOUBLE, test whether
LDBL_MANT_DIG is greater than DBL_MANT_DIG.
* sysdeps/unix/sysv/linux/Makefile (CFLAGS-tst-writev.c): Use
sysconf (_SC_PAGESIZE) instead of __getpagesize in definition
of ARTIFICIAL_LIMIT.
* sysdeps/unix/sysv/linux/tst-clone.c [__ia64__]: Add extern
declaration of __clone2.
A few 'long double'-related tests include math_private.h just for
their variety of math_ldbl.h, which contains macros for assembling and
disassembling the binary representation of 'long double'. math_ldbl.h
insists on being included from math_private.h, but if we relax this
restriction (and fix some portability sloppiness) we can use it
directly and not have to expose all of math_private.h to the testsuite.
* sysdeps/generic/math_private.h: Use __BIG_ENDIAN and
__LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN.
* sysdeps/generic/math_ldbl.h
* sysdeps/ia64/fpu/math_ldbl.h
* sysdeps/ieee754/ldbl-128/math_ldbl.h
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
* sysdeps/ieee754/ldbl-96/math_ldbl.h
* sysdeps/powerpc/fpu/math_ldbl.h
* sysdeps/x86_64/fpu/math_ldbl.h:
Allow direct inclusion. Use uintNN_t instead of u_intNN_t.
Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and
LITTLE_ENDIAN. Include endian.h and/or stdint.h if necessary.
Add copyright notices.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h (ldbl_canonicalize_int):
Don't use EXTRACT_WORDS64.
* sysdeps/ieee754/ldbl-96/test-canonical-ldbl-96.c
* sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c
* sysdeps/ieee754/ldbl-128ibm/test-canonical-ldbl-128ibm.c
* sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c:
Include math_ldbl.h, not math_private.h.
The sys/platform/ppc.h header defines a class of __ppc_set_ppr functions
used to set the Program Priority Register (PPR) in PowerPC.
This patch implements test cases for these functions.
Tested on ppc64le, ppc64, and ppc.
* sysdeps/powerpc/tst-set_ppr.c: New file.
Implement test cases for __ppc_set_ppr_* functions.
* sysdeps/powerpc/Makefile ($(subdir),misc): Add tst-set_ppr
in the list of tests.
Change the powerpc tests to use <support/test-driver.c>.
Also replace some of pthread calls to its xpthread equivalent.
Tested on ppc64le.
* sysdeps/powerpc/test-get_hwcap.c: Use <support/test-driver.c>
instead of test-skeleton.c.
(do_test): Replaced pthread_create and pthread_join with
xpthread_create and xpthread_join. Use TEST_VERIFY_EXIT macro.
Removed unneeded status variable.
* sysdeps/powerpc/test-gettimebase.c: Use <support/test-driver.c>
instead of test-skeleton.c.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: Likewise.
For strings >16B and <32B existing algorithm takes more time than default
implementation when strings are placed closed to end of page. This is due
to byte by byte access for handling page cross. This is improved by
following >32B code path where the address is adjusted to aligned memory
before doing load doubleword operation instead of loading bytes.
Tested on powerpc64 and powerpc64le.
Based on comments on previous attempt to address BZ#16640 [1],
the idea is not support invalid use of strtok (the original
bug report proposal). This leader to a new strtok optimized
strtok implementation [2].
The idea of this patch is to fix BZ#16640 to align all the
implementations to a same contract. However, with newer strtok
code it is better to get remove the old assembly ones instead of
fix them.
For x86 is a gain in all cases since the new implementation can
potentially use sse2/sse42 implementation for strspn and strcspn.
This shows a better performance on both i686 and x86_64 using
the string benchtests.
On powerpc64 the gains are mixed, where only for larger inputs
or keys some gains are showns (based on benchtest it seems that
it shows some gains for keys larger than 10 and inputs larger
than 32). I would prefer to remove the optimized implementation
based on first code simplicity and second because some more gain
could be optimized using a better optimized strcspn/strspn
code (as for x86). However if powerpc arch maintainers prefer I
can send a v2 with the assembly code adjusted instead.
Checked on x86_64-linux-gnu, i686-linux-gnu, and powerpc64le-linux-gnu.
[BZ #16640]
* sysdeps/i386/i686/strtok.S: Remove file.
* sysdeps/i386/i686/strtok_r.S: Likewise.
* sysdeps/i386/strtok.S: Likewise.
* sysdeps/i386/strtok_r.S: Likewise.
* sysdeps/powerpc/powerpc64/strtok.S: Likewise.
* sysdeps/powerpc/powerpc64/strtok_r.S: Likewise.
* sysdeps/x86_64/strtok.S: Likewise.
* sysdeps/x86_64/strtok_r.S: Likewise.
[1] https://sourceware.org/ml/libc-alpha/2016-10/msg00411.html
[2] https://sourceware.org/ml/libc-alpha/2016-12/msg00461.html
After this update, math/test-ildouble, math/test-ldouble and
math/test-ldouble-finite pass on hard float, POWER < 7 builds.
Tested on powerpc, powerpc64 and powerpc64le.
This commit moves one step towards the deprecation of wrappers that
use _LIB_VERSION / matherr / __kernel_standard functionality, by
adding the suffix '_compat' to their filenames and adjusting Makefiles
and #includes accordingly.
New template wrappers that do not use such functionality will be added
by future patches and will be first used by the float128 wrappers.
Since commit 6e46de42fe default strcat implementation is essentially
the same for specialized ia64 and powerpc ones. This patch removes the
redundant implementation and adjust powerpc64 ifunc code to use the
default one.
Checked on powerpc32-linux-gnu (default and power4) and ia64-linux build
and on powerpc64le-linux-gnu.
* sysdeps/ia64/strcat.c: Remove file.
* sysdeps/powerpc/strcat.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcat-power7.c: Use default
C implementation.
* sysdeps/powerpc/powerpc64/multiarch/strcat-power8.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcat-ppc64.c: Likewise.
The same error fixed in commit b224637928
happens in the 32-bit implementation of memchr for power7.
This patch adopts the same solution, with a minimal change: it
implements a saturated addition where overflows sets the maximum pointer
size to UINTPTR_MAX.
The P7 code is used for <=32B strings and for > 32B vectorized loops are used.
This shows as an average 25% improvement depending on the position of search
character. The performance is same for shorter strings.
Tested on ppc64 and ppc64le.
When dynamically linking, ifunc resolvers are called before TLS is
initialized, so they cannot be safely stack-protected.
We avoid disabling stack-protection on large numbers of files by
using __attribute__ ((__optimize__ ("-fno-stack-protector")))
to turn it off just for the resolvers themselves. (We provide
the attribute even when statically linking, because we will later
use it elsewhere too.)
Current optimized powercp64/power7 memchr uses a strategy to check for
p versus align(p+n) (where 'p' is the input char pointer and n the
maximum size to check for the byte) without taking care for possible
overflow on the pointer addition in case of large 'n'.
It was triggered by 3038145ca2 where default rawmemchr (used to
created ppc64 rawmemchr in ifunc selection) now uses memchr (p, c, (size_t)-1)
on its implementation.
This patch fixes it by implement a satured addition where overflows
sets the maximum pointer size to UINTPTR_MAX.
Checked on powerpc64le-linux-gnu.
[BZ# 20971]
* sysdeps/powerpc/powerpc64/power7/memchr.S (__memchr): Avoid
overflow in pointer addition.
* string/test-memchr.c (do_test): Add an argument to pass as
the size on memchr.
(test_main): Add check for SIZE_MAX.