Normally, TLS relocations against local symbols are optimised by the linker
to be absolute. However, gold does not do this, and so it is possible to
end up with, for example, R_SPARC_TLS_DTPMOD64 referring to a local symbol.
Since sym_map is left as null in elf_machine_rela for the special local
symbol case, the relocation handling thinks it has nothing to do, and so
the module gets left as 0. Havoc then ensues when the variable in question
is accessed.
Before this fix, the main_local_gold program would receive a SIGBUS on
sparc64, and SIGSEGV on powerpc32. With this fix applied, that test now
passes like the rest of them.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela):
Assign sym_map to be map for local symbols, as TLS relocations
use sym_map to determine whether the symbol is defined and to
extract the TLS information.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise.
This patch makes SPARC fabsl implementation use libm_alias_ldouble, to
prepare them for also defining _Float128 function aliases.
Tested with build-many-glibcs.py that installed stripped shared
libraries (sparc64-linux-gnu and sparcv9-linux-gnu) are unchanged by
the patch.
* sysdeps/sparc/sparc32/fpu/s_fabsl.c: Include
<libm-alias-ldouble.h>.
(fabsl): Define using libm_alias_ldouble.
* sysdeps/sparc/sparc64/fpu/s_fabsl.c: Include
<libm-alias-ldouble.h>.
(fabsl): Define using libm_alias_ldouble.
This patch makes dbl-64 fma use libm_alias_double. The ldbl-opt
version is removed. The sparc32 version no longer needs to handle
compat symbols, while alpha needs a new wrapper to avoid getting the
ldbl-128 version (where ldbl-opt is earlier in the list of sysdeps
directories, so previously fma came from there).
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/dbl-64/s_fma.c: Include <libm-alias-double.h>.
(fma): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-opt/s_fma.c: Remove file.
* sysdeps/sparc/sparc32/fpu/s_fma.c: Do not include
<math_ldbl_opt.h>.
(fmal): Do not define as compat symbol here.
* sysdeps/alpha/fpu/s_fma.c: New file.
32-bit SPARC libm should have compat symbols for copysignl
(GLIBC_2.0), fabsl (GLIBC_2.0), fmal (GLIBC_2.1), pointing to the
double functions; they were present in glibc 2.8, for example, but are
now missing, probably when optimized SPARC function implementations
were added without appropriate compat symbol handling. The same
applies to copysignl in libc. This patch restores those compat
symbols.
Tested with build-many-glibcs.py for sparcv9-linux-gnu.
[BZ #22229]
* sysdeps/sparc/sparc32/fpu/s_copysign.S: Include
<math_ldbl_opt.h>
(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
and libc.
* sysdeps/sparc/sparc32/fpu/s_fabs.S: Include <math_ldbl_opt.h>.
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
* sysdeps/sparc/sparc32/fpu/s_fma.c: Include <math_ldbl_opt.h>.
(fmal): Define as compat symbol at version GLIBC_2_1 for libm.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S:
Include <math_ldbl_opt.h>
(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
and libc.
(compat_symbol): Undefine and redefine.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include
<math_ldbl_opt.h>
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
(compat_symbol): Undefine and redefine.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c
[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h>.
[HAVE_AS_VIS3_SUPPORT] (fmal): Define as compat symbol at version
GLIBC_2_1 for libm.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Add
GLIBC_2.0 copysignl symbol.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
GLIBC_2.0 copysignl and fabsl and GLIBC_2.1 fmal symbols.
Continuing the process of setting up common macros for libm function
aliases, with a view to using them to define _FloatN / _FloatNx
aliases in future, this patch adds a libm_alias_double macro and uses
it in the type-generic templates.
This macro handles defining aliases for double, and for long double in
the NO_LONG_DOUBLE case. It also handles defining compat symbols for
long double = double for architectures that changed their long double
format. By so doing, it eliminates the need for the
M_LIBM_NEED_COMPAT and declare_mgen_libm_compat macros; the single
declare_mgen_alias call in each template now suffices to define all
required compat symbols. When used for more double functions (not
based on type-generic templates), I expect it will eliminate the need
for most ldbl-opt wrappers for such functions.
A few special cases are needed. __clog10l is a public symbol (for
historical reasons) so needs to be given appropriate compat versions
for architectures that changed their long double format, but is not
defined as an alias using the normal macros since __clog10* are *not*
public symbols for _FloatN / _FloatNx types. For scalbn, scalbln and
log1p, the changes adding errno setting support for those functions
left compat symbols pointing directly to the non-errno-setting
implementations. There is no requirement for the compat symbols not
to set errno; that just made for the simplest patches at that time.
Now, with these common macros, it's natural to redirect the compat
symbols to the errno-setting wrappers, which I intend to do in a
separate patch.
Tested for x86_64, and with build-many-glibcs.py. For ldbl-opt
platforms the stripped libm.so binaries are changed (disassembly
unchanged) because the details of how the clog10l compat symbol is
created mean it ceases to be weak as it was before; for other
platforms, stripped libm.so binaries are unchanged.
2017-09-13 Joseph Myers <joseph@codesourcery.com>
* sysdeps/generic/libm-alias-double.h: New file.
* sysdeps/ieee754/ldbl-opt/libm-alias-double.h: Likewise.
* sysdeps/generic/math-type-macros-double.h: Include
<libm-alias-double.h>.
[declare_mgen_alias] (declare_mgen_alias): Define to use
libm_alias_double.
* sysdeps/generic/math-type-macros.h [!M_LIBM_NEED_COMPAT]
(M_LIBM_NEED_COMPAT): Remove macro.
[!M_LIBM_NEED_COMPAT] (declare_mgen_libm_compat): Likewise.
* sysdeps/ieee754/ldbl-opt/math-type-macros-double.h: Remove.
* math/cabs_template.c [M_LIBM_NEED_COMPAT]: Remove conditional
code.
* math/carg_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/cimag_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/conj_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/creal_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cacos_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cacosh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_casin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_casinh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_catan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_catanh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ccos_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ccosh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cexp_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_clog10_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_clog_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cpow_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cproj_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_csin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_csinh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_csqrt_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ctan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ctanh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_fdim_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_fmax_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_fmin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_nan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/w_ilogb_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* sysdeps/ieee754/ldbl-opt/s_clog10.c: New file.
* sysdeps/ieee754/ldbl-opt/s_ldexp.c (M_LIBM_NEED_COMPAT): Remove
macro.
(declare_mgen_alias): New macro.
* sysdeps/ieee754/ldbl-opt/w_log1p.c: New file.
* sysdeps/ieee754/ldbl-opt/w_scalbln.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.c
(M_LIBM_NEED_COMPAT): Remove macro.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c
[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h> and
<first-versions.h>.
[HAVE_AS_VIS3_SUPPORT && LONG_DOUBLE_COMPAT (libm,
FIRST_VERSION_libm_fdiml)]: Define fdiml as compat symbol.
This patch removes the SPARC-specific wrappers for sqrt and sqrtf.
These wrappers, by adding architecture-specific uses of _LIB_VERSION
and __kernel_standard, unnecessarily complicate cleanups of libm error
handling. They also do not serve a useful optimization purpose. GCC
knows about sqrt as a built-in function, and can generate direct calls
to a hardware square root instruction, either on its own, in the
-fno-math-errno case, or together with an inline check for the
argument being negative and a call to the out-of-line sqrt function
for error handling only in that case (and has been able to do so for a
long time). Thus in practice the wrapper will only be called only in
the case of negative arguments, which is not a case it is useful to
optimize for.
The removal of the wrappers also uncovers, and fixes, an old bug.
32-bit SPARC libm used (checked with glibc 2.8 binaries) to have a
sqrtl compat symbol, version GLIBC_2.0, for old binaries when sqrtl
was an alias of sqrt (I don't have pre-glibc-2.4 binaries for SPARC to
hand to check for the sqrtl symbol in those). This disappeared,
probably with:
commit 8847f03770
Author: David S. Miller <davem@davemloft.net>
Date: Tue Feb 28 22:37:58 2012 -0800
Add sparc optimized sqrt{,f}.
Removing the wrappers brings back the generic ldbl-opt logic for
creating such compat symbols, and so restores the compat symbol that
should be there. This could of course also be fixed in the wrappers -
but as noted above, the wrappers are optimizing a case it's not useful
to optimize, so the bug of the missing compat symbol serves to
illustrate the risks involved with the extra complexity of
architecture-specific function versions where not needed.
Tested with build-many-glibcs.py.
[BZ #21973]
* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Remove file.
* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S : Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
GLIBC_2.0 sqrtl symbol.
This patch obsoletes support for SVID libm error handling (the system
where a user-defined function matherr is called on a libm function
error; only enabled if you also set _LIB_VERSION = _SVID_ or
_LIB_VERSION = _XOPEN_) and the use of the _LIB_VERSION global
variable to control libm error handling. matherr and _LIB_VERSION are
made into compat symbols, not supported for new ports or for static
linking. The libieee.a object file (which sets _LIB_VERSION = _IEEE_,
so disabling errno setting for some functions) is also removed, and
all the related definitions are removed from math.h.
The manual already recommends against using matherr, and it's already
not supported for _Float128 functions (those use new wrappers that
don't support matherr, only errno) - this patch means that it becomes
possible to e.g. add sinf32 as an alias to sinf without that resulting
in undesired matherr support in sinf32 for existing glibc ports.
matherr support is not part of any standard supported by glibc (it was
removed in XPG4).
Because matherr is a function to be defined by the user, of course
user programs defining such a function will still continue to link; it
just quietly won't be used. If they try to write to the library's
copy of _LIB_VERSION to enable SVID error handling, however, they will
get a link error (but if they define their own _LIB_VERSION variable,
they won't).
I expect the most likely case of build failures from this patch to be
programs with unconditional cargo-culted uses of -lieee (based on a
notion of "I want IEEE floating point", not any actual requirement for
that library).
Ideally, the new-port-or-static-linking case would use the new
wrappers used for _Float128. This is not implemented in this patch,
because of the complication of architecture-specific (powerpc32 and
sparc) sqrt wrappers that use _LIB_VERSION and __kernel_standard
directly. Thus, the old wrappers and __kernel_standard are still
built unconditionally, and _LIB_VERSION still exists in static libm.
But when the old wrappers and __kernel_standard are built in the
non-compat case, _LIB_VERSION and matherr are defined as macros so
code to support those features isn't actually built into static libm
or new ports' shared libm after this patch.
I intend to move to the new wrappers for static libm and new ports in
followup patches. I believe the sqrt wrappers for powerpc32 and sparc
can reasonably be removed. GCC already optimizes the normal case of
sqrt by generating code that uses a hardware instruction and only
calls the sqrt function if the argument was negative (if
-fno-math-errno, of course, it just uses the hardware instruction
without any check for negative argument being needed). Thus those
wrappers will only actually get called in the case of negative
arguments, which is not a case it makes sense to optimize for. But
even without removing the powerpc32 and sparc wrappers it should still
be possible to move to the new wrappers for static libm and new ports,
just without having those dubious architecture-specific optimizations
in static libm.
Everything said about matherr equally applies to matherrf and matherrl
(IA64-specific, undocumented), except that the structure of IA64 libm
means it won't be converted to using the new wrappers (it doesn't use
the old ones either, but its own error-handling code instead).
As with other tests of compat symbols, I expect test-matherr and
test-matherr-2 to need to become appropriately conditional once we
have a system for disabling such tests for ports too new to have the
relevant symbols.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* math/math.h [__USE_MISC] (_LIB_VERSION_TYPE): Remove.
[__USE_MISC] (_LIB_VERSION): Likewise.
[__USE_MISC] (struct exception): Likewise.
[__USE_MISC] (matherr): Likewise.
[__USE_MISC] (DOMAIN): Likewise.
[__USE_MISC] (SING): Likewise.
[__USE_MISC] (OVERFLOW): Likewise.
[__USE_MISC] (UNDERFLOW): Likewise.
[__USE_MISC] (TLOSS): Likewise.
[__USE_MISC] (PLOSS): Likewise.
[__USE_MISC] (HUGE): Likewise.
[__USE_XOPEN] (MAXFLOAT): Define even if [__USE_MISC].
* math/math-svid-compat.h: New file.
* conform/linknamespace.pl (@whitelist): Remove matherr, matherrf
and matherrl.
* include/math.h [!_ISOMAC] (__matherr): Remove.
* manual/arith.texi (FP Exceptions): Do not document matherr.
* math/Makefile (tests): Change test-matherr to test-matherr-3.
(tests-internal): New variable.
(install-lib): Do not add libieee.a.
(non-lib.a): Likewise.
(extra-objs): Do not add libieee.a and ieee-math.o.
(CPPFLAGS-s_lib_version.c): Remove variable.
($(objpfx)libieee.a): Remove rule.
($(addprefix $(objpfx), $(tests-internal)): Depend on $(libm).
* math/ieee-math.c: Remove.
* math/libm-test-support.c (matherr): Remove.
* math/test-matherr.c: Use <support/test-driver.c>. Add copyright
and license notices. Include <math-svid-compat.h> and
<shlib-compat.h>.
(matherr): Undefine as macro. Use compat_symbol_reference.
(_LIB_VERSION): Likewise.
* math/test-matherr-2.c: New file.
* math/test-matherr-3.c: Likewise.
* sysdeps/generic/math_private.h (__kernel_standard): Remove
declaration.
(__kernel_standard_f): Likewise.
(__kernel_standard_l): Likewise.
* sysdeps/ieee754/s_lib_version.c: Do not include <math.h> or
<math_private.h>. Include <math-svid-compat.h>.
(_LIB_VERSION): Undefine as macro.
(_LIB_VERSION_INTERNAL): Always initialize to _POSIX_. Define
only if [LIBM_SVID_COMPAT || !defined SHARED]. If
[LIBM_SVID_COMPAT], use compat_symbol.
* sysdeps/ieee754/s_matherr.c: Do not include <math.h> or
<math_private.h>. Include <math-svid-compat.h>.
(matherr): Undefine as macro.
(__matherr): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* sysdeps/ia64/fpu/libm_error.c: Include <math-svid-compat.h>.
[_LIBC && LIBM_SVID_COMPAT] (matherrf): Use
compat_symbol_reference.
[_LIBC && LIBM_SVID_COMPAT] (matherrl): Likewise.
[_LIBC && !LIBM_SVID_COMPAT] (matherrf): Define as macro.
[_LIBC && !LIBM_SVID_COMPAT] (matherrl): Likewise.
* sysdeps/ia64/fpu/libm_support.h: Include <math-svid-compat.h>.
(MATHERR_D): Remove declaration.
[!_LIBC] (_LIB_VERSION_TYPE): Likewise
[!LIBM_BUILD] (_LIB_VERSIONIMF): Likewise.
[LIBM_BUILD] (pmatherrf): Likewise.
[LIBM_BUILD] (pmatherr): Likewise.
[LIBM_BUILD] (pmatherrl): Likewise.
(DOMAIN): Likewise.
(SING): Likewise.
(OVERFLOW): Likewise.
(UNDERFLOW): Likewise.
(TLOSS): Likewise.
(PLOSS): Likewise.
* sysdeps/ia64/fpu/s_matherrf.c: Include <math-svid-compat.h>.
(__matherrf): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* sysdeps/ia64/fpu/s_matherrl.c: Include <math-svid-compat.h>.
(__matherrl): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* math/lgamma-compat.h: Include <math-svid-compat.h>.
* math/w_acos_compat.c: Likewise.
* math/w_acosf_compat.c: Likewise.
* math/w_acosh_compat.c: Likewise.
* math/w_acoshf_compat.c: Likewise.
* math/w_acoshl_compat.c: Likewise.
* math/w_acosl_compat.c: Likewise.
* math/w_asin_compat.c: Likewise.
* math/w_asinf_compat.c: Likewise.
* math/w_asinl_compat.c: Likewise.
* math/w_atan2_compat.c: Likewise.
* math/w_atan2f_compat.c: Likewise.
* math/w_atan2l_compat.c: Likewise.
* math/w_atanh_compat.c: Likewise.
* math/w_atanhf_compat.c: Likewise.
* math/w_atanhl_compat.c: Likewise.
* math/w_cosh_compat.c: Likewise.
* math/w_coshf_compat.c: Likewise.
* math/w_coshl_compat.c: Likewise.
* math/w_exp10_compat.c: Likewise.
* math/w_exp10f_compat.c: Likewise.
* math/w_exp10l_compat.c: Likewise.
* math/w_exp2_compat.c: Likewise.
* math/w_exp2f_compat.c: Likewise.
* math/w_exp2l_compat.c: Likewise.
* math/w_fmod_compat.c: Likewise.
* math/w_fmodf_compat.c: Likewise.
* math/w_fmodl_compat.c: Likewise.
* math/w_hypot_compat.c: Likewise.
* math/w_hypotf_compat.c: Likewise.
* math/w_hypotl_compat.c: Likewise.
* math/w_j0_compat.c: Likewise.
* math/w_j0f_compat.c: Likewise.
* math/w_j0l_compat.c: Likewise.
* math/w_j1_compat.c: Likewise.
* math/w_j1f_compat.c: Likewise.
* math/w_j1l_compat.c: Likewise.
* math/w_jn_compat.c: Likewise.
* math/w_jnf_compat.c: Likewise.
* math/w_jnl_compat.c: Likewise.
* math/w_lgamma_main.c: Likewise.
* math/w_lgamma_r_compat.c: Likewise.
* math/w_lgammaf_main.c: Likewise.
* math/w_lgammaf_r_compat.c: Likewise.
* math/w_lgammal_main.c: Likewise.
* math/w_lgammal_r_compat.c: Likewise.
* math/w_log10_compat.c: Likewise.
* math/w_log10f_compat.c: Likewise.
* math/w_log10l_compat.c: Likewise.
* math/w_log2_compat.c: Likewise.
* math/w_log2f_compat.c: Likewise.
* math/w_log2l_compat.c: Likewise.
* math/w_log_compat.c: Likewise.
* math/w_logf_compat.c: Likewise.
* math/w_logl_compat.c: Likewise.
* math/w_pow_compat.c: Likewise.
* math/w_powf_compat.c: Likewise.
* math/w_powl_compat.c: Likewise.
* math/w_remainder_compat.c: Likewise.
* math/w_remainderf_compat.c: Likewise.
* math/w_remainderl_compat.c: Likewise.
* math/w_scalb_compat.c: Likewise.
* math/w_scalbf_compat.c: Likewise.
* math/w_scalbl_compat.c: Likewise.
* math/w_sinh_compat.c: Likewise.
* math/w_sinhf_compat.c: Likewise.
* math/w_sinhl_compat.c: Likewise.
* math/w_sqrt_compat.c: Likewise.
* math/w_sqrtf_compat.c: Likewise.
* math/w_sqrtl_compat.c: Likewise.
* math/w_tgamma_compat.c: Likewise.
* math/w_tgammaf_compat.c: Likewise.
* math/w_tgammal_compat.c: Likewise.
* sysdeps/ieee754/dbl-64/w_exp_compat.c: Likewise.
* sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise.
* sysdeps/ieee754/k_standard.c: Likewise.
* sysdeps/ieee754/k_standardf.c: Likewise.
* sysdeps/ieee754/k_standardl.c: Likewise.
* sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise.
* sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
ELFv2 functions with localentry:0 are those with a single entry point,
ie. global entry == local entry, that have no requirement on r2 or
r12 and guarantee r2 is unchanged on return. Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions.
This patch implements the ld.so changes necessary for this
optimization. ld.so needs to check that an optimized plt call
sequence is in fact calling a function implemented with localentry:0,
end emit a fatal error otherwise.
The elf/testobj6.c change is to stop "error while loading shared
libraries: expected localentry:0 `preload'" when running
elf/preloadtest, which we'd get otherwise.
* elf/elf.h (PPC64_OPT_LOCALENTRY): Define.
* sysdeps/alpha/dl-machine.h (elf_machine_fixup_plt): Add
refsym and sym parameters. Adjust callers.
* sysdeps/aarch64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/arm/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/generic/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/hppa/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/i386/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/ia64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/m68k/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/microblaze/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/mips/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/nios2/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_fixup_plt):
Likewise.
* sysdeps/s390/s390-32/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/s390/s390-64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/sh/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/tile/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_fixup_plt): Likewise.
* sysdeps/powerpc/powerpc64/dl-machine.c (_dl_error_localentry): New.
(_dl_reloc_overflow): Increase buffser size. Formatting.
* sysdeps/powerpc/powerpc64/dl-machine.h (ppc64_local_entry_offset):
Delete reloc param, add refsym and sym. Check optimized plt
call stubs for localentry:0 functions. Adjust callers.
(elf_machine_fixup_plt, elf_machine_plt_conflict): Add refsym
and sym parameters. Adjust callers.
(_dl_reloc_overflow): Move attribute.
(_dl_error_localentry): Declare.
* elf/dl-runtime.c (_dl_fixup): Save original sym. Pass
refsym and sym to elf_machine_fixup_plt.
* elf/testobj6.c (preload): Call printf.
The LD_HWCAP_MASK environment variable was ignored in static binaries,
which is inconsistent with the behaviour of dynamically linked
binaries. This seems to have been because of the inability of
ld_hwcap_mask being read early enough to influence anything but now
that it is in tunables, the mask is usable in static binaries as well.
This feature is important for aarch64, which relies on HWCAP_CPUID
being masked out to disable multiarch. A sanity test on x86_64 shows
that there are no failures. Likewise for aarch64.
* elf/dl-hwcaps.h [HAVE_TUNABLES]: Always read hwcap_mask.
* sysdeps/sparc/sparc32/dl-machine.h [HAVE_TUNABLES]:
Likewise.
* sysdeps/x86/cpu-features.c (init_cpu_features): Always set
up hwcap and hwcap_mask.
Drop _dl_hwcap_mask when building with tunables. This completes the
transition of hwcap_mask reading from _dl_hwcap_mask to tunables.
* elf/dl-hwcaps.h: New file.
* elf/dl-hwcaps.c: Include it.
(_dl_important_hwcaps)[HAVE_TUNABLES]: Read and update
glibc.tune.hwcap_mask.
* elf/dl-cache.c: Include dl-hwcaps.h.
(_dl_load_cache_lookup)[HAVE_TUNABLES]: Read
glibc.tune.hwcap_mask.
* sysdeps/sparc/sparc32/dl-machine.h: Likewise.
* elf/dl-support.c (_dl_hwcap2)[HAVE_TUNABLES]: Drop
_dl_hwcap_mask.
* elf/rtld.c (rtld_global_ro)[HAVE_TUNABLES]: Drop
_dl_hwcap_mask.
(process_envvars)[HAVE_TUNABLES]: Likewise.
* sysdeps/generic/ldsodefs.h (rtld_global_ro)[HAVE_TUNABLES]:
Likewise.
* sysdeps/x86/cpu-features.c (init_cpu_features): Don't
initialize dl_hwcap_mask when tunables are enabled.
This patch optimizes the generic spinlock code.
The type pthread_spinlock_t is a typedef to volatile int on all archs.
Passing a volatile pointer to the atomic macros which are not mapped to the
C11 atomic builtins can lead to extra stores and loads to stack if such
a macro creates a temporary variable by using "__typeof (*(mem)) tmp;".
Thus, those macros which are used by spinlock code - atomic_exchange_acquire,
atomic_load_relaxed, atomic_compare_exchange_weak - have to be adjusted.
According to the comment from Szabolcs Nagy, the type of a cast expression is
unqualified (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_423.htm):
__typeof ((__typeof (*(mem)) *(mem)) tmp;
Thus from spinlock perspective the variable tmp is of type int instead of
type volatile int. This patch adjusts those macros in include/atomic.h.
With this construct GCC >= 5 omits the extra stores and loads.
The atomic macros are replaced by the C11 like atomic macros and thus
the code is aligned to it. The pthread_spin_unlock implementation is now
using release memory order instead of sequentially consistent memory order.
The issue with passed volatile int pointers applies to the C11 like atomic
macros as well as the ones used before.
I've added a glibc_likely hint to the first atomic exchange in
pthread_spin_lock in order to return immediately to the caller if the lock is
free. Without the hint, there is an additional jump if the lock is free.
I've added the atomic_spin_nop macro within the loop of plain reads.
The plain reads are also realized by C11 like atomic_load_relaxed macro.
The new define ATOMIC_EXCHANGE_USES_CAS determines if the first try to acquire
the spinlock in pthread_spin_lock or pthread_spin_trylock is an exchange
or a CAS. This is defined in atomic-machine.h for all architectures.
The define SPIN_LOCK_READS_BETWEEN_CMPXCHG is now removed.
There is no technical reason for throwing in a CAS every now and then,
and so far we have no evidence that it can improve performance.
If that would be the case, we have to adjust other spin-waiting loops
elsewhere, too! Using a CAS loop without plain reads is not a good idea
on many targets and wasn't used by one. Thus there is now no option to
do so.
Architectures are now using the generic spinlock automatically if they
do not provide an own implementation. Thus the pthread_spin_lock.c files
in sysdeps folder are deleted.
ChangeLog:
* NEWS: Mention new spinlock implementation.
* include/atomic.h:
(__atomic_val_bysize): Cast type to omit volatile qualifier.
(atomic_exchange_acq): Likewise.
(atomic_load_relaxed): Likewise.
(ATOMIC_EXCHANGE_USES_CAS): Check definition.
* nptl/pthread_spin_init.c (pthread_spin_init):
Use atomic_store_relaxed.
* nptl/pthread_spin_lock.c (pthread_spin_lock):
Use C11-like atomic macros.
* nptl/pthread_spin_trylock.c (pthread_spin_trylock):
Likewise.
* nptl/pthread_spin_unlock.c (pthread_spin_unlock):
Use atomic_store_release.
* sysdeps/aarch64/nptl/pthread_spin_lock.c: Delete File.
* sysdeps/arm/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/hppa/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/m68k/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/microblaze/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/mips/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/nios2/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/aarch64/atomic-machine.h (ATOMIC_EXCHANGE_USES_CAS): Define.
* sysdeps/alpha/atomic-machine.h: Likewise.
* sysdeps/arm/atomic-machine.h: Likewise.
* sysdeps/i386/atomic-machine.h: Likewise.
* sysdeps/ia64/atomic-machine.h: Likewise.
* sysdeps/m68k/coldfire/atomic-machine.h: Likewise.
* sysdeps/m68k/m680x0/m68020/atomic-machine.h: Likewise.
* sysdeps/microblaze/atomic-machine.h: Likewise.
* sysdeps/mips/atomic-machine.h: Likewise.
* sysdeps/powerpc/powerpc32/atomic-machine.h: Likewise.
* sysdeps/powerpc/powerpc64/atomic-machine.h: Likewise.
* sysdeps/s390/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc32/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc64/atomic-machine.h: Likewise.
* sysdeps/tile/tilegx/atomic-machine.h: Likewise.
* sysdeps/tile/tilepro/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/sh/atomic-machine.h: Likewise.
* sysdeps/x86_64/atomic-machine.h: Likewise.
David Miller has not been shot yet AFAIK (yes, I googled for any news
that may seem relevant and I poked him on twitter some days ago) so
either nobody uses SPARC or the code is correct or nobody read the
instructions in the comment to shoot him. In all of those cases the
comment is clearly not useful, so getting rid of it.
The SPARC implementations of __signbit* functions have aliases
signbit, signbitf, signbitl. These are useless, as they aren't
exported from the shared libraries (only the __signbit* functions are
exported, to be used by the type-generic signbit macro with older
compilers). This patch removes the useless aliases.
Tested (compilation only) with build-many-glibcs.py for
sparc64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/fpu/s_signbit.S (signbit): Remove alias.
(signbitf): Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_signbit.S (signbit):
Likewise.
(signbitl): Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_signbitf.S (signbitf):
Likewise.
* sysdeps/sparc/sparc64/fpu/s_signbit.S (signbit): Likewise.
(signbitl): Likewise.
* sysdeps/sparc/sparc64/fpu/s_signbitf.S (signbitf): Likewise.
With the removal of divdi3 object from sparcv9-linux-gnu build, its
definition came from libgcc and its functions internall calls .udiv.
Since glibc also exports these symbols for compatibility reasons, it
will end up creating PLT calls internally in libc.so.
To avoid it, this patch uses the linker option --wrap to replace all
the internal libc.so .udiv calls to the wrapper __wrap_.udiv. Along
with strong alias in the udiv implementations, it makes linker do
local calls.
Checked on sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/Makefile (libc.so-gnulib): New rule.
* sysdeps/sparc/sparc32/sparcv8/udiv.S (.udiv): Make a strong_alias
to __wrap_.udiv.
* sysdeps/sparc/sparc32/sparcv9/udiv.S (.udiv): Likewise.
* sysdeps/sparc/sparc32/udiv.S (.udiv): Likewise.
This commit moves one step towards the deprecation of wrappers that
use _LIB_VERSION / matherr / __kernel_standard functionality, by
adding the suffix '_compat' to their filenames and adjusting Makefiles
and #includes accordingly.
New template wrappers that do not use such functionality will be added
by future patches and will be first used by the float128 wrappers.
This was used by --enable-omitfp, and the bulk of it was removed in this
commit:
commit bdeba1354b
Author: Ulrich Drepper <drepper@gmail.com>
Date: Sat Jan 7 11:29:31 2012 -0500
Remove --enable-omitfp support
Current sparc32 sem_init and default one only differ on sem.newsem.pad
initialization. This patch removes sparc32 and sparc32v9 sem_init arch
specific implementation and set sparc32 to use nptl default one.
The default implementation sets the required sem.newsem.pad to 0 (which
is ununsed in other architectures).
I checked on i686 and a sparc32v9 build.
* nptl/sem_init.c (sem_init): Init pad value to 0.
* sysdeps/sparc/sparc32/sem_init.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_init.c: Likewise.
This patch removes the sparc32 sem_wait.c implementation since it is
identical to default nptl one. The sparcv9 is no longer required with
the removal.
Checked with a sparcv9 build.
* sysdeps/sparc/sparc32/sem_wait.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_wait.c: Likewise.
Current sparc32 sem_open and default one only differ on:
1. Default one contains a 'futex_supports_pshared' check.
2. sem.newsem.pad is initialized to zero.
This patch removes sparc32 and sparc32v9 sem_open arch specific
implementation and instead set sparc32 to use nptl default one.
Using 1. is fine since it should always evaluate 0 for Linux
(an optimized away by the compiler). Adding 2. to default
implementation should be ok since 'pad' field is used mainly
on sparc32 code.
I checked on i686 and checked a sparc32v9 build.
* nptl/sem_open.c (sem_open): Init pad value to 0.
* sysdeps/sparc/sparc32/sem_open.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_open.c: Likewise.
The only difference is the usage of math_narrow_eval when
building s_fdiml.c. This should be harmless for long double,
but I did observe some code generation changes on m68k, but
lack the resources to test it.
Likewise, to more easily support overriding symbol generation,
the aliasing macros are always conditionally defined on their
absence to reduce boilerplate.
I also ran builds for i486, ppc64, sparcv9, aarch64,
s390x and observed no changes to s_fdim* objects.
sparc32 passes floating point values in the integer registers. VIS3
instructions gives access to the movwtos instruction to directly
transfer a value from an integer register to a floating point register.
Therefore it makes sense to provide a VIS3 version consisting in the
generic version compiled with -mvis3.
Changelog:
* math/s_fdim.c: Avoid alias renamed.
* math/s_fdimf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
Add s_fdimf-vis3, s_fdim-vis3.
(CFLAGS-s_fdimf-vis3.c): New. Set to -Wa,-Av9d -mvis3.
(CFLAGS-s_fdim-vis3.c): Likewise.
sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.c: New file.
sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c: Likewise.
The fdim and fdimf functions on sparc do not fully follow the standard
and do not set errno to ERANGE when the result overflows. Since glibc
2.24 this causes the two following tests to fail:
Failure: fdim (max_value, -max_value): errno set to 0, expected 34 (ERANGE)
Failure: fdim_upward (max_value, -max_value): errno set to 0, expected 34 (ERANGE)
It happens that using GCC with the generic C code generates very similar
code to the sparc specific implementations. Therefore this patches
remove them. Note it might still worth adding a vis3 specific version of
fdim on sparc32/sparcv9, this is done in a following patch to ease
backporting.
Changelog:
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
Remove s_fdimf-vis3, s_fdim-vis3.
* sysdeps/sparc/sparc32/fpu/s_fdim.S: Delete file.
* sysdeps/sparc/sparc32/fpu/s_fdimf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fdim.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fdimf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fdim.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fdimf.S: Likewise.
When building for sparc32/sparcv9 or sparc64, we assume that VIS
instructions are available and use them in the sparc specific assembly
code. However we do not tell GCC to use such instructions, resulting in
slightly suboptimal code.
Fix that by passing -Wa,-Av9a -mvis to GCC.
Changelog:
* sysdeps/sparc/sparc32/sparcv9/Makefile (sysdep-CFLAGS): Add -mvis.
* sysdeps/sparc/sparc64/Makefile (sysdep-CFLAGS): New. Define to
-Wa,-Av9a -mvis.
The ceil, floor and trunc functions on sparc do not fully follow the
standard and trigger an inexact exception when presented a value which
is not an integer. Since glibc 2.24 this causes a few tests to fail,
for instance:
testing double (without inline functions)
Failure: ceil (lit_pi): Exception "Inexact" set
Failure: ceil (-lit_pi): Exception "Inexact" set
Failure: ceil (min_subnorm_value): Exception "Inexact" set
Failure: ceil (min_value): Exception "Inexact" set
Failure: ceil (0.1): Exception "Inexact" set
Failure: ceil (0.25): Exception "Inexact" set
Failure: ceil (0.625): Exception "Inexact" set
Failure: ceil (-min_subnorm_value): Exception "Inexact" set
Failure: ceil (-min_value): Exception "Inexact" set
Failure: ceil (-0.1): Exception "Inexact" set
Failure: ceil (-0.25): Exception "Inexact" set
Failure: ceil (-0.625): Exception "Inexact" set
I tried to fix that by using the same strategy than used on other
architectures, that is by saving the FSR register at the beginning
and restoring it at the end of the function. When doing so I noticed
a comment that this operation might be very costly, so I decided to
do some benchmarks.
The benchmarks below represent the time required to run each of the
function 60 millions of times with different input value. I have done
that in the basic V9 code, the VIS2 code, and using the default C
implementation of the libc, for both sparc32 and sparc64, on a Niagara
T1 based machine and an UltraSparc IIIi. Given I don't have access to a
more recent machine), I haven't been able to test the VIS3 version. Also
it should be noted that it doesn't make sense to do this benchmark for
V8 or earlier as in that case we use the default C implementation. The
results are available in the table below, the "+ fix" version correspond
to the one saving and restoring the FSR.
Niagara T1 / sparc32
--------------------
ceilf ceil floorf floor truncf trunc
V9 19.10 22.48 19.10 22.48 16.59 19.27
V9 + fix 19.77 23.34 19.77 23.33 17.27 20.12
VIS2 16.87 19.62 16.87 19.62
VIS2 + fix 17.55 20.47 17.55 20.47
C impl 11.39 13.80 11.40 13.80 10.88 10.84
Niagara T1 / sparc64
--------------------
ceilf ceil floorf floor truncf trunc
V9 18.14 22.23 18.14 22.23 15.64 19.02
V9 + fix 18.82 23.08 18.82 23.08 16.32 19.87
VIS2 15.92 19.37 15.92 19.37
VIS2 + fix 16.59 20.22 16.59 20.22
C impl 11.39 13.60 11.39 15.36 10.88 12.65
UltraSparc IIIi / sparc32
-------------------------
ceilf ceil floorf floor truncf trunc
V9 4.81 7.09 6.61 11.64 4.91 7.05
V9 + fix 7.20 10.42 7.14 10.54 6.76 9.47
VIS2 4.81 7.03 4.76 7.13
VIS2 + fix 6.76 9.51 6.71 9.63
C impl 3.88 8.62 3.90 9.45 3.57 6.62
UltraSparc IIIi / sparc64
-------------------------
ceilf ceil floorf floor truncf trunc
V9 3.48 4.39 3.48 4.41 3.01 3.85
V9 + fix 4.76 5.90 4.76 5.90 4.86 6.26
VIS2 2.95 3.61 2.95 3.61
VIS2 + fix 4.24 5.37 4.30 7.97
C impl 3.63 4.89 3.62 6.38 3.33 4.03
The first thing that should be noted is that the C implementation is
always faster on the Niagara T1 based machine. On the UltraSparc IIIi
the float version on sparc32 is also faster.
Coming back about the fix saving and restoring the FSR, it appears
it has a big impact as expected. In that case the C implementation is
always faster than the fixed implementations.
This patch therefore removes the sparc specific implementations in
favor of the generic ones.
Changelog:
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
[$(subdir) = math] (libm-sysdep_routines): Remove.
[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
Remove s_ceilf-vis3, s_ceil-vis3, s_floorf-vis3, s_floor-vis3,
s_truncf-vis3, s_trunc-vis3.
* sysdeps/sparc/sparc64/fpu/multiarch/Makefile: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil-vis2.S: Delete
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf-vis2.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor-vis2.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf-vis2.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_trunc-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_trunc.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_truncf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_truncf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_ceil.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_floor.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_floorf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_trunc.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_truncf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_ceil.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_floor.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_floorf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_trunc.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_truncf.S: Likewise.
nearbyint and nearbyintf should not trigger inexact exceptions, but
should still trigger an invalid exception for a sNaN input.
The SPARC specific implementations of these functions save the FSR at
the beginning of the function and restore it at the end to not trigger
an inexact exception. This however doesn't work for an sNaN input which
need to trigger an invalid exception. Fix that by adding a fcmp
instruction using the input value before saving FSR, so that an invalid
exception is triggered for a sNaN input.
This fixes the math/test-nearbyint-except test on SPARC.
Changelog:
* sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint): Trigger an
invalid exception for a sNaN input.
* sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
* sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
(__nearbyint_vis3): Likewise
* sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
(__nearbyintf_vis3): Likewise
* sparc/sparc64/fpu/s_nearbyint.S (__nearbyint): Likewise.
* sparc/sparc64/fpu/s_nearbyintf.S (__nearbyintf): Likewise.
* sparc/sparc64/fpu/multiarch/s_nearbyint-vis3.S (__nearbyint_vis3):
Likewise.
* sparc/sparc64/fpu/multiarch/s_nearbyintf-vis3.S (__nearbyintf_vis3):
Likewise.
The previous barrier implementation did not fulfill the POSIX requirements
for when a barrier can be destroyed. Specifically, it was possible that
threads that haven't noticed yet that their round is complete still access
the barrier's memory, and that those accesses can happen after the barrier
has been legally destroyed.
The new algorithm does not have this issue, and it avoids using a lock
internally.
It was noted in
<https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the
bits/*.h naming scheme should only be used for installed headers.
This patch renames bits/atomic.h to atomic-machine.h to follow that
convention.
This is the only change in this series that needs to change the
filename rather than simply removing a directory level (because both
atomic.h and bits/atomic.h exist at present).
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch).
[BZ #14912]
* sysdeps/aarch64/bits/atomic.h: Move to ...
* sysdeps/aarch64/atomic-machine.h: ...here.
(_AARCH64_BITS_ATOMIC_H): Rename macro to
_AARCH64_ATOMIC_MACHINE_H.
* sysdeps/alpha/bits/atomic.h: Move to ...
* sysdeps/alpha/atomic-machine.h: ...here.
* sysdeps/arm/bits/atomic.h: Move to ...
* sysdeps/arm/atomic-machine.h: ...here. Update comments.
* bits/atomic.h: Move to ...
* sysdeps/generic/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/i386/bits/atomic.h: Move to ...
* sysdeps/i386/atomic-machine.h: ...here.
* sysdeps/ia64/bits/atomic.h: Move to ...
* sysdeps/ia64/atomic-machine.h: ...here.
* sysdeps/m68k/coldfire/bits/atomic.h: Move to ...
* sysdeps/m68k/coldfire/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ...
* sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here.
* sysdeps/microblaze/bits/atomic.h: Move to ...
* sysdeps/microblaze/atomic-machine.h: ...here.
* sysdeps/mips/bits/atomic.h: Move to ...
* sysdeps/mips/atomic-machine.h: ...here.
(_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H.
* sysdeps/powerpc/bits/atomic.h: Move to ...
* sysdeps/powerpc/atomic-machine.h: ...here. Update comments.
* sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ...
* sysdeps/powerpc/powerpc32/atomic-machine.h: ...here. Update
comments. Include <atomic-machine.h> instead of <bits/atomic.h>.
* sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ...
* sysdeps/powerpc/powerpc64/atomic-machine.h: ...here. Include
<atomic-machine.h> instead of <bits/atomic.h>.
* sysdeps/s390/bits/atomic.h: Move to ...
* sysdeps/s390/atomic-machine.h: ...here.
* sysdeps/sparc/sparc32/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc32/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here.
* sysdeps/sparc/sparc64/bits/atomic.h: Move to ...
* sysdeps/sparc/sparc64/atomic-machine.h: ...here.
* sysdeps/tile/bits/atomic.h: Move to ...
* sysdeps/tile/atomic-machine.h: ...here.
* sysdeps/tile/tilegx/bits/atomic.h: Move to ...
* sysdeps/tile/tilegx/atomic-machine.h: ...here. Include
<sysdeps/tile/atomic-machine.h> instead of
<sysdeps/tile/bits/atomic.h>.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/tile/tilepro/bits/atomic.h: Move to ...
* sysdeps/tile/tilepro/atomic-machine.h: ...here. Include
<sysdeps/tile/atomic-machine.h> instead of
<sysdeps/tile/bits/atomic.h>.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here. Include
<sysdeps/arm/atomic-machine.h> instead of
<sysdeps/arm/bits/atomic.h>.
* sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here.
(_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here.
(_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H.
* sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ...
* sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here.
* sysdeps/x86_64/bits/atomic.h: Move to ...
* sysdeps/x86_64/atomic-machine.h: ...here.
* include/atomic.h: Include <atomic-machine.h> instead of
<bits/atomic.h>.
This adds new functions for futex operations, starting with wait,
abstimed_wait, reltimed_wait, wake. They add documentation and error
checking according to the current draft of the Linux kernel futex manpage.
Waiting with absolute or relative timeouts is split into separate functions.
This allows for removing a few cases of code duplication in pthreads code,
which uses absolute timeouts; also, it allows us to put platform-specific
code to go from an absolute to a relative timeout into the platform-specific
futex abstractions..
Futex operations that can be canceled are also split out into separate
functions suffixed by "_cancelable".
There are separate versions for both Linux and NaCl; while they currently
differ only slightly, my expectation is that the separate versions of
lowlevellock-futex.h will eventually be merged into futex-internal.h
when we get to move the lll_ functions over to the new futex API.
This patch combines BUSY_WAIT_NOP and atomic_delay into a new
atomic_spin_nop function and adjusts all clients. The new function is
put into atomic.h because what is best done in a spin loop is
architecture-specific, and atomics must be used for spinning. The
function name is meant to tell users that this has no effect on
synchronization semantics but is a performance aid for spinning.
mq_notify (present in POSIX by 1996) brings in references to
pthread_barrier_init and pthread_barrier_wait (new in the 2001 edition
of POSIX). This patch fixes this by making those functions into weak
aliases of __pthread_barrier_*, exporting the __pthread_barrier_*
names at version GLIBC_PRIVATE and using them in mq_notify.
Tested for x86_64 and x86 (testsuite, and comparison of installed
stripped shared libraries). Changes in addresses from dynamic symbol
table / PLT changes render most comparisons not particularly useful,
but when the addresses of subsequent code don't change there's no sign
of unexpected changes there. This patch does not remove any
linknamespace XFAILs because of other namespace issues remaining with
mqueue.h functions.
[BZ #18544]
* nptl/pthread_barrier_init.c (pthread_barrier_init): Rename to
__pthread_barrier_init and define as weak alias of
__pthread_barrier_init.
* sysdeps/sparc/nptl/pthread_barrier_init.c
(pthread_barrier_init): Likewise.
* nptl/pthread_barrier_wait.c (pthread_barrier_wait): Rename to
__pthread_barrier_wait and define as weak alias of
__pthread_barrier_wait.
* sysdeps/sparc/nptl/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/sparc/sparc32/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* nptl/Versions (libpthread): Export __pthread_barrier_init and
__pthread_barrier_wait at version GLIBC_PRIVATE.
* include/pthread.h (__pthread_barrier_init): Declare.
(__pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/mq_notify.c (notification_function):
Call __pthread_barrier_wait instead of pthread_barrier_wait.
(helper_thread): Likewise.
(init_mq_netlink): Call __pthread_barrier_init instead of
pthread_barrier_init.
The sem_* functions bring in references to tdelete, tfind, tsearch and
twalk. But the t* functions are XSI-shaded, while sem_* aren't. This
patch fixes this by using __t* instead, exporting those functions from
libc at version GLIBC_PRIVATE (since sem_* are in libpthread) and
using libc_hidden_* for the benefit of calls within libc.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). libpthread gets changes from
PLT reordering; addresses in libc change because of PLT / dynamic
symbol table changes.
[BZ #18536]
* misc/tsearch.c (__tsearch): Use libc_hidden_def.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* misc/Versions (libc): Add __tdelete, __tfind, __tsearch and
__twalk to GLIBC_PRIVATE.
* include/search.h (__tsearch): Use libc_hidden_proto.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* nptl/sem_close.c (sem_close): Call __twalk instead of twalk.
Call __tdelete instead of tdelete.
* nptl/sem_open.c (check_add_mapping): Call __tfind instead of
tfind. Call __tsearch instead of tsearch.
* sysdeps/sparc/sparc32/sem_open.c (check_add_mapping): Likewise.
* conform/Makefile (test-xfail-POSIX/semaphore.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/semaphore.h/linknamespace): Likewise.
* sysdeps/sparc/sparc32/bits/atomic.h
(__sparc32_atomic_do_unlock24): Put the memory barrier before the
unlock not after it.
(__v9_compare_and_exchange_val_32_acq): Use unions to avoid getting
volatile register usage warnings from the compiler.