i586 strcpy.S used a clever trick with LEA to implement a jump table:
/* ECX has the last 2 bits of the address of source - 1. */
andl $3, %ecx
call 2f
2: popl %edx
/* 0xb is the distance between 2: and 1:. */
leal 0xb(%edx,%ecx,8), %ecx
jmp *%ecx
.align 8
1: /* ECX == 0 */
orb (%esi), %al
jz L(end)
stosb
xorl %eax, %eax
incl %esi
/* ECX == 1 */
orb (%esi), %al
jz L(end)
stosb
xorl %eax, %eax
incl %esi
/* ECX == 2 */
orb (%esi), %al
jz L(end)
stosb
xorl %eax, %eax
incl %esi
/* ECX == 3 */
L(1): movl (%esi), %ecx
leal 4(%esi),%esi
This fails if there are instruction length changes before L(1):. This
patch replaces it with conditional branches:
cmpb $2, %cl
je L(Src2)
ja L(Src3)
cmpb $1, %cl
je L(Src1)
L(Src0):
which have similar performance and work with any instruction lengths.
Tested on i586 and i686 with and without --disable-multi-arch.
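For readers who prefer C, the branch sequence can be sketched as below. This is a minimal illustration, not the glibc code: copy_unaligned_head is a hypothetical helper, the labels only mirror L(Src0) to L(Src3), and the handling of an early NUL is simplified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not glibc code: copy bytes from *SRCP to *DSTP
   until *SRCP is 4-byte aligned.  Returns 1 if the terminating NUL was
   copied while doing so, else 0.  */
static int
copy_unaligned_head (char **dstp, const char **srcp)
{
  char *dst = *dstp;
  const char *src = *srcp;
  int done = 0;

  assert (dst != 0 && src != 0);

  /* Low two bits of (source - 1), as in the ECX comment above.  */
  unsigned int k = (unsigned int) (((uintptr_t) src - 1) & 3);

  /* Same comparison order as the patch: 2, then above 2, then 1.  */
  if (k == 2)
    goto src2;
  if (k > 2)
    goto src3;
  if (k == 1)
    goto src1;

  /* k == 0: three bytes remain before the aligned loop.  */
  if ((*dst++ = *src++) == '\0')
    {
      done = 1;
      goto src3;
    }
src1:
  if ((*dst++ = *src++) == '\0')
    {
      done = 1;
      goto src3;
    }
src2:
  if ((*dst++ = *src++) == '\0')
    done = 1;
src3:
  *dstp = dst;
  *srcp = src;
  return done;
}
```

Because each case is reached through a label rather than a computed address, the sketch (like the patch) does not depend on how many bytes each case occupies.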
[BZ #22353]
* sysdeps/i386/i586/strcpy.S (STRCPY): Use conditional branches.
(1): Renamed to ...
(L(Src0)): This.
(L(Src1)): New.
(L(Src2)): Likewise.
(L(1)): Renamed to ...
(L(Src3)): This.
POWER9 DD2.1 and earlier have an issue where some cache-inhibited
vector loads trap to the kernel, causing a performance degradation. To
handle this in memcpy and memmove, lvx/stvx is used for aligned
addresses instead of lxvd2x/stxvd2x.
Reference: https://patchwork.ozlabs.org/patch/814059/
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Replace
lxvd2x/stxvd2x with lvx/stvx.
* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The glibc implementation of iseqsig relies on ordered comparison
operators raising the "invalid" exception for quiet NaN operands, with
a workaround on platforms where a GCC bug means that exception is not
raised. For x86, that bug has now been fixed for GCC 8, so this patch
disables the workaround in that case. If and when the corresponding
bugs for powerpc and s390 are fixed, the headers for those platforms
should of course be updated similarly.
Tested for x86_64 and x86, including with GCC mainline. Note that
other failures appear with GCC mainline because of spurious use of
ordered comparison instructions for unordered operations
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692>.
* sysdeps/x86/fpu/fix-fp-int-compare-invalid.h
(FIX_COMPARE_INVALID): Define to 0 if [__GNUC_PREREQ (8, 0)].
As shown in some buildbot issues on aarch64 and powerpc, calling
clone (VFORK) and waitpid (WNOHANG) does not guarantee the child
is ready to be collected. This patch changes the waitpid flags
argument back to 0, as it was before the fe05e1cb6d fix.
This change can lead to scenario 4.3 described in that commit,
where the waitpid call can hang indefinitely. However, this is
a very unlikely and also undefined situation, where the caller
is trying to terminate a pid before posix_spawn returns and the
pid reuse race is also triggered. I don't see how to correctly
handle this specific situation within posix_spawn.
Checked on x86_64-linux-gnu, aarch64-linux-gnu and
powerpc64-linux-gnu.
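The difference between the two flag values can be illustrated with a small standalone program. This is only a sketch: run_and_reap is a hypothetical helper, and plain fork stands in for the clone (CLONE_VM | CLONE_VFORK) call that spawni.c actually uses.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with CODE and reap it
   with a blocking waitpid.  Returns the child's exit status, or -1 on
   failure.  */
static int
run_and_reap (int code)
{
  assert (code >= 0 && code <= 255);

  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (code);

  int status;
  pid_t r;
  /* With options == 0 the call blocks until the child can be collected.
     With WNOHANG it may return 0 if the child is not yet waitable,
     which would leave the child uncollected.  */
  do
    r = waitpid (pid, &status, 0);
  while (r < 0 && errno == EINTR);

  if (r != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```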
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of
WNOHANG in waitpid call.
cfi info for stack adjust needs to be on the insn doing the adjust.
cfi describing register saves can be anywhere after the save insn but
before the reg is altered. Fewer locations with cfi result in smaller
cfi programs and possibly slightly faster exception handling. Thus
the LR cfi_offset move.
The idea behind adjusting sp after restoring regs is to break a
register dependency chain, in this case by not using r1 immediately
after it is modified.
The missing LR cfi_restore meant that code after the blr,
unaligned_lt_16 and other labels, would have cfi that said LR was at
cfa+16, but that code is reached without LR being saved.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Move LR cfi.
Adjust stack after restoring regs. Add missing LR cfi_restore.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
This patch moves the frame setup and teardown to immediately around
the single memset call, as has been done for power8. I've also
decreased FRAMESIZE to that needed to save the two callee-saved
registers used. Plus added cfi.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Decrease FRAMESIZE.
Move LR save and frame setup/teardown and LR restore to
immediately around memset call. Provide cfi.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
This patch replaces i386 assembly versions of e_exp2f with generic
e_exp2f.c. For workload-spec2017.wrf, on Nehalem, it improves
performance by:
                        Before     After      Improvement
reciprocal-throughput   112.996    40.0454    182%
latency                 126.581    54.4479    132%
On Skylake, it improves performance by:
                        Before     After      Improvement
reciprocal-throughput   113.14     39.447     186%
latency                 136.068    55.684     144%
On IvyBridge with --disable-multi-arch, it improves performance by:
                        Before     After      Improvement
reciprocal-throughput   132.521    40.3759    228%
latency                 145.791    58.4587    149%
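The Improvement figures appear consistent with (before / after - 1) * 100, truncated to an integer. That formula is inferred from the numbers and is not stated in the commit, so the helper below (improvement_percent, a hypothetical name) is only a way to check the arithmetic:

```c
#include <assert.h>

/* Hypothetical helper: speed-up of AFTER relative to BEFORE, as a
   percentage truncated toward zero.  Lower timings are better.  */
static int
improvement_percent (double before, double after)
{
  assert (before > 0.0 && after > 0.0);
  return (int) ((before / after - 1.0) * 100.0);
}
```

For the Nehalem reciprocal-throughput row, 112.996 / 40.0454 is about 2.82, which gives the 182% shown.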
* sysdeps/i386/fpu/e_exp2f.S: Removed.
* sysdeps/i386/fpu/w_exp2f.c: Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_exp2f.c.
* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
* sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines):
Add e_exp2f-sse2.
(CFLAGS-e_exp2f-sse2.c): New.
* sysdeps/i386/i686/fpu/multiarch/e_exp2f-sse2.c: New file.
* sysdeps/i386/i686/fpu/multiarch/e_exp2f.c: Likewise.
The bits/floatn.h header currently only has defines relating to
_Float128. This patch adds defines relating to other _FloatN /
_FloatNx types.
The approach taken is to add defines for all _FloatN / _FloatNx types
known to GCC, and to put them in a common bits/floatn-common.h header
included at the end of all the individual bits/floatn.h headers. If
in future some defines become different for different glibc
configurations, they will move out into the separate bits/floatn.h
headers.
Some defines are expected always to be the same across glibc ports.
Corresponding defines are nevertheless put in this header. The intent
is that where there are conditionals (in headers or in non-installed
files) that can just repeat the same or nearly the same logic for each
floating-point type, they should do so, even if in fact the cases for
some types could be unconditionally present or absent because the same
conditionals are true or false for all glibc configurations. This
should make the glibc code with such conditionals easier to read,
because the reader can just see that the same conditionals are
repeated for each type, rather than seeing different conditionals for
different types and needing to reason, at each location with such
differences, why those differences are indeed correct there. (Cases
involving per-format rather than per-type logic are more likely still
to need differences in how they handle different types.)
Having such defines and conditionals also helps in incremental
preparation for adding _Float32 / _Float64 / _Float32x / _Float64x
function aliases. I intend subsequent patches to add such
conditionals corresponding to those already present for _Float128, as
well as making more architecture-specific function implementations use
common macros to define aliases in preparation for adding such _FloatN
/ _FloatNx aliases.
Tested for x86_64.
* bits/floatn-common.h: New file.
* math/Makefile (headers): Add bits/floatn-common.h.
* bits/floatn.h: Include <bits/floatn-common.h>.
* sysdeps/ia64/bits/floatn.h: Likewise.
* sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise.
* sysdeps/mips/ieee754/bits/floatn.h: Likewise.
* sysdeps/powerpc/bits/floatn.h: Likewise.
* sysdeps/x86/bits/floatn.h: Likewise.
As noted by Florian Weimer, current Linux posix_spawn implementation
can trigger an assert if the auxiliary process is terminated before
actually setting the err member:
/* Child must set args.err to something non-negative - we rely on
   the parent and child sharing VM.  */
args.err = -1;
[...]
new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
                 CLONE_VM | CLONE_VFORK | SIGCHLD, &args);

if (new_pid > 0)
  {
    ec = args.err;
    assert (ec >= 0);
Another possible issue is killing the child between setting the err and
actually calling execve. In this case the process will not run, but
posix_spawn also will not report any error:
args->err = 0;
args->exec (args->file, args->argv, args->envp);
As suggested by Andreas Schwab, this patch removes the faulty assert
and also treats any signal that kills the auxiliary process before it
calls _exit or execve as if the spawn was successful (thus leaving it
to the caller to figure this out). Unlike Florian, I cannot see why
using atomics to set err would help here: the code essentially runs
sequentially (due to CLONE_VFORK), and I think it would not be legal
for the compiler to evaluate ec without checking the new_pid result
(thus there is no need for a compiler barrier).
Summarizing the possible scenarios on posix_spawn execution, we
have:
1. For the default case with a successful execution, args.err will be 0,
the pid will not be collected and it will be reported to the caller.
2. For the default failure case, args.err will be positive and the pid
will be collected by the waitpid. An error will be reported to the
caller.
3. For the unlikely case where the process was terminated and not
collected by a caller signal handler, it will be reported as a successful
execution and not be collected by posix_spawn (since args.err will
be 0). The caller will need to actually handle this case.
4. For the unlikely case where the process was terminated and collected
by the caller, we have 3 other possible scenarios:
4.1. The auxiliary process was terminated with args.err equal to 0:
it will be handled as 1. (so it does not matter if we hit the pid
reuse race since we won't possibly collect an unexpected
process).
4.2. The auxiliary process was terminated after execve (due to a failure
in calling it) and before setting args.err to -1: it will also
be handled as 1., but with the issue of not being able to report
a possible execve failure to the caller.
4.3. The auxiliary process was terminated after args.err is set to -1:
this is the case where it will be possible to hit the pid reuse
case, where we will need to collect the auxiliary pid but we
cannot be sure it will be the expected one. I think for this
case we need to actually change waitpid to use WNOHANG to avoid
hanging indefinitely on the call and report an error to the caller,
since we can't differentiate between a default failure as in 2.
and a possible pid reuse race issue.
Checked on x86_64-linux-gnu.
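The parent-side decision for the case where clone succeeded can be modelled as follows. This is a simplified sketch of the scenarios above, not the code in spawni.c; spawn_result_from_err and its must_reap output are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the parent after a successful clone.
   CHILD_ERR is the value found in args.err.  Returns the error code
   to report to the caller (0 for success) and sets *MUST_REAP when
   the auxiliary process has to be collected with waitpid.  */
static int
spawn_result_from_err (int child_err, int *must_reap)
{
  assert (must_reap != NULL);

  if (child_err > 0)
    {
      /* Scenario 2: the child reported a failure, so collect it and
         hand the error to the caller.  */
      *must_reap = 1;
      return child_err;
    }

  /* Scenarios 1 and 3: args.err is 0, or it was never updated because
     the child was killed early.  Both are reported as success and the
     pid is left for the caller to handle.  */
  *must_reap = 0;
  return 0;
}
```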
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Handle the case where
the auxiliary process is terminated by a signal before calling _exit
or execve.
In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, using fxsave/xsave/xsavec takes slightly more cycles
than saving and restoring vector and bound registers individually.
Latency for _dl_runtime_resolve to lookup the function, foo, from one
shared library plus libc.so:
                               Before   After   Change
Westmere (SSE)/fxsave          345      866     151%
IvyBridge (AVX)/xsave          420      643     53%
Haswell (AVX)/xsave            713      1252    75%
Skylake (AVX+MPX)/xsavec       559      719     28%
Skylake (AVX512+MPX)/xsavec    145      272     87%
Ryzen (AVX)/xsavec             280      553     97%
This is the worst case, where the portion of time spent saving and
restoring registers is larger than in the majority of cases. With smaller
_dl_runtime_resolve code size, overall performance impact is negligible.
On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are within noise. On Westmere, differences in
bootstrap and "make check" time of GCC 7 with lazy binding GCC and
binutils are also within noise.
[BZ #21265]
* sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
New.
* sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>.
(get_common_indeces): Set xsave_state_size, xsave_state_full_size
and bit_arch_XSAVEC_Usable if needed.
(init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
and bit_arch_Use_dl_runtime_resolve_opt.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
Removed.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(bit_arch_Prefer_No_AVX512): Updated.
(bit_arch_MathVec_Prefer_No_AVX512): Likewise.
(bit_arch_XSAVEC_Usable): New.
(STATE_SAVE_OFFSET): Likewise.
(STATE_SAVE_MASK): Likewise.
[__ASSEMBLER__]: Include <cpu-features-offsets.h>.
(cpu_features): Add xsave_state_size and xsave_state_full_size.
(index_arch_Use_dl_runtime_resolve_opt): Removed.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_XSAVEC_Usable): New.
* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow.
* sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables
is enabled.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
_dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
_dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
_dl_runtime_resolve_xsavec.
* sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
Removed.
(DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
instead of VEC_SIZE.
(REGISTER_SAVE_BND0): Removed.
(REGISTER_SAVE_BND1): Likewise.
(REGISTER_SAVE_BND3): Likewise.
(REGISTER_SAVE_RAX): Always defined to 0.
(VMOV): Removed.
(_dl_runtime_resolve_avx): Likewise.
(_dl_runtime_resolve_avx_slow): Likewise.
(_dl_runtime_resolve_avx_opt): Likewise.
(_dl_runtime_resolve_avx512): Likewise.
(_dl_runtime_resolve_avx512_opt): Likewise.
(_dl_runtime_resolve_sse): Likewise.
(_dl_runtime_resolve_sse_vex): Likewise.
(USE_FXSAVE): New.
(_dl_runtime_resolve_fxsave): Likewise.
(USE_XSAVE): Likewise.
(_dl_runtime_resolve_xsave): Likewise.
(USE_XSAVEC): Likewise.
(_dl_runtime_resolve_xsavec): Likewise.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
Removed.
(_dl_runtime_resolve_avx512_opt): Likewise.
(_dl_runtime_resolve_avx): Likewise.
(_dl_runtime_resolve_avx_opt): Likewise.
(_dl_runtime_resolve_sse): Likewise.
(_dl_runtime_resolve_sse_vex): Likewise.
(_dl_runtime_resolve_fxsave): New.
(_dl_runtime_resolve_xsave): Likewise.
(_dl_runtime_resolve_xsavec): Likewise.
When --enable-static-pie is used to configure glibc, we need to use
_dl_relocate_static_pie to compute load address in static PIE.
* sysdeps/m68k/dl-machine.h (elf_machine_load_address): Use
_dl_relocate_static_pie instead of _dl_start to compute load
address in static PIE.
After commit 37f802f864 (Remove
__need_IOV_MAX and __need_FOPEN_MAX), UIO_MAXIOV is no longer supplied
(indirectly) through <bits/stdio_lim.h>, so sysdeps/posix/sysconf.c no
longer sees the definition.
This patch adds a MIPS-specific bits/floatn.h header. This header is
identical to the ldbl-128 version except for the comment at the top;
the purpose is to ensure that a 32-bit MIPS build installs a header
that is the same as in a 64-bit MIPS build and so properly shows
_Float128 support to be available for 64-bit compilations, on the
general principle of an installation for one multilib providing
headers also suitable for other multilibs.
Tested with build-many-glibcs.py.
* sysdeps/mips/ieee754/bits/floatn.h: New file.
Similar to bug 21987 for SPARC, MIPS64 wrongly installs the ldbl-128
version of bits/long-double.h, meaning incorrect results when using
headers installed from a 64-bit installation for a 32-bit build. (I
haven't actually seen this cause build failures before its interaction
with bits/floatn.h did so - installed headers wrongly expecting
_Float128 to be available in a 32-bit configuration.)
This patch fixes the bug by moving the MIPS header to
sysdeps/mips/ieee754, which comes before sysdeps/ieee754/ldbl-128 in
the sysdeps directory ordering. (bits/floatn.h will need a similar
fix - duplicating the ldbl-128 version for MIPS will suffice - for
headers from a 32-bit installation to be correct for 64-bit builds.)
Tested with build-many-glibcs.py (compilers build for
mips64-linux-gnu, where there was previously a libstdc++ build failure
as at
<https://sourceware.org/ml/libc-testresults/2017-q4/msg00130.html>).
[BZ #22322]
* sysdeps/mips/bits/long-double.h: Move to ....
* sysdeps/mips/ieee754/bits/long-double.h: ... here.
This patch adds support for *f128 function aliases on platforms where
long double has the binary128 format (and thus GCC 7 provides the
_Float128 type with the same ABI as long double but as a distinct type
in terms of C type compatibility). This is the same API as provided
in glibc 2.26 for powerpc64le / x86_64 / x86 / ia64 where _Float128
has a different format from long double, with the bulk of the API
coming from TS 18661-3. All the functions alias the corresponding
long double functions, and __* function names are not provided since
those are only needed once for each floating-point format, not more
than once for different types with the same format (so for example,
-ffinite-math-only maps foof128 to __fool_finite, while type-generic
macros end up calling e.g. __issignalingl for _Float128 arguments on
such platforms).
The preparation for this feature was done in previous patches, so this
one just needs to add the relevant makefile and header definitions,
and update macro definitions of libm_alias_ldouble_other_r, to turn on
the feature, and update documentation and ABI baselines.
Tested (a) for x86_64, (b) for aarch64, (c) with build-many-glibcs.py
with both GCC 6 and GCC 7.
* sysdeps/ieee754/ldbl-128/Makeconfig: New file.
* sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise.
* sysdeps/ieee754/ldbl-128/float128-abi.h: Likewise.
* sysdeps/generic/libm-alias-ldouble.h: Include <bits/floatn.h>.
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128]
(libm_alias_ldouble_other_r): Also create _Float128 alias.
* sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h: Include
<bits/floatn.h>.
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128]
(libm_alias_ldouble_other_r): Also create _Float128 alias.
* manual/math.texi (Mathematics): Document additional architecture
support for _Float128.
* sysdeps/unix/sysv/linux/aarch64/libc.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
This patch rewrites aarch64 elf_machine_load_address to use the special
_DYNAMIC symbol instead of _dl_start.
The static address of the _DYNAMIC symbol is stored in the first GOT entry.
Here is the change which makes this solution work (part of binutils 2.24):
https://sourceware.org/ml/binutils/2013-06/msg00248.html
The i386 and x86_64 targets use the same method.
The original implementation relies on the R_AARCH64_ABS32 relocation
being resolved at link time and on the static address fitting in 32 bits.
However, in LP64, the address is normally 64 bits wide.
Here is the C version, which should be portable in all cases.
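The underlying calculation is a single subtraction: the run-time address of _DYNAMIC minus its link-time address taken from the first GOT entry. The sketch below is not the code from the patch. It takes both addresses as parameters so that it can run anywhere, and load_address_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper.  DYNAMIC_RUNTIME stands for the address of
   _DYNAMIC as seen by the running code; DYNAMIC_LINK_TIME stands for
   the value the static linker stored in the first GOT entry.  */
static uintptr_t
load_address_sketch (uintptr_t dynamic_runtime, uintptr_t dynamic_link_time)
{
  /* Simplifying assumption for this sketch: the object was not loaded
     below its link-time address.  */
  assert (dynamic_runtime >= dynamic_link_time);
  return dynamic_runtime - dynamic_link_time;
}
```

An object loaded at its link-time address yields 0, which is the expected result for a non-PIE executable.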
* sysdeps/aarch64/dl-machine.h (elf_machine_load_address): Use
_DYNAMIC symbol to calculate load address.
A performance regression was introduced by commit
84d74e427a "powerpc: Cleanup fenv_private.h".
In the powerpc implementation of SET_RESTORE_ROUND, there is the
following code in the "SET" function (slightly simplified):
--
old.fenv = fegetenv_register ();
new.l = (old.l & _FPU_MASK_TRAPS_RN) | r; (1)
if (new.l != old.l) (2)
{
if ((old.l & _FPU_ALL_TRAPS) != 0)
(void) __fe_mask_env ();
fesetenv_register (new.fenv); (3)
--
Line (1) sets the value of "new" to the current value of FPSCR,
but masks off summary bits, exceptions, non-IEEE mode, and
rounding mode, then ORs in the new rounding mode.
Line (2) compares this new value to the current value in order to
avoid setting a new value in the FPSCR (line (3)) unless something
significant has changed (exception enables or rounding mode).
The summary bits are not germane to the comparison, but are cleared
in "new" and preserved in "old", resulting in false negative
comparisons, and unnecessarily setting the FPSCR in those cases
with associated negative performance impacts.
The solution is to treat the summaries identically for "new" and "old":
- save them in SET
- leave them alone otherwise
- restore the saved values in RESTORE
Also minor changes:
- expand _FPU_MASK_RN to 64bit hex, to match other MASKs
- treat bit 52 (left-to-right) as reserved (since it is)
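The false mismatch can be reproduced with a toy model. The masks below use a simplified, hypothetical bit layout (the SK_ names are not the glibc macros or the real FPSCR definition); only the shape of the comparison follows lines (1) and (2) above.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical register layout for illustration only.  */
#define SK_RN_MASK      UINT32_C (0x00000003)  /* rounding mode */
#define SK_TRAP_MASK    UINT32_C (0x000000f8)  /* exception enables */
#define SK_SUMMARY_MASK UINT32_C (0xe0000000)  /* summary bits */

/* Bits kept from the old value.  The first mask also drops the
   summary bits; the second leaves them alone.  */
#define SK_KEEP_BUGGY (~(SK_RN_MASK | SK_TRAP_MASK | SK_SUMMARY_MASK))
#define SK_KEEP_FIXED (~(SK_RN_MASK | SK_TRAP_MASK))

/* Returns nonzero if the register would be rewritten, following
   new = (old & mask) | r;  if (new != old) ...  */
static int
needs_write (uint32_t old, uint32_t keep_mask, uint32_t rn)
{
  assert ((rn & ~SK_RN_MASK) == 0);
  uint32_t new_val = (old & keep_mask) | rn;
  return new_val != old;
}
```

With a summary bit set in the old value and an unchanged rounding mode, the first mask reports a difference and forces a write, while the second does not.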
* sysdeps/powerpc/fpu/fenv_private.h (_FPU_MASK_TRAPS_RN):
(_FPU_MASK_FRAC_INEX_RET_CC): Fix masks to more properly handle
summary bits.
(_FPU_MASK_RN): Expand _FPU_MASK_RN to 64bit hex.
(_FPU_MASK_NOT_RN_NI): Treat bit 52 (left-to-right) as reserved.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
This patch moves the generic definition from x86_64 init-arch
to a common header ifunc-init.h. No functional change is expected.
Checked on a x86_64-linux-gnu build.
* sysdeps/generic/ifunc-init.h: New file.
* sysdeps/x86/init-arch.h: Use generic ifunc-init.h.
With support for _Float128 functions on platforms where that type has
the same ABI as long double, as well as on platforms where it is
ABI-distinct, those functions will need to be exported from glibc's
shared libraries at appropriate symbol versions in each case.
This patch avoids duplication of lists of symbols to export by moving
the symbols other than __* to math/Versions and stdlib/Versions.
There, they are conditional on <float128-abi.h> defining
FLOAT128_VERSION and a default version of that header is added that
does not define that macro. Enabling the float128 function aliases
will then include adding a sysdeps/ieee754/ldbl-128/float128-abi.h
that defines FLOAT128_VERSION to GLIBC_2.27. Symbols __* remain in
sysdeps/ieee754/float128/Versions; those symbols should be present
only once per floating-point format, not once per type.
Note that if any platforms currently lacking support for a type with
binary128 format get glibc support for such a type in future (whether
only as _Float128, or also as a new long double format), and new libm
functions (present for all types) have been added by then, additional
macros will be needed to allow such functions to get a version of the
form "GLIBC_2.28 if the platform had _Float128 support by then, or the
later version at which that platform had _Float128 support added".
This is not however a preexisting condition, but would have applied
equally to the existing support for _Float128 as an ABI-distinct
type. New all-type libm functions should just be added to the
appropriate symbol version (currently GLIBC_2.27) for all types, with
such special-case handling for _Float128 versions (and _Float64x as
well in future) waiting until someone actually wants to add support
for _Float128 to an existing platform after a release in which that
platform and a post-2.26 libm function had support but that platform
lacked _Float128 support.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch. Also tested in conjunction
with the remaining changes to enable float128 aliases.
* sysdeps/generic/float128-abi.h: New file.
* sysdeps/ieee754/float128/Versions (FLOAT128_VERSION): Move
non-__prefixed symbols to ....
* math/Versions: ... here. Include <float128-abi.h>.
* stdlib/Versions: ... and here. Include <float128-abi.h>.
This patch adds support for building strtof128, wcstof128, strtof128_l
and wcstof128_l as aliases, in the case of __HAVE_FLOAT128 &&
!__HAVE_DISTINCT_FLOAT128.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch. Also tested together with
changes to enable float128 aliases.
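The alias mechanism itself can be sketched with the GCC weak and alias attributes on an ELF target. The names sketch_strtold and sketch_strtof128 are hypothetical, and the alias keeps the long double type so that the example does not need _Float128 support; in the patch the alias is declared with the _Float128 type instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the long double implementation.  */
long double
sketch_strtold (const char *nptr, char **endptr)
{
  assert (nptr != NULL);
  return strtold (nptr, endptr);
}

/* Second public name for the same code.  No extra function body is
   emitted; both symbols resolve to one address.  Requires GCC or a
   compatible compiler and an object format with alias support.  */
extern __typeof (sketch_strtold) sketch_strtof128
  __attribute__ ((weak, alias ("sketch_strtold")));
```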
* stdlib/strtold.c: Include <bits/floatn.h>
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128): Define
and later undefine as macro. Define as weak alias if
[!USE_WIDE_CHAR].
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128): Define
and later undefine as macro. Define as weak alias if
[USE_WIDE_CHAR].
* sysdeps/ieee754/ldbl-128/strtold_l.c [__HAVE_FLOAT128 &&
!__HAVE_DISTINCT_FLOAT128] (strtof128_l): Define and later
undefine as macro. Define as weak alias if [!USE_WIDE_CHAR].
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128_l):
Define and later undefine as macro. Define as weak alias if
[USE_WIDE_CHAR].
* sysdeps/ieee754/ldbl-64-128/strtold_l.c: Include
<bits/floatn.h>.
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128_l):
Define and later undefine as macro. Define as weak alias if
[!USE_WIDE_CHAR].
[__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128_l):
Define and later undefine as macro. Define as weak alias if
[USE_WIDE_CHAR].
This patch makes ldbl-64-128/s_nextafterl.c restore the default
weak_alias definition and use libm_alias_ldouble_other (having
undefined and redefined weak_alias for the include of
ldbl-128/s_nextafterl.c, so the libm_alias_ldouble use in the latter
file is ineffective).
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch. Also tested together with
changes to enable float128 aliases.
* sysdeps/ieee754/ldbl-64-128/s_nextafterl.c (weak_alias):
Undefine and restore default definition. Use
libm_alias_ldouble_other.
Normally, TLS relocations against local symbols are optimised by the linker
to be absolute. However, gold does not do this, and so it is possible to
end up with, for example, R_SPARC_TLS_DTPMOD64 referring to a local symbol.
Since sym_map is left as null in elf_machine_rela for the special local
symbol case, the relocation handling thinks it has nothing to do, and so
the module gets left as 0. Havoc then ensues when the variable in question
is accessed.
Before this fix, the main_local_gold program would receive a SIGBUS on
sparc64, and SIGSEGV on powerpc32. With this fix applied, that test now
passes like the rest of them.
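The effect of a null sym_map on a module-ID relocation can be modelled as below. This is a toy model, not elf_machine_rela: struct map_sketch, relocate_dtpmod and the apply_fix flag are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of a link map used here.  */
struct map_sketch
{
  size_t tls_modid;
};

/* Value stored by a DTPMOD-style relocation in object MAP.  Without
   the fix, a local symbol leaves sym_map null and the module ID
   stays 0.  */
static size_t
relocate_dtpmod (const struct map_sketch *map, int sym_is_local,
                 int apply_fix)
{
  assert (map != NULL);

  const struct map_sketch *sym_map = NULL;
  if (!sym_is_local)
    sym_map = map;  /* Stand-in for the result of a symbol lookup.  */
  else if (apply_fix)
    sym_map = map;  /* Local symbols are defined in MAP itself.  */

  return sym_map != NULL ? sym_map->tls_modid : 0;
}
```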
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela):
Assign sym_map to be map for local symbols, as TLS relocations
use sym_map to determine whether the symbol is defined and to
extract the TLS information.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise.
Fix the ifdef clause that was being used in the opposite way, setting
a wrong value of the carry bit.
This is also correcting 2 memory accesses that were mistakenly referring
to r0 while they were supposed to mean the immediate value 0.
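To show why the initial carry matters, here is a portable C model of a multi-limb addition with an explicit carry-in. It is a sketch only; add_n_sketch is a hypothetical name and the code is unrelated to the power7 assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: RP[0..N-1] = UP + VP + CARRY_IN, least
   significant limb first.  Returns the carry out of the top limb.  */
static uint64_t
add_n_sketch (uint64_t *rp, const uint64_t *up, const uint64_t *vp,
              size_t n, uint64_t carry_in)
{
  assert (carry_in <= 1);

  uint64_t carry = carry_in;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t sum = up[i] + carry;
      uint64_t c1 = sum < carry;    /* Overflow from adding the carry.  */
      sum += vp[i];
      uint64_t c2 = sum < vp[i];    /* Overflow from adding the limb.  */
      rp[i] = sum;
      carry = c1 | c2;              /* At most one of them can be set.  */
    }
  return carry;
}
```

Starting with a carry of 1 where 0 was intended changes the low limb and can propagate through every limb above it.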
[BZ #22142]
* stdio-common/tst-printf.c (fp_test): Add tests for DBL_MAX and
-DBL_MAX.
(do_test): Likewise.
* stdio-common/tst-printf.sh: Likewise.
* sysdeps/powerpc/powerpc64/power7/add_n.S: Invert the initial
ifdef clause in order to set the carry bit right. Replace r0 by
0 without changing the behavior.
This patch makes the SPARC fabsl implementations use libm_alias_ldouble,
to prepare them for also defining _Float128 function aliases.
Tested with build-many-glibcs.py that installed stripped shared
libraries (sparc64-linux-gnu and sparcv9-linux-gnu) are unchanged by
the patch.
* sysdeps/sparc/sparc32/fpu/s_fabsl.c: Include
<libm-alias-ldouble.h>.
(fabsl): Define using libm_alias_ldouble.
* sysdeps/sparc/sparc64/fpu/s_fabsl.c: Include
<libm-alias-ldouble.h>.
(fabsl): Define using libm_alias_ldouble.
Testing with changes to enable _Float128 function aliases shows that
the libm_alias_ldouble_other usage in ldbl-opt/w_lgamma_compatl.c does
not in fact work. Furthermore, it is unnecessary; the relevant
aliases get created through w_lgammal_compat2.c. This patch removes
the problem code.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch. Also tested in conjunction with
patches to enable _Float128 function aliases.
* sysdeps/ieee754/ldbl-opt/w_lgamma_compatl.c [BUILD_LGAMMA]:
Remove conditional code.
Testing with changes to enable _Float128 function aliases shows that
the libm_alias_ldouble_other usage in ldbl-opt/s_clog10l.c does not in
fact work, because __clog10l is defined with long_double_symbol rather
than as a normal C alias. This patch fixes this by renaming the
__clog10l__internal alias (not strictly necessary, but avoids a hack
with "__clog10l_interna" / "__clog10l__interna" as first argument to
libm_alias_ldouble_other) and using the renamed alias when calling
libm_alias_ldouble_other.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch. Also tested in conjunction with
patches to enable _Float128 function aliases.
* sysdeps/ieee754/ldbl-opt/s_clog10l.c (__clog10l__internal):
Rename to __clog10_internal_l.
(__clog10_internal_l): Define aliases using
libm_alias_ldouble_other instead of using libm_alias_ldouble_other
with __clog10.
Current GLIBC has two ways to implement the single thread optimization
on syscalls to avoid calling the cancellation path: either by using
global variables (__{libc,pthread}_multiple_thread) or by accessing
the TCB field (defined by TLS_MULTIPLE_THREADS_IN_TCB). Both the
variables and the macros to access their values are defined in the
architecture sysdep-cancel.h header.
This patch consolidates their definitions in a single header,
sysdeps/unix/sysv/linux/sysdep-cancel.h, and adds a new define
(SINGLE_THREAD_BY_GLOBAL) which the architecture defines if it prefers
to use the global variables instead of the TCB field. This is an
optimization, so if the architecture does not define it, the TCB
method will be used as default.
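The selection can be pictured as a single preprocessor conditional. In the sketch below everything except the SINGLE_THREAD_BY_GLOBAL name is a hypothetical stand-in: struct tcb_sketch, the two flag variables and the helper functions do not exist in glibc.

```c
#include <assert.h>

/* Hypothetical stand-ins for the thread control block field and the
   global flag.  */
struct tcb_sketch
{
  int multiple_threads;
};
static struct tcb_sketch sketch_tcb;
static int sketch_multiple_threads;

/* An architecture would define SINGLE_THREAD_BY_GLOBAL in its
   sysdep.h to opt in to the global variable.  It is left undefined
   here, so the TCB field is used by default.  */
#ifdef SINGLE_THREAD_BY_GLOBAL
# define SINGLE_THREAD_P_SKETCH (sketch_multiple_threads == 0)
#else
# define SINGLE_THREAD_P_SKETCH (sketch_tcb.multiple_threads == 0)
#endif

/* Record that a second thread exists, in both places.  */
static void
mark_multithreaded (void)
{
  sketch_tcb.multiple_threads = 1;
  sketch_multiple_threads = 1;
}

/* Returns 0 for the plain syscall path and 1 for the
   cancellation-aware path.  */
static int
syscall_path_sketch (void)
{
  assert (sketch_tcb.multiple_threads == sketch_multiple_threads);
  return SINGLE_THREAD_P_SKETCH ? 0 : 1;
}
```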
Checked on x86_64-linux-gnu and on a build with major touched
ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf,
hppa-linux-gnu, i686-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu,
mips-linux-gnu, mips64-linux-gnu, powerpc-linux-gnu,
powerpc64le-linux-gnu, s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu,
sparcv9-linux-gnu, sparc64-linux-gnu, tilegx-linux-gnu).
* sysdeps/unix/sysv/linux/aarch64/sysdep-cancel.h: Remove file.
* sysdeps/unix/sysv/linux/alpha/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/arm/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/mips/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/sh/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/tile/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/x86_64/sysdep-cancel.h: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h
(SINGLE_THREAD_BY_GLOBAL): Define.
* sysdeps/unix/sysv/linux/aarch64/sysdep.h (SINGLE_THREAD_BY_GLOBAL):
Likewise.
* sysdeps/unix/sysv/linux/alpha/sysdep.h (SINGLE_THREAD_BY_GLOBAL):
Likewise.
* sysdeps/unix/sysv/linux/arm/sysdep.h (SINGLE_THREAD_BY_GLOBAL):
Likewise.
* sysdeps/unix/sysv/linux/hppa/sysdep.h (SINGLE_THREAD_BY_GLOBAL):
Likewise.
* sysdeps/unix/sysv/linux/microblaze/sysdep.h
(SINGLE_THREAD_BY_GLOBAL): Likewise.
* sysdeps/unix/sysv/linux/x86_64/sysdep.h (SINGLE_THREAD_BY_GLOBAL):
Likewise.
This patch fixes ldbl-opt code to use generic libm alias macros in
preparation for getting _FloatN / _FloatNx aliases where appropriate.
Four functions are affected; they undefine and redefine alias macros
before including the implementations they wrap, in such a way that
_FloatN / _FloatNx aliases would not appear. s_clog10l.c undefines
and redefines declare_mgen_alias, so just needs a
libm_alias_ldouble_other call added. w_exp10l_compat.c undefines and
redefines weak_alias, but in fact does not need to do so, since
math/w_exp10l_compat.c uses libm_alias_ldouble and does not use
weak_alias other than through that, so the undefines and redefines of
weak_alias are removed. w_lgamma_compatl.c and w_remainderl_compat.c
are made to use libm_alias_ldouble_other in conjunction with restoring
the original definition of weak_alias so this is effective.
Tested with build-many-glibcs.py. Installed stripped shared libraries
are unchanged by this patch.
* sysdeps/ieee754/ldbl-opt/s_clog10l.c: Use
libm_alias_ldouble_other.
* sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (weak_alias): Do not
undefine and redefine.
[LIBM_SVID_COMPAT && !LONG_DOUBLE_COMPAT (libm, GLIBC_2_1)]
(exp10l): Do not define here.
* sysdeps/ieee754/ldbl-opt/w_lgamma_compatl.c [BUILD_LGAMMA]
(weak_alias): Undefine and redefine.
[BUILD_LGAMMA]: Use libm_alias_ldouble_other.
* sysdeps/ieee754/ldbl-opt/w_remainderl_compat.c
[LIBM_SVID_COMPAT] (weak_alias): Undefine and redefine here.
[LIBM_SVID_COMPAT]: Use libm_alias_ldouble_other.
Some libm functions are unable to use the generic alias macros such as
libm_alias_double because they have special symbol versioning
requirements for the main float, double or long double public names.
To facilitate adding _FloatN / _FloatNx function aliases in future,
it's still desirable to have generic macros those functions can use as
far as possible. This patch adds macros such as
libm_alias_double_other, which only define names for _FloatN /
_FloatNx aliases, not for float / double / long double. At present,
all these new macros do nothing, but they are called in the
appropriate places in macros such as libm_alias_double. This patch
also arranges for lgamma implementations, and the recently added
optimized float function implementations, to use the new macros to
make them ready for addition of _FloatN / _FloatNx aliases.
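The layering described above can be sketched as follows. This is a
simplified illustration, not the contents of libm-alias-double.h: the
demo_* macros and functions are hypothetical, and the alias attribute
assumes GCC or Clang on an ELF target.

```c
/* Hypothetical equivalent of a weak alias helper.  */
#define demo_weak_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));

/* Hook for future _FloatN / _FloatNx names.  It expands to nothing
   for now, mirroring the "do nothing at present" state.  */
#define demo_alias_double_other(from, to)

/* Main macro: define the public double name, then call the hook.  */
#define demo_alias_double(from, to) \
  demo_weak_alias (from, to)        \
  demo_alias_double_other (from, to)

/* Internal implementation name.  */
double
demo_impl_twice (double x)
{
  return 2.0 * x;
}

/* Public name demo_twice becomes an alias of demo_impl_twice.  */
demo_alias_double (demo_impl_twice, demo_twice)
```

A function with special versioning for its main public name would skip
demo_alias_double and call only demo_alias_double_other, so that it
still picks up any names added to the hook later.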
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by this patch.
* sysdeps/generic/libm-alias-double.h (libm_alias_double_other_r):
New macro.
(libm_alias_double_other): Likewise.
(libm_alias_double_r): Use libm_alias_double_other_r.
* sysdeps/generic/libm-alias-float.h (libm_alias_float_other_r):
New macro.
(libm_alias_float_other): Likewise.
(libm_alias_float_r): Use libm_alias_float_other_r.
* sysdeps/generic/libm-alias-float128.h
(libm_alias_float128_other_r): New macro.
(libm_alias_float128_other): Likewise.
(libm_alias_float128_r): Use libm_alias_float128_other_r.
* sysdeps/generic/libm-alias-ldouble.h
(libm_alias_ldouble_other_r): New macro.
(libm_alias_ldouble_other): Likewise.
(libm_alias_ldouble_r): Use libm_alias_ldouble_other_r.
* sysdeps/ieee754/ldbl-opt/libm-alias-double.h
(libm_alias_double_other_r): New macro.
(libm_alias_double_other): Likewise.
(libm_alias_double_r): Use libm_alias_double_other_r.
* sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h
(libm_alias_ldouble_other_r): New macro.
(libm_alias_ldouble_other): Likewise.
(libm_alias_ldouble_r): Use libm_alias_ldouble_other_r.
* math/w_lgamma_main.c: Include <libm-alias-double.h>.
[!USE_AS_COMPAT]: Use libm_alias_double_other.
* math/w_lgammaf_main.c: Include <libm-alias-float.h>.
[!USE_AS_COMPAT]: Use libm_alias_float_other.
* math/w_lgammal_main.c: Include <libm-alias-ldouble.h>.
[!USE_AS_COMPAT]: Use libm_alias_ldouble_other.
* math/w_exp2f.c: Use libm_alias_float_other.
* math/w_expf.c: Likewise.
* math/w_log2f.c: Likewise.
* math/w_logf.c: Likewise.
* math/w_powf.c: Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c: Include <libm-alias-float.h>.
[!__exp2f]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_expf.c: Include <libm-alias-float.h>.
[!__expf]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_log2f.c: Include <libm-alias-float.h>.
[!__log2f]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_logf.c: Include <libm-alias-float.h>.
[!__logf]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_powf.c: Include <libm-alias-float.h>.
[!__powf]: Use libm_alias_float_other.
Continuing the use of generic macros for defining libm function
aliases, in preparation for adding more _FloatN / _FloatNx function
names, this patch makes the lgamma_r functions use such macros.
declare_mgen_alias_r becomes a standard macro in math-type-macros.h
instead of being locally defined in w_lgamma_r_template.c. This in
turn must be defined by each math-type-macros-<type>.h. Rather than
providing an unused default in math-type-macros.h, that header is made
to give an error if math-type-macros-<type>.h failed to define
declare_mgen_alias or declare_mgen_alias_r. The compat lgamma_r
wrappers are updated similarly. The ldbl-opt versions are removed as
no longer needed.
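The "error instead of unused default" arrangement can be sketched as
below. The demo_* names are hypothetical, and the sketch defines a
plain wrapper function rather than a real symbol alias to stay
portable; it is not the actual content of math-type-macros.h.

```c
/* Stand-in for a per-type header such as math-type-macros-<type>.h:
   it must provide the alias macro.  */
#define DEMO_FLOAT double
#define demo_declare_alias_r(from, to)                 \
  DEMO_FLOAT to (DEMO_FLOAT x, int *signp)             \
  {                                                    \
    return from (x, signp);                            \
  }

/* Stand-in for the shared header: no silent default, just an error
   if the per-type header forgot to define the macro.  */
#ifndef demo_declare_alias_r
# error "per-type header must define demo_declare_alias_r"
#endif

/* Toy reentrant implementation: returns |x| and stores the sign.  */
static DEMO_FLOAT
demo_impl_r (DEMO_FLOAT x, int *signp)
{
  *signp = x < 0 ? -1 : 1;
  return x < 0 ? -x : x;
}

/* Public entry point created through the per-type macro.  */
demo_declare_alias_r (demo_impl_r, demo_public_r)
```

Removing the demo_declare_alias_r definition makes compilation stop at
the #error line instead of silently producing no public name.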
Tested for x86_64, and with build-many-glibcs.py. Installed stripped
shared libraries are unchanged except for powerpc64le (where the usual
issue applies that an ldbl-opt long double function previously used
long_double_symbol unconditionally and now the symbol versions on
powerpc64le mean weak_alias is used instead, resulting in the same
symbol versions in the final shared library but still enough
difference in the input objects for that library not to be
byte-identical).
* sysdeps/generic/math-type-macros.h [!declare_mgen_alias]: Give
error. Remove default definition of declare_mgen_alias.
[!declare_mgen_alias_r]: Likewise.
* sysdeps/generic/math-type-macros-double.h
[!declare_mgen_alias_r] (declare_mgen_alias_r): New macro.
* sysdeps/generic/math-type-macros-float.h [!declare_mgen_alias_r]
(declare_mgen_alias_r): Likewise.
* sysdeps/generic/math-type-macros-float128.h
[!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise.
* sysdeps/generic/math-type-macros-ldouble.h
[!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise.
* math/w_lgamma_r_template.c (declare_mgen_alias_r_x): Remove
macro.
(declare_mgen_alias_r_s): Likewise.
(declare_mgen_alias_r): Likewise.
* math/w_lgamma_r_compat.c: Include <libm-alias-double.h>.
(lgamma_r): Define using libm_alias_double_r.
* math/w_lgammaf_r_compat.c: Include <libm-alias-float.h>.
(lgammaf_r): Define using libm_alias_float_r.
* math/w_lgammal_r_compat.c: Include <libm-alias-ldouble.h>.
(lgammal_r): Define using libm_alias_ldouble_r.
* sysdeps/ieee754/ldbl-opt/w_lgamma_r_compat.c: Remove file.
* sysdeps/ieee754/ldbl-opt/w_lgammal_r_compat.c: Likewise.
The ldbl-opt version of w_scalbln.c is not in fact needed; it handles
compat symbol versions for libc, but this file isn't built for libc,
only for libm. This patch removes this file.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
* sysdeps/ieee754/ldbl-opt/w_scalbln.c: Remove file.
This patch makes the ldbl-128 and ldbl-96 implementations of fma use
libm_alias_double.
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/ldbl-128/s_fma.c: Include <libm-alias-double.h>.
[!__fma] (fma): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-96/s_fma.c: Include <libm-alias-double.h>.
[!__fma] (fma): Define using libm_alias_double.
This patch makes ldbl-128 functions use libm_alias_ldouble to define
function aliases. float128_private.h is updated accordingly. Most of
the ldbl-64-128 wrappers are removed as no longer needed with this
change (leaving those that involve versioning for functions in libc or
that shouldn't be exported from libm for _Float128 / _Float64x types
with the same format as long double).
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by this patch.
* sysdeps/ieee754/float128/float128_private.h: Include
<libm-alias-ldouble.h> and <libm-alias-float128.h>.
(libm_alias_ldouble_r): Undefine and redefine.
* sysdeps/ieee754/ldbl-128/s_asinhl.c: Include
<libm-alias-ldouble.h>.
(asinhl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_atanl.c: Include
<libm-alias-ldouble.h>.
(atanl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_cbrtl.c: Include
<libm-alias-ldouble.h>.
(cbrtl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_ceill.c: Include
<libm-alias-ldouble.h>.
(ceill): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_copysignl.c: Include
<libm-alias-ldouble.h>.
(copysignl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_cosl.c: Include
<libm-alias-ldouble.h>.
(cosl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_erfl.c: Include
<libm-alias-ldouble.h>.
(erfl): Define using libm_alias_ldouble.
(erfcl): Likewise.
* sysdeps/ieee754/ldbl-128/s_expm1l.c: Include
<libm-alias-ldouble.h>.
(expm1l): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_fabsl.c: Include
<libm-alias-ldouble.h>.
(fabsl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_floorl.c: Include
<libm-alias-ldouble.h>.
(floorl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_fmal.c: Include
<libm-alias-ldouble.h>.
(fmal): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_frexpl.c: Include
<libm-alias-ldouble.h>.
(frexpl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_fromfpl.c (fromfpl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_fromfpl_main.c: Include
<libm-alias-ldouble.h>.
* sysdeps/ieee754/ldbl-128/s_fromfpxl.c (fromfpxl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_getpayloadl.c: Include
<libm-alias-ldouble.h>.
(getpayloadl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_llrintl.c: Include
<libm-alias-ldouble.h>.
(llrintl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_llroundl.c: Include
<libm-alias-ldouble.h>.
(llroundl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_logbl.c: Include
<libm-alias-ldouble.h>.
(logbl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_lrintl.c: Include
<libm-alias-ldouble.h>.
(lrintl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_lroundl.c: Include
<libm-alias-ldouble.h>.
(lroundl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_modfl.c: Include
<libm-alias-ldouble.h>.
(modfl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Include
<libm-alias-ldouble.h>.
(nearbyintl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_nextafterl.c: Include
<libm-alias-ldouble.h>.
(nextafterl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_nextupl.c: Include
<libm-alias-ldouble.h>.
(nextupl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_remquol.c: Include
<libm-alias-ldouble.h>.
(remquol): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_rintl.c: Include
<libm-alias-ldouble.h>.
(rintl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_roundevenl.c: Include
<libm-alias-ldouble.h>.
(roundevenl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_roundl.c: Include
<libm-alias-ldouble.h>.
(roundl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_setpayloadl.c (setpayloadl): Define
using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_setpayloadl_main.c: Include
<libm-alias-ldouble.h>.
* sysdeps/ieee754/ldbl-128/s_setpayloadsigl.c (setpayloadsigl):
Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_sincosl.c: Include
<libm-alias-ldouble.h>.
(sincosl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_sinl.c: Include
<libm-alias-ldouble.h>.
(sinl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_tanhl.c: Include
<libm-alias-ldouble.h>.
(tanhl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_tanl.c: Include
<libm-alias-ldouble.h>.
(tanl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include
<libm-alias-ldouble.h>.
(totalorderl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include
<libm-alias-ldouble.h>.
(totalordermagl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_truncl.c: Include
<libm-alias-ldouble.h>.
(truncl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_ufromfpl.c (ufromfpl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-128/s_ufromfpxl.c (ufromfpxl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-64-128/s_copysignl.c: Include
<libm-alias-ldouble.h>.
(weak_alias): Do not undefine and redefine.
[IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine.
(copysignl): Define with long_double_symbol only if [IS_IN
(libc)].
* sysdeps/ieee754/ldbl-64-128/s_frexpl.c: Include
<libm-alias-ldouble.h>.
(weak_alias): Do not undefine and redefine.
[IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine.
(frexpl): Define with long_double_symbol only if [IS_IN (libc)].
* sysdeps/ieee754/ldbl-64-128/s_modfl.c: Include
<libm-alias-ldouble.h>.
(weak_alias): Do not undefine and redefine.
[IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine.
(modfl): Define with long_double_symbol only if [IS_IN (libc)].
* sysdeps/ieee754/ldbl-64-128/s_asinhl.c: Remove file.
* sysdeps/ieee754/ldbl-64-128/s_atanl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_ceill.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_cosl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_erfl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_expm1l.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_fabsl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_floorl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_fmal.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_llrintl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_llroundl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_logbl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_lrintl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_lroundl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_nearbyintl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_remquol.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_rintl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_roundl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_sincosl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_sinl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_tanhl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_tanl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_truncl.c: Likewise.
Various source files in ldbl-64-128 are redundant, because they wrap
files that no longer provide public symbols that need special
versioning (those symbols having moved to separate errno-setting
wrappers), or, in the case of w_scalblnl.c, because the type-generic
template now does everything required (it deals with symbol versioning
for use in libm, and this file is never built for libc anyway - the
compat scalbln* symbols in libc, as opposed to scalbn*, are only for
i386 and m68k and are aliases to the corresponding scalbn* symbols).
This patch removes those redundant files.
Tested with build-many-glibcs.py (for all ldbl-64-128 configurations)
that installed stripped shared libraries are unchanged by this patch.
* sysdeps/ieee754/ldbl-64-128/e_ilogbl.c: Remove file.
* sysdeps/ieee754/ldbl-64-128/s_log1pl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_scalblnl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_scalbnl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/w_scalblnl.c: Likewise.
Recent commit 59ba2d2b54 failed to add __memrchr_power8 to the
ifunc list. This patch adds it, and also discards unwanted bytes
for unaligned inputs in the power8 optimization.
2017-10-05 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* sysdeps/powerpc/powerpc64/multiarch/memrchr-ppc64.c: Revert
back to powerpc32 file.
* sysdeps/powerpc/powerpc64/multiarch/memrchr.c
(memrchr): Add __memrchr_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/power8/memrchr.S: Mask
extra bytes for unaligned inputs.
This patch makes ldbl-96 functions use libm_alias_ldouble to define
function aliases.
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/ldbl-96/s_asinhl.c: Include
<libm-alias-ldouble.h>.
(asinhl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_cbrtl.c: Include
<libm-alias-ldouble.h>.
(cbrtl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_copysignl.c: Include
<libm-alias-ldouble.h>.
(copysignl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_cosl.c: Include
<libm-alias-ldouble.h>.
(cosl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_erfl.c: Include
<libm-alias-ldouble.h>.
(erfl): Define using libm_alias_ldouble.
(erfcl): Likewise.
* sysdeps/ieee754/ldbl-96/s_fmal.c: Include
<libm-alias-ldouble.h>.
(fmal): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_frexpl.c: Include
<libm-alias-ldouble.h>.
(frexpl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_fromfpl.c (fromfpl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_fromfpl_main.c: Include
<libm-alias-ldouble.h>.
* sysdeps/ieee754/ldbl-96/s_fromfpxl.c (fromfpxl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_getpayloadl.c: Include
<libm-alias-ldouble.h>.
(getpayloadl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_llrintl.c: Include
<libm-alias-ldouble.h>.
(llrintl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_llroundl.c: Include
<libm-alias-ldouble.h>.
(llroundl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_lrintl.c: Include
<libm-alias-ldouble.h>.
(lrintl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_lroundl.c: Include
<libm-alias-ldouble.h>.
(lroundl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_modfl.c: Include
<libm-alias-ldouble.h>.
(modfl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_nextupl.c: Include
<libm-alias-ldouble.h>.
(nextupl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_remquol.c: Include
<libm-alias-ldouble.h>.
(remquol): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_roundevenl.c: Include
<libm-alias-ldouble.h>.
(roundevenl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_roundl.c: Include
<libm-alias-ldouble.h>.
(roundl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_setpayloadl.c (setpayloadl): Define
using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_setpayloadl_main.c: Include
<libm-alias-ldouble.h>.
* sysdeps/ieee754/ldbl-96/s_setpayloadsigl.c: Include
<libm-alias-ldouble.h>.
(setpayloadsigl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_sincosl.c: Include
<libm-alias-ldouble.h>.
(sincosl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_sinl.c: Include
<libm-alias-ldouble.h>.
(sinl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_tanhl.c: Include
<libm-alias-ldouble.h>.
(tanhl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_tanl.c: Include
<libm-alias-ldouble.h>.
(tanl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include
<libm-alias-ldouble.h>.
(totalorderl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include
<libm-alias-ldouble.h>.
(totalordermagl): Define using libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_ufromfpl.c (ufromfpl): Define using
libm_alias_ldouble.
* sysdeps/ieee754/ldbl-96/s_ufromfpxl.c (ufromfpxl): Define using
libm_alias_ldouble.
This is an optimized memmove implementation for the Qualcomm Falkor
processor core. Because of the way the falkor memcpy needs to be
written, code cannot easily be shared between memmove and memcpy as it
is in other aarch64 memcpy implementations, so this routine is
separate. The underlying principle is the same as that of memcpy
where it tries to use registers with the same lower 4 bits for
fetching the same stream, thus optimizing hardware prefetcher
performance.
The memcpy copy loop copies 64 bytes at a time using the same register
pair since that's the way to train the hardware prefetcher on the
falkor core. memmove cannot quite do that since it needs to avoid
overlaps, so it does the next best thing, i.e. has a 32 byte loop with
a 32 byte end (prefetch a loop ahead to account for overlapping
locations) with register pairs that alias so that they hit the same
prefetcher. Due to this difference in loop size, they have to
currently be separate implementations but efforts are on to try and
get memmove to fall back into memcpy whenever it can without simply
duplicating all of the code.
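For readers unfamiliar with why memmove needs its own loop, the
overlap constraint can be sketched in portable C. This is a generic
byte-at-a-time illustration under the hypothetical name demo_memmove,
not the Falkor assembly: it shows only the choice of copy direction,
not the 32-byte loop or the prefetcher-friendly register pairs.

```c
#include <stddef.h>
#include <stdint.h>

void *
demo_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Unsigned subtraction wraps when dst is below src, so this single
     test is true both for "dst below src" and for "no overlap".  */
  if ((uintptr_t) d - (uintptr_t) s >= n)
    {
      /* Forward copy never overwrites source bytes not yet read.  */
      for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    }
  else
    {
      /* dst starts inside [src, src + n): copy from the end.  */
      for (size_t i = n; i > 0; i--)
        d[i - 1] = s[i - 1];
    }
  return dst;
}
```

memcpy may assume the regions do not overlap and always run the
forward loop, which is why it can keep a single fixed access pattern.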
Performance:
The routine fares around 20-25% better than the generic memmove for
most medium to large sizes (i.e. > 128 bytes) for the new walking
memmove benchmark (memmove-walk) with an unexplained regression
between 1K and 2K. The minor regression is something worth looking
into for us, but the remaining gains are significant enough that we
would like this included upstream while we look into the cause of the
regression. Here is a snippet of the numbers as generated from the
microbenchmark by the compare_strings script. Comparisons are against
__memmove_generic:
Function: memmove
Variant: walk
__memmove_thunderx __memmove_falkor __memmove_generic
========================================================================================================================
<snip>
length=16384: 12508800.00 ( 6.09%) 11486800.00 ( 13.76%) 13319600.00
length=16400: 13614200.00 ( -0.67%) 11585000.00 ( 14.33%) 13523600.00
length=16385: 13448400.00 ( 0.10%) 11732700.00 ( 12.84%) 13461200.00
length=16399: 13594100.00 ( -0.22%) 11859600.00 ( 12.57%) 13564400.00
length=16386: 13211600.00 ( 1.13%) 11503800.00 ( 13.91%) 13362400.00
length=16398: 13218600.00 ( 2.12%) 11573200.00 ( 14.30%) 13504700.00
length=16387: 13510900.00 ( -0.37%) 11744200.00 ( 12.76%) 13461300.00
length=16397: 13603700.00 ( -0.15%) 11878200.00 ( 12.55%) 13583200.00
length=16388: 13461700.00 ( -0.13%) 11558000.00 ( 14.03%) 13444100.00
length=16396: 13517500.00 ( -0.03%) 11561300.00 ( 14.45%) 13513900.00
length=16389: 13534100.00 ( 0.17%) 11756800.00 ( 13.28%) 13556900.00
length=16395: 13585600.00 ( 0.11%) 11791800.00 ( 13.30%) 13601200.00
length=16390: 13480100.00 ( -0.13%) 11685500.00 ( 13.20%) 13462100.00
length=16394: 13529900.00 ( -0.23%) 11549800.00 ( 14.43%) 13498200.00
length=16391: 13595400.00 ( -0.26%) 11768200.00 ( 13.22%) 13560600.00
length=16393: 13567000.00 ( 0.20%) 11779700.00 ( 13.35%) 13594700.00
length=32768: 71308800.00 ( -6.53%) 50220800.00 ( 24.98%) 66939200.00
length=32784: 72100800.00 (-11.55%) 50114100.00 ( 22.47%) 64636300.00
length=32769: 71767000.00 ( -7.10%) 51238400.00 ( 23.54%) 67010000.00
length=32783: 70113700.00 (-40.95%) 51129000.00 ( -2.78%) 49744400.00
length=32770: 71367600.00 ( -6.52%) 50244700.00 ( 25.01%) 67000900.00
length=32782: 64366700.00 ( 4.71%) 50101400.00 ( 25.83%) 67545600.00
length=32771: 71440100.00 ( -6.51%) 51263900.00 ( 23.57%) 67074900.00
length=32781: 66993000.00 ( 0.34%) 51108300.00 ( 23.97%) 67220300.00
length=32772: 71443900.00 (-60.50%) 50062100.00 (-12.47%) 44512600.00
length=32780: 71759100.00 ( -6.58%) 50263200.00 ( 25.35%) 67328600.00
length=32773: 71714900.00 (-33.21%) 51076600.00 ( 5.12%) 53835400.00
length=32779: 71756900.00 ( -6.56%) 51290800.00 ( 23.83%) 67337800.00
length=32774: 59689300.00 (-34.55%) 50068400.00 (-12.86%) 44363300.00
length=32778: 71847500.00 (-18.20%) 50084100.00 ( 17.61%) 60786500.00
length=32775: 71599300.00 ( -6.54%) 51278200.00 ( 23.70%) 67204800.00
length=32777: 71862900.00 (-60.85%) 51094000.00 (-14.36%) 44677900.00
length=65536: 282848000.00 ( -6.60%) 199187000.00 ( 24.93%) 265325000.00
length=65552: 243285000.00 (-41.61%) 198512000.00 (-15.54%) 171805000.00
length=65537: 255415000.00 (-23.47%) 202499000.00 ( 2.11%) 206858000.00
length=65551: 280122000.00 (-62.95%) 203349000.00 (-18.29%) 171911000.00
length=65538: 283676000.00 (-14.46%) 198368000.00 ( 19.96%) 247848000.00
length=65550: 275566000.00 (-51.76%) 198494000.00 ( -9.31%) 181581000.00
length=65539: 283699000.00 ( -6.58%) 203453000.00 ( 23.57%) 266195000.00
length=65549: 286572000.00 ( -6.65%) 202607000.00 ( 24.60%) 268712000.00
length=65540: 283710000.00 ( -6.59%) 199161000.00 ( 25.17%) 266160000.00
length=65548: 237573000.00 ( 11.48%) 198462000.00 ( 26.06%) 268395000.00
length=65541: 284150000.00 ( -6.58%) 203273000.00 ( 23.75%) 266600000.00
length=65547: 286250000.00 ( -6.70%) 202594000.00 ( 24.48%) 268263000.00
length=65542: 284167000.00 ( -6.60%) 199122000.00 ( 25.31%) 266584000.00
length=65546: 285656000.00 ( -6.59%) 198443000.00 ( 25.95%) 268002000.00
length=65543: 284600000.00 ( -6.58%) 203247000.00 ( 23.89%) 267030000.00
length=65545: 285665000.00 ( -6.40%) 202575000.00 ( 24.55%) 268472000.00
<snip>
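The percentages in parentheses are consistent with a relative
difference against the __memmove_generic column, assuming the raw
values are timings where lower is better. A small sketch of that
arithmetic, using the hypothetical helper demo_improvement_pct (this
is not the compare_strings script itself):

```c
/* Relative improvement over the baseline, in percent.  Positive
   means the implementation took less time than the baseline.  */
double
demo_improvement_pct (double impl_time, double baseline_time)
{
  return (baseline_time - impl_time) / baseline_time * 100.0;
}
```

For length=16384, (13319600 - 11486800) / 13319600 is about 13.76%,
matching the __memmove_falkor entry, and for length=16400 the
__memmove_thunderx entry works out to about -0.67%.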
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add
memmove_falkor.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Likewise.
* sysdeps/aarch64/multiarch/memmove.c: Likewise.
* sysdeps/aarch64/multiarch/memmove_falkor.S: New file.
glibc has an add-ons mechanism to allow additional software to be
integrated into the glibc build. Such add-ons may be within the glibc
source tree, or outside it at a path passed to the --enable-add-ons
configure option.
localedata and crypt were once add-ons, distributed in separate
release tarballs, but long since stopped using that mechanism.
Linuxthreads was always an add-on. Ports spent some time as an add-on
with separate release tarballs, then was first moved into the glibc
source tree, then had its sysdeps files moved into the main sysdeps
hierarchy so the add-ons mechanism was no longer used. NPTL spent
some time as an add-on in the main glibc tree before stopping using
the add-on mechanism. libidn used to have separate release tarballs
but no longer does so, but still uses the add-ons mechanism within the
glibc source tree. Various other software has supported building with
the add-ons mechanism at times in the past, but I don't think any is
still widely used.
Add-ons involve significant, little-used complexity in the glibc build
system, and make it hard to understand what the space of possible
glibc configurations is. This patch removes the add-ons mechanism.
libidn is now built via the Subdirs mechanism to cause any
configuration using sysdeps/unix/inet to build libidn; HAVE_LIBIDN
(which effectively means shared libraries are available) is now
defined via sysdeps/unix/inet/configure. Various references to
add-ons around the source tree are removed (in the case of maint.texi,
the example list of sysdeps directories is still very out of date).
Externally maintained ports should now put their files in the normal
sysdeps directory structure rather than being arranged as add-ons;
they probably need to change e.g. elf.h anyway, rather than actually
being able to work just as a drop-in subtree. Hurd libpthread should
be arranged similarly to NPTL, so some files might go in a
hurd-pthreads (or similar) top-level directory in glibc, while sysdeps
files should go in the normal sysdeps directory structure (possibly in
hurd or hurd-pthreads subdirectories, just as there are nptl
subdirectories in the sysdeps tree).
Tested for x86_64, and with build-many-glibcs.py.
* configure.ac (--enable-add-ons): Remove option.
(machine): Do not mention add-ons in comment.
(LIBC_PRECONFIGURE): Likewise.
(add_ons): Remove variable and sanity checks and logic to locate
add-ons.
(add_ons_automatic): Remove variable.
(configured_add_ons): Likewise.
(add_ons_sfx): Likewise.
(add_ons_pfx): Likewise.
(add_on_subdirs): Likewise.
(sysnames_add_ons): Likewise. Remove loop over add-ons and
consideration of add-ons in Implies handling.
(sysdeps_add_ons): Likewise.
* configure: Regenerated.
* libidn/configure.ac: Remove.
* libidn/configure: Likewise.
* sysdeps/unix/inet/configure.ac: New file.
* sysdeps/unix/inet/configure: New generated file.
* sysdeps/unix/inet/Subdirs: Add libidn.
* Makeconfig (sysdeps-srcdirs): Remove variable.
(+sysdep_dirs): Do not include $(sysdeps-srcdirs).
($(common-objpfx)config.status): Do not depend on add-on files.
($(common-objpfx)shlib-versions.v.i): Do not mention add-ons in
comment.
(all-subdirs): Do not include $(add-on-subdirs).
* Makefile (dist-prepare): Do not use $(sysdeps-add-ons).
* config.make.in (add-ons): Remove variable.
(add-on-subdirs): Likewise.
(sysdeps-add-ons): Likewise.
* manual/Makefile (add-chapters): Remove.
($(objpfx)texis): Do not depend on $(add-chapters).
(nonexamples): Do not handle $(add-chapters).
(examples): Do not handle $(add-ons).
(chapters.% top-menu.%): Do not pass '$(add-chapters)' to
libc-texinfo.sh.
* manual/install.texi (Installation): Do not mention add-ons.
(--enable-add-ons): Do not document configure option.
* INSTALL: Regenerated.
* manual/libc-texinfo.sh: Do not handle $2 add-ons argument.
* manual/maint.texi (Hierarchy Conventions): Do not mention
add-ons.
* scripts/build-many-glibcs.py (Glibc.build_glibc): Do not use
--enable-add-ons.
* scripts/gen-sorted.awk: Do not handle Subdirs files from
add-ons.
* scripts/test-installation.pl: Do not handle glibc-compat add-on.
* sysdeps/nptl/Makeconfig: Do not mention add-ons in comment.
On i386, when multi-arch is enabled, all external functions must be
called via the PIC PLT in PIE, which requires setting up the EBX
register, since they may be IFUNC functions.
* config.h.in (NO_HIDDEN_EXTERN_FUNC_IN_PIE): New.
* include/libc-symbols.h (__hidden_proto_hiddenattr): Add check
for PIC and NO_HIDDEN_EXTERN_FUNC_IN_PIE.
* sysdeps/i386/configure.ac (NO_HIDDEN_EXTERN_FUNC_IN_PIE): New
AC_DEFINE if multi-arch is enabled.
* sysdeps/i386/configure: Regenerated.
This patch makes dbl-64 fma use libm_alias_double. The ldbl-opt
version is removed. The sparc32 version no longer needs to handle
compat symbols, while alpha needs a new wrapper to avoid getting the
ldbl-128 version (where ldbl-opt is earlier in the list of sysdeps
directories, so previously fma came from there).
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/dbl-64/s_fma.c: Include <libm-alias-double.h>.
(fma): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-opt/s_fma.c: Remove file.
* sysdeps/sparc/sparc32/fpu/s_fma.c: Do not include
<math_ldbl_opt.h>.
(fmal): Do not define as compat symbol here.
* sysdeps/alpha/fpu/s_fma.c: New file.
32-bit SPARC libm should have compat symbols for copysignl
(GLIBC_2.0), fabsl (GLIBC_2.0), fmal (GLIBC_2.1), pointing to the
double functions; they were present in glibc 2.8, for example, but are
now missing, probably lost when optimized SPARC function
implementations were added without appropriate compat symbol handling. The same
applies to copysignl in libc. This patch restores those compat
symbols.
Tested with build-many-glibcs.py for sparcv9-linux-gnu.
[BZ #22229]
* sysdeps/sparc/sparc32/fpu/s_copysign.S: Include
<math_ldbl_opt.h>
(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
and libc.
* sysdeps/sparc/sparc32/fpu/s_fabs.S: Include <math_ldbl_opt.h>.
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
* sysdeps/sparc/sparc32/fpu/s_fma.c: Include <math_ldbl_opt.h>.
(fmal): Define as compat symbol at version GLIBC_2_1 for libm.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S:
Include <math_ldbl_opt.h>
(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
and libc.
(compat_symbol): Undefine and redefine.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include
<math_ldbl_opt.h>
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
(compat_symbol): Undefine and redefine.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c
[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h>.
[HAVE_AS_VIS3_SUPPORT] (fmal): Define as compat symbol at version
GLIBC_2_1 for libm.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Add
GLIBC_2.0 copysignl symbol.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
GLIBC_2.0 copysignl and fabsl and GLIBC_2.1 fmal symbols.
Given my recent changes, sysdeps/alpha/fpu/s_nearbyint.c is no longer
needed: it just includes the dbl-64/wordsize-64 version, which is the
one that would be used anyway, and defines a compat symbol,
duplicating the same compat symbol defined by the dbl-64/wordsize-64
version through use of libm_alias_double. Thus, this patch removes
the redundant wrapper.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged for alpha.
* sysdeps/alpha/fpu/s_nearbyint.c: Remove file.
Without the SVID compat wrapper, yn(n,0) and ynf(n,0) do not raise
the divide-by-zero exception and may return inf with the wrong
sign for n < 0.
[BZ #22244]
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_yn): Fix x == 0 case.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_ynf): Likewise.
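The expected behaviour at x == 0 can be sketched as follows. The helper yn_at_zero is hypothetical and is not the code from e_jn.c; it only assumes the identity Y_{-n}(x) = (-1)^n Y_n(x), under which the pole is +inf for negative odd n and -inf otherwise.

```c
/* Sketch only: yn_at_zero is a hypothetical helper, not glibc code.
   It models the x == 0 case of yn (n, x) using the identity
   Y_{-n}(x) = (-1)^n * Y_n(x), so the result is +inf only when n is
   negative and odd, and -inf otherwise.  */
double
yn_at_zero (int n)
{
  /* volatile keeps the compiler from folding the division away, so the
     division by zero is performed at run time; on IEEE 754 hardware
     that is what raises the divide-by-zero exception.  */
  volatile double zero = 0.0;
  double sign = (n < 0 && (n & 1) != 0) ? 1.0 : -1.0;
  return sign / zero;
}
```

Producing the infinity by an actual division, rather than returning a constant, is what lets the exception flag be set without going through the wrapper.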
On 64-bit targets, if the SVID compat wrapper is suppressed (e.g. static linking),
log2(0) and log10(0) returned inf instead of -inf.
[BZ #22243]
* sysdeps/ieee754/dbl-64/wordsize-64/e_log10.c (__ieee754_log10): Use fabs.
* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c (__ieee754_log2): Likewise.
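A minimal sketch of why fabs helps in the zero case: dividing a negative constant by fabs (x) yields -inf for both +0 and -0, because fabs clears the sign of the zero. The helper name log_pole_result is hypothetical, and the numerator used in the real files may differ.

```c
#include <math.h>

/* Sketch only: log_pole_result is a hypothetical helper, not the code
   in e_log2.c or e_log10.c.  It shows the value returned for a zero
   argument when the divisor is fabs (x).  */
double
log_pole_result (double x)
{
  /* fabs maps -0.0 to +0.0, so the quotient is -inf for either zero.
     Dividing by x itself would give +inf for x == -0.0.  */
  return -1.0 / fabs (x);
}
```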
Don't use "leal main@GOTOFF(%ebx), %eax" since main may be in a
shared object. Linker will convert "movl main@GOT(%ebx), %eax"
to "leal main@GOTOFF(%ebx), %eax" if main is defined locally.
* sysdeps/i386/start.S: Replace "leal main@GOTOFF(%ebx), %eax" with
"movl main@GOT(%ebx), %eax".
This code is used in non-PIE static executables and static PIE. It checks
whether _DYNAMIC is undefined before using it to compute the load address.
But not all targets can convert an access to _DYNAMIC via the GOT, which
needs a dynamic relocation, into a PC-relative access at link time.
* sysdeps/i386/dl-machine.h (elf_machine_load_address): Don't
allow undefined _DYNAMIC in PIE libc.a.
* sysdeps/x86_64/dl-machine.h (elf_machine_load_address):
Likewise.
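The "is the symbol defined?" test mentioned above is typically written as a null check on a weak reference. The sketch below shows that pattern with a hypothetical symbol, demo_optional_table, standing in for _DYNAMIC; it is an assumption about the general technique, not a copy of elf_machine_load_address. It relies on the GCC/Clang weak attribute.

```c
#include <stddef.h>

/* Sketch only: demo_optional_table is a hypothetical stand-in for a
   linker-provided symbol such as _DYNAMIC.  Nothing defines it in this
   program, so as a weak reference its address resolves to null.  */
extern const char demo_optional_table[] __attribute__ ((weak));

/* Return nonzero if the weak symbol was defined at link time.  On many
   targets this address is loaded through the GOT, which is why such a
   check can require a dynamic relocation in position-independent
   code.  */
int
demo_table_is_defined (void)
{
  return demo_optional_table != NULL;
}
```

Code that runs before relocations have been applied cannot rely on a GOT entry being filled in, which is the situation the entries above work around.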
Since mips can't convert an access to _DYNAMIC via the GOT, which needs a
dynamic relocation, into a PC-relative access at link time, don't check
_DYNAMIC in elf_machine_load_address.
* sysdeps/mips/dl-machine.h (elf_machine_load_address): Don't
check _DYNAMIC.
Since arm can't convert an access to _DYNAMIC via the GOT, which needs a
dynamic relocation, into a PC-relative access at link time, don't check
_DYNAMIC in elf_machine_load_address.
* sysdeps/arm/dl-machine.h (elf_machine_load_address): Don't
check _DYNAMIC.
This patch makes dbl-64 modf use libm_alias_double. Both the dbl-64
and dbl-64/wordsize-64 versions are changed, and the ldbl-opt version
is changed to define the libc compat symbol only. Because of
multiarch wrappers, the changed implementations are made not to define
aliases at all if __modf is defined as a macro, as with other
functions, so avoiding duplicate compat symbols while allowing those
wrappers to be simplified.
Tested for x86_64, and verified with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/dbl-64/s_modf.c: Include <libm-alias-double.h>.
(modf): Define using libm_alias_double, only if [!__modf].
* sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c: Include
<libm-alias-double.h>.
(modf): Define using libm_alias_double, only if [!__modf].
* sysdeps/ieee754/ldbl-opt/s_modf.c (modfl): Only define libc
compat symbol here.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-ppc32.c
(weak_alias): Do not undefine and redefine.
(strong_alias): Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c
(weak_alias): Likewise.
(strong_alias): Likewise.
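The "define no aliases if the internal name is a macro" convention can be sketched as below. All names are hypothetical: impl_modf plays the role of __modf, and the #define at the top simulates what a multiarch wrapper would do before including the generic source. The toy implementation is only valid for finite values that fit in a long long.

```c
/* Sketch only: all names are hypothetical.  A multiarch wrapper would
   place a definition like this before including the generic source, to
   rename the implementation it is about to wrap.  */
#define impl_modf impl_modf_generic

/* Generic implementation.  Toy version: valid only for finite values
   that fit in a long long.  */
double
impl_modf (double x, double *iptr)
{
  *iptr = (double) (long long) x;
  return x - *iptr;
}

/* The generic file defines public aliases only when nobody renamed the
   implementation, so the wrapper can define them itself exactly once.  */
#ifndef impl_modf
# define GENERIC_DEFINES_ALIASES 1
#else
# define GENERIC_DEFINES_ALIASES 0
#endif
```

With this guard in the generic file, the wrapper no longer has to undefine and redefine the alias macros around the include.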
This patch makes dbl-64 logb use libm_alias_double. Both the dbl-64
and dbl-64/wordsize-64 versions are changed, and the ldbl-opt version
is removed. Because of multiarch wrappers, the changed
implementations are made not to define aliases at all if __logb is
defined as a macro, as with other functions, so avoiding duplicate
compat symbols while allowing those wrappers to be simplified.
Tested for x86_64, and verified with build-many-glibcs.py that
installed stripped shared libraries are unchanged (except on alpha
where changes from using the wordsize-64 version are expected).
* sysdeps/ieee754/dbl-64/s_logb.c: Include <libm-alias-double.h>.
(logb): Define using libm_alias_double, only if [!__logb].
* sysdeps/ieee754/dbl-64/wordsize-64/s_logb.c: Include
<libm-alias-double.h>.
(logb): Define using libm_alias_double, only if [!__logb].
* sysdeps/ieee754/ldbl-opt/s_logb.c: Remove file.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-ppc32.c
(weak_alias): Do not undefine and redefine.
(strong_alias): Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c
(weak_alias): Likewise.
(strong_alias): Likewise.
For static PIE code, PIC is defined and SHARED is undefined. We
should check SHARED instead of PIC for SYSCALL_ERROR_NAME.
* sysdeps/unix/sysv/linux/tile/sysdep.h (SYSCALL_ERROR_NAME):
Check SHARED instead of PIC.
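The difference between the two tests can be sketched with the preprocessor alone. The DEMO_* macros are hypothetical, and the two #define/#undef lines at the top merely simulate a static PIE build (PIC defined, SHARED undefined); a real build gets these from the build system.

```c
/* Sketch only: simulate the macro state of a static PIE build.  In a
   real build these come from the build system, not from the source.  */
#define PIC 1
#undef SHARED

/* Selection keyed on PIC: wrongly treats static PIE like a shared
   library build.  */
#ifdef PIC
# define DEMO_PIC_TEST_SAYS_SHARED 1
#else
# define DEMO_PIC_TEST_SAYS_SHARED 0
#endif

/* Selection keyed on SHARED: static PIE is handled as a static build.  */
#ifdef SHARED
# define DEMO_SHARED_TEST_SAYS_SHARED 1
#else
# define DEMO_SHARED_TEST_SAYS_SHARED 0
#endif
```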
This patch makes the implementation of fmaf in the dbl-64 directory
use libm_alias_float.
Tested for x86_64, and verified with build-many-glibcs.py that
installed stripped shared libraries are unchanged by this patch.
* sysdeps/ieee754/dbl-64/s_fmaf.c: Include <libm-alias-float.h>.
[!__fmaf] (fmaf): Define using libm_alias_float.
This patch makes dbl-64 frexp use libm_alias_double. Both the dbl-64
and dbl-64/wordsize-64 versions are changed; the ldbl-opt version is
made to define only the libc frexpl compat symbol, now that the generic
code handles the libm compat symbol automatically.
Tested for x86_64, and verified with build-many-glibcs.py that
installed stripped shared libraries are unchanged by this patch.
* sysdeps/ieee754/dbl-64/s_frexp.c: Include <libm-alias-double.h>.
(frexp): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c: Include
<libm-alias-double.h>.
(frexp): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-opt/s_frexp.c (frexpl): Only define libc
compat symbol here.
All representations of floating-point numbers in types with IEC 60559
binary exchange format are canonical. On the other hand, types with IEC
60559 extended formats, such as those implemented under ldbl-96 and
ldbl-128ibm, contain representations that are not canonical.
TS 18661-1 introduced the type-generic macro iscanonical, which returns
whether a floating-point value is canonical or not. In Glibc, this
type-generic macro is implemented using the macro __MATH_TG, which, when
support for float128 is enabled, relies on __builtin_types_compatible_p
to select between floating-point types. However, this use of
iscanonical breaks C++ applications, because the builtin is only
available in C mode.
This patch provides a C++ implementation of iscanonical that relies on
function overloading, rather than builtins, to select between
floating-point types.
Unlike the C++ implementations for iszero and issignaling, this
implementation ignores __NO_LONG_DOUBLE_MATH. The double type always
matches IEC 60559 double format, which is always canonical. Thus, when
double and long double are the same (__NO_LONG_DOUBLE_MATH), iscanonical
always returns 1 and is not implemented with __MATH_TG.
Tested for powerpc64, powerpc64le and x86_64.
[BZ #22235]
* math/math.h: Trivial fix for unbalanced parentheses in comment.
* math/Makefile [CXX] (tests): Add test-math-iscanonical.cc.
(CFLAGS-test-math-iscanonical.cc): New variable.
* math/test-math-iscanonical.cc: New file.
* sysdeps/ieee754/ldbl-96/bits/iscanonical.h (iscanonical):
Provide a C++ implementation based on function overloading,
rather than using __MATH_TG, which uses C-only builtins.
* sysdeps/ieee754/ldbl-128ibm/bits/iscanonical.h (iscanonical):
Likewise.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iscanonical.cc): New variable.
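For context, the C-only selection mechanism that the description says __MATH_TG relies on looks roughly like the sketch below. The DEMO_TG macro and demo_kind_* functions are hypothetical and much simpler than the real macro; the point is only that the dispatch is built from __builtin_choose_expr and __builtin_types_compatible_p, which are not usable when the header is compiled as C++, hence the overloading-based replacement.

```c
/* Sketch only: the demo_* names are hypothetical.  Uses the GCC/Clang C
   builtins __builtin_choose_expr and __builtin_types_compatible_p,
   which are not available when compiling as C++.  */

static int demo_kind_f (float x) { (void) x; return 1; }
static int demo_kind_d (double x) { (void) x; return 2; }
static int demo_kind_l (long double x) { (void) x; return 3; }

/* Pick one of the three functions at compile time from the static type
   of the argument.  */
#define DEMO_TG(x)                                                    \
  __builtin_choose_expr                                               \
  (__builtin_types_compatible_p (__typeof__ (x), float),              \
   demo_kind_f (x),                                                   \
   __builtin_choose_expr                                              \
   (__builtin_types_compatible_p (__typeof__ (x), double),            \
    demo_kind_d (x),                                                  \
    demo_kind_l (x)))
```

In C++ the same selection falls out of ordinary overload resolution on three functions with the same name, so no builtin is needed there.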
This patch makes more dbl-64 functions use libm_alias_double to define
function aliases. Specifically, it makes the change for functions
with dbl-64/wordsize-64 versions, changing both the dbl-64 and
dbl-64/wordsize-64 versions and removing the ldbl-opt wrappers.
Functions are excluded from this patch if there are complications
because of versions of those functions also present in libc, or
architecture-specific wrappers round these files.
Tested for x86_64, and with build-many-glibcs.py. Installed stripped
shared libraries are unchanged except for alpha (where increased use
of dbl-64/wordsize-64 files, where previously ldbl-opt files that
wrapped dbl-64 files were used, was expected to result in different,
better code).
* sysdeps/ieee754/dbl-64/s_ceil.c: Include <libm-alias-double.h>.
(ceil): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_floor.c: Include <libm-alias-double.h>.
(floor): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_llround.c: Include
<libm-alias-double.h>.
(llround): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_lround.c: Include
<libm-alias-double.h>.
(lround): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_nearbyint.c: Include
<libm-alias-double.h>.
(nearbyint): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_remquo.c: Include
<libm-alias-double.h>.
(remquo): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_rint.c: Include <libm-alias-double.h>.
(rint): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_round.c: Include <libm-alias-double.h>.
(round): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_trunc.c: Include <libm-alias-double.h>.
(trunc): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Include
<libm-alias-double.h>.
(ceil): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Include
<libm-alias-double.h>.
(floor): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_llround.c: Include
<libm-alias-double.h>.
(llround): Define using libm_alias_double.
[_LP64] (lround): Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Include
<libm-alias-double.h>.
[!_LP64] (lround): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Include
<libm-alias-double.h>.
(nearbyint): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c: Include
<libm-alias-double.h>.
(remquo): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Include
<libm-alias-double.h>.
(rint): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Include
<libm-alias-double.h>.
(round): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Include
<libm-alias-double.h>.
(trunc): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-opt/s_ceil.c: Remove file.
* sysdeps/ieee754/ldbl-opt/s_floor.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_llround.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_lround.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_nearbyint.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_remquo.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_rint.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_round.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_trunc.c: Likewise.
(&_dl_main_map) is used instead of (&bootstrap_map) to bootstrap static
PIE. Define BOOTSTRAP_MAP as (&_dl_main_map) to avoid hardcoding
(&bootstrap_map).
* elf/rtld.c (BOOTSTRAP_MAP): New.
(RESOLVE_MAP): Replace (&bootstrap_map) with BOOTSTRAP_MAP.
* sysdeps/hppa/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC):
Likewise.
* sysdeps/ia64/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC):
Likewise.
* sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC):
Likewise.
On the Hurd, the rtld needs to see its own dumb versions of a few functions
(defined in sysdeps/mach/hurd/dl-sysdep.c) overridden by libc's versions once
loaded. rtld should thus not have hidden attribute for these. To achieve this,
the Hurd port used to just define NO_HIDDEN, which disables it completely. For
now, this changes that to disabling it for all rtld functions, for simplicity.
See Roland's comment on https://sourceware.org/bugzilla/show_bug.cgi?id=15605#c5
The ld.so numbers remain at
8 .rel.plt 000000c8 00000c24 00000c24 00000c24 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .plt 000001a0 00000cf0 00000cf0 00000cf0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
10 .plt.got 00000010 00000e90 00000e90 00000e90 2**3
CONTENTS, ALLOC, LOAD, READONLY, CODE
18 .got.plt 00000070 0002d000 0002d000 0002c000 2**2
CONTENTS, ALLOC, LOAD, DATA
which is about 3 times as much as on Linux.
The libc.so numbers get divided by 3 (the remainder is mostly RPC stub calls).
* include/libc-symbols.h [NO_RTLD_HIDDEN] (rtld_hidden_proto,
rtld_hidden_tls_proto, rtld_hidden_def, rtld_hidden_weak,
rtld_hidden_ver, rtld_hidden_data_def, rtld_hidden_data_weak,
rtld_hidden_data_ver): Define to empty.
* include/assert.h [IS_IN(rtld) && NO_RTLD_HIDDEN] (__assert_fail,
__assert_perror_fail): Likewise.
* include/dirent.h [IS_IN(rtld) && NO_RTLD_HIDDEN]
(__rewinddir): Likewise.
* include/libc-internal.h [IS_IN(rtld) && NO_RTLD_HIDDEN]
(__profile_frequency): Likewise.
* include/setjmp.h (__sigsetjmp): Likewise.
* include/signal.h [IS_IN(rtld) && NO_RTLD_HIDDEN] (__sigaction,
__libc_sigaction): Likewise.
* include/stdlib.h [NO_RTLD_HIDDEN] (unsetenv, __strtoul_internal): Do
not set hidden attribute.
* include/string.h [IS_IN(rtld) && NO_RTLD_HIDDEN] (__stpcpy, __strdup,
__strerror_r, __strsep_g, memchr, memcmp, memcpy, memmove, memset,
rawmemchr, stpcpy, strchr, strcmp, strlen, strnlen, strsep): Likewise.
* include/sys/stat.h [IS_IN(rtld) && NO_RTLD_HIDDEN] (__fxstat,
__fxstat64, __lxstat, __lxstat64, __xstat, __xstat64,
__fxstatat64): Likewise.
* include/sys/utsname.h [IS_IN(rtld) && NO_RTLD_HIDDEN]
(__uname): Likewise.
* include/sysdeps/generic/_itoa.h [IS_IN(rtld) && NO_RTLD_HIDDEN]
(_itoa_upper_digits, _itoa_lower_digits): Likewise.
* sysdeps/mach/hurd/configure.ac (NO_HIDDEN): Do not set.
(NO_RTLD_HIDDEN): Set.
* sysdeps/mach/hurd/configure: Refresh.
* config.h.in: Refresh.
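The effect of a switch like NO_RTLD_HIDDEN can be sketched as a visibility macro that expands to nothing when the switch is set. DEMO_NO_RTLD_HIDDEN, demo_rtld_hidden and demo_profile_frequency are hypothetical stand-ins; the real macros in include/libc-symbols.h are more elaborate. The #define at the top simulates a configuration that sets the switch.

```c
/* Sketch only: hypothetical names.  This define simulates a port that
   asks for rtld functions to keep default visibility.  */
#define DEMO_NO_RTLD_HIDDEN 1

#ifdef DEMO_NO_RTLD_HIDDEN
/* Expands to nothing: the symbol keeps default visibility and can be
   overridden by another object's definition at run time.  */
# define demo_rtld_hidden
# define DEMO_IS_HIDDEN 0
#else
/* Hidden visibility: references bind locally, without GOT or PLT.  */
# define demo_rtld_hidden __attribute__ ((visibility ("hidden")))
# define DEMO_IS_HIDDEN 1
#endif

extern int demo_profile_frequency (void) demo_rtld_hidden;

int
demo_profile_frequency (void)
{
  return 100;
}
```

Limiting the switch to rtld declarations leaves the rest of the library free to keep its hidden symbols, which matches the drop in libc.so relocation counts described above.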
This patch makes the dbl-64 atan and tan implementations use
libm_alias_double, removing the corresponding ldbl-opt wrappers.
Tested for x86_64, and with build-many-glibcs.py. Installed stripped
shared libraries are unchanged on non-ldbl-opt platforms. For
ldbl-opt configurations, the patch has the effect of causing
compat_symbol to define atanl and tanl in terms of __atan and __tan
instead of in terms of atan and tan, which is enough to change the
installed stripped libm.so.
* sysdeps/ieee754/dbl-64/s_atan.c: Include <libm-alias-double.h>.
(atan): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_tan.c: Include <libm-alias-double.h>.
(tan): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-opt/s_atan.c: Remove file.
* sysdeps/ieee754/ldbl-opt/s_tan.c: Likewise.
This patch converts the dbl-64 implementations of atan and tan into
weak aliases of __atan and __tan, in preparation for making them use
libm_alias_double. Consequent changes are made to the x86_64
multiarch versions wrapping round them (with the dbl-64 functions,
like other such functions, being made not to define their aliases at
all if __atan or __tan are defined as macros by an including file).
Tested for x86_64, and with build-many-glibcs.py.
* sysdeps/ieee754/dbl-64/s_atan.c (atan): Rename to __atan and
define as weak alias of __atan. Do not define any aliases if
[__atan].
[NO_LONG_DOUBLE] (__atanl): Define as strong alias of __atan.
[NO_LONG_DOUBLE] (atanl): Define as weak alias of __atanl.
* sysdeps/ieee754/dbl-64/s_tan.c (tan): Rename to __tan and define
as weak alias of __tan. Do not define any aliases if [__tan].
[NO_LONG_DOUBLE] (__tanl): Define as strong alias of __tan.
[NO_LONG_DOUBLE] (tanl): Define as weak alias of __tanl.
* sysdeps/x86_64/fpu/multiarch/s_atan-avx.c (atan): Rename to
__atan.
* sysdeps/x86_64/fpu/multiarch/s_atan-fma.c (atan): Likewise.
* sysdeps/x86_64/fpu/multiarch/s_atan-fma4.c (atan): Likewise.
* sysdeps/x86_64/fpu/multiarch/s_atan.c (atan): Rename to __atan
and define as weak alias of __atan.
* sysdeps/x86_64/fpu/multiarch/s_tan-avx.c (tan): Rename to
__tan.
* sysdeps/x86_64/fpu/multiarch/s_tan-fma.c (tan): Likewise.
* sysdeps/x86_64/fpu/multiarch/s_tan-fma4.c (tan): Likewise.
* sysdeps/x86_64/fpu/multiarch/s_tan.c (tan): Rename to __tan and
define as weak alias of __tan.
The new generic logf, log2f and powf code doesn't need wrappers any more:
it sets errno inline, so only use the wrappers on targets that need them.
* sysdeps/ieee754/flt-32/e_log2f.c (__log2f): Define without wrapper.
* sysdeps/ieee754/flt-32/e_logf.c (__logf): Likewise.
* sysdeps/ieee754/flt-32/e_powf.c (__powf): Likewise.
* sysdeps/ieee754/flt-32/w_log2f.c: New file.
* sysdeps/ieee754/flt-32/w_logf.c: New file.
* sysdeps/ieee754/flt-32/w_powf.c: New file.
* sysdeps/i386/fpu/w_log2f.c: New file.
* sysdeps/i386/fpu/w_logf.c: New file.
* sysdeps/i386/fpu/w_powf.c: New file.
* sysdeps/m68k/m680x0/fpu/w_log2f.c: New file.
* sysdeps/m68k/m680x0/fpu/w_logf.c: New file.
* sysdeps/m68k/m680x0/fpu/w_powf.c: New file.
The new generic expf and exp2f code doesn't need wrappers any more: it
sets errno inline, so only use the wrappers on targets that need them.
(If the wrapper is needed, then the top-level wrapper code is included;
otherwise an empty w_exp*f.c is used to suppress the wrapper.)
A powerpc64 expf implementation includes the expf C code directly, which
needed some changes.
* sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper.
* sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise.
* sysdeps/ieee754/flt-32/w_exp2f.c: New file.
* sysdeps/ieee754/flt-32/w_expf.c: New file.
* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for
the new expf code.
* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file.
* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file.
* sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file.
* sysdeps/m68k/m680x0/fpu/w_expf.c: New file.
* sysdeps/i386/fpu/w_exp2f.c: New file.
* sysdeps/i386/fpu/w_expf.c: New file.
* sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file.
* sysdeps/x86_64/fpu/w_expf.c: New file.
Vectorized loops are used for sizes greater than 32B to improve
performance over power7 optimization. This shows as an average
of 25% improvement depending on the position of the search
character. The performance is the same for shorter strings.
Hide internal fadvise64/fallocate64 functions to allow direct access
within libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/posix_fadvise64.c
(__posix_fadvise64_l64): Add libc_hidden_proto and
libc_hidden_def.
* sysdeps/unix/sysv/linux/posix_fallocate64.c
(__posix_fallocate64_l64): Likewise.
Hide internal __sched_setaffinity_new function to allow direct access
within libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/sched_setaffinity.c
(__sched_setaffinity_new): Add libc_hidden_proto and
libc_hidden_def.
Hide internal __glob64 function to allow direct access within libc.so
and libc.a without using GOT nor PLT.
[BZ #18822]
* include/glob.h (__glob64): Add libc_hidden_proto.
* sysdeps/unix/sysv/linux/glob64.c (__glob64): Add
libc_hidden_def.
Hide internal __new_getrlimit function to allow direct access within
libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/getrlimit64.c (__new_getrlimit): Add
attribute_hidden.
Hide internal __tcgetattr function to allow direct access within libc.so
and libc.a without using GOT nor PLT.
[BZ #18822]
* include/termios.h (__tcgetattr): Add libc_hidden_proto.
* sysdeps/unix/bsd/tcgetattr.c (__tcgetattr): Add
libc_hidden_def.
* sysdeps/unix/sysv/linux/tcgetattr.c (__tcgetattr): Likewise.
* termios/tcgetattr.c (__tcgetattr): Likewise.
Hide internal __ifreq function to allow direct access within libc.so and
libc.a without using GOT nor PLT.
[BZ #18822]
* include/ifreq.h: New file.
* sysdeps/generic/ifreq.h (__if_nextreq): Removed.
(__ifreq): Likewise.
* sysdeps/mach/hurd/ifreq.h (__if_nextreq): Removed.
(__ifreq): Likewise.
Hide internal idna functions to allow direct access within libc.so and
libc.a without using GOT nor PLT.
[BZ #18822]
* include/idna.h: New file.
* inet/getnameinfo.c: Include <idna.h> instead of
<libidn/idna.h>.
(__idna_to_unicode_lzlz): Removed.
* sysdeps/posix/getaddrinfo.c: Include <idna.h> instead of
<libidn/idna.h>.
(__idna_to_ascii_lz): Removed.
(__idna_to_unicode_lzlz): Likewise.
Hide internal __get_sol function to allow direct access within libc.so
and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/getsourcefilter.c: Include
"getsourcefilter.h".
* sysdeps/unix/sysv/linux/getsourcefilter.h: New file.
* sysdeps/unix/sysv/linux/setsourcefilter.c: Include
"getsourcefilter.h".
(__get_sol): Removed.
Hide internal __bsd_getpt function to allow direct access within
libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/getpt.c (__bsd_getpt): Add
attribute_hidden.
Hide internal __sysinfo function to allow direct access within libc.so and
libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/include/sys/sysinfo.h (__sysinfo): Add
attribute_hidden.
Hide internal __mremap function to allow direct access within libc.so and
libc.a without using GOT nor PLT.
__GI___mremap is defined when sysdeps/unix/syscalls.list is used to
generate mremap. Otherwise libc_hidden_def is needed explicitly.
[BZ #18822]
* include/sys/mman.h (__mremap): Add libc_hidden_proto.
* sysdeps/unix/sysv/linux/m68k/mremap.S (__mremap): Add
libc_hidden_def.
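The idea behind pairing a hidden prototype with a hidden definition can be approximated as follows: the public function keeps its exported name, and a second, hidden name aliases the same code so that internal callers bind to it directly. The demo_* names are hypothetical, and this uses the GCC/Clang alias and visibility attributes on an ELF target rather than the actual libc_hidden_proto and libc_hidden_def macros, which work through symbol renaming.

```c
/* Sketch only: hypothetical names, GCC/Clang attributes, ELF target.  */

/* Exported entry point, reachable from other objects.  */
int
demo_remap (int x)
{
  return x * 2;
}

/* Hidden alias for the same code.  It is not exported, so calls to it
   from inside the library need neither a GOT entry nor a PLT stub.  */
extern __typeof__ (demo_remap) demo_internal_remap
  __attribute__ ((alias ("demo_remap"), visibility ("hidden")));

/* An internal caller uses the hidden name.  */
int
demo_caller (int x)
{
  return demo_internal_remap (x) + 1;
}
```

When the function is written by hand in assembly or C instead of being generated, nothing creates the hidden name automatically, which is why an explicit definition has to be added in those files.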
Hide internal __ioctl function to allow direct access within libc.so and
libc.a without using GOT nor PLT.
__GI___ioctl is defined when sysdeps/unix/syscalls.list is used to
generate ioctl. Otherwise libc_hidden_def is needed explicitly.
[BZ #18822]
* include/sys/ioctl.h (__ioctl): Add libc_hidden_proto.
* misc/ioctl.c (__ioctl): Add libc_hidden_def.
* sysdeps/mach/hurd/ioctl.c (__ioctl): Likewise.
* sysdeps/unix/sysv/linux/aarch64/ioctl.S (__ioctl): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/ioctl.S (__ioctl):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/ioctl.c (__ioctl): Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/ioctl.S (__ioctl): Likewise.
Mark internal netlink functions with attribute_hidden to allow direct
access within libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/netlinkaccess.h (__netlink_open): Add
attribute_hidden.
(__netlink_close): Likewise.
(__netlink_free_handle): Likewise.
(__netlink_request): Likewise.
Mark internal dirent functions with attribute_hidden to allow direct
access within libc.so and libc.a without using GOT nor PLT. __readdir64
is hidden with libc_hidden_proto and libc_hidden_def since the exported
readdir64 is an alias of __readdir64.
[BZ #18822]
* include/dirent.h (__opendir): Always add attribute_hidden.
(__fdopendir): Likewise.
(__closedir): Likewise.
(__readdir): Likewise.
(__readdir64): Add libc_hidden_proto.
* sysdeps/mach/hurd/readdir64.c (__readdir64): Add libc_hidden_def.
* sysdeps/unix/sysv/linux/i386/readdir64.c (__readdir64): Likewise.
* sysdeps/unix/sysv/linux/readdir64.c (__readdir64): Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/readdir.c (__GI___readdir64):
New alias.
Mark __internal_statvfs[64] with attribute_hidden to allow direct access
to them within libc.so and libc.a without using GOT nor PLT.
[BZ #18822]
* sysdeps/unix/sysv/linux/fstatvfs.c: Include "internal_statvfs.h"
instead of <sys/statvfs.h>.
(__internal_statvfs): Removed.
* sysdeps/unix/sysv/linux/fstatvfs64.c: Include "internal_statvfs.h"
instead of <sys/statvfs.h>.
(__internal_statvfs64): Removed.
* sysdeps/unix/sysv/linux/internal_statvfs.c: Include
"internal_statvfs.h" instead of <sys/statvfs.h>.
* sysdeps/unix/sysv/linux/internal_statvfs.h: New file.
* sysdeps/unix/sysv/linux/statvfs.c: Include "internal_statvfs.h"
instead of <sys/statvfs.h>.
(__internal_statvfs): Removed.
* sysdeps/unix/sysv/linux/statvfs64.c: Include "internal_statvfs.h"
instead of <sys/statvfs.h>.
(__internal_statvfs64): Removed.
__setcontext on hppa.
* sysdeps/unix/sysv/linux/hppa/getcontext.S (__getcontext): Save return
pointer in frame.
* sysdeps/unix/sysv/linux/hppa/setcontext.S (__setcontext): Likewise.
Correct offset used to restore PIC register.
Continuing the move of libm aliases to common macros that can create
_FloatN / _FloatNx aliases in future, this patch converts some dbl-64
functions to using libm_alias_double, thereby eliminating the need for
some ldbl-opt wrappers.
This patch deliberately limits what functions are converted so that it
can be verified by comparison of stripped binaries. Specifically, atan
and tan are excluded because they first need converting to being weak
aliases; fma is omitted as it has additional complications with
versions in other directories (removing the ldbl-opt version can
e.g. cause the ldbl-128 version to be used instead of dbl-64); and
functions that have both dbl-64/wordsize-64 and ldbl-opt versions are
excluded because ldbl-opt currently always wraps dbl-64 function
versions, so changing those will result in platforms using both
ldbl-opt and dbl-64/wordsize-64 (i.e. alpha) starting to use the
dbl-64/wordsize-64 versions of those functions (which is good, as an
optimization, but still best separated from the present patch to get
better validation).
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/dbl-64/s_asinh.c: Include <libm-alias-double.h>.
(asinh): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_cbrt.c: Include <libm-alias-double.h>.
(cbrt): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_copysign.c: Include
<libm-alias-double.h>.
(copysign): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_erf.c: Include <libm-alias-double.h>.
(erf): Define using libm_alias_double.
(erfc): Likewise.
* sysdeps/ieee754/dbl-64/s_expm1.c: Include <libm-alias-double.h>.
(expm1): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_fabs.c: Include <libm-alias-double.h>.
(fabs): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_fromfp.c (fromfp): Define using
libm_alias_double.
* sysdeps/ieee754/dbl-64/s_fromfp_main.c: Include
<libm-alias-double.h>.
* sysdeps/ieee754/dbl-64/s_fromfpx.c (fromfpx): Define using
libm_alias_double.
* sysdeps/ieee754/dbl-64/s_getpayload.c: Include
<libm-alias-double.h>.
(getpayload): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_llrint.c: Include
<libm-alias-double.h>.
(llrint): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_lrint.c: Include <libm-alias-double.h>.
(lrint): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_nextup.c: Include
<libm-alias-double.h>.
(nextup): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_roundeven.c: Include
<libm-alias-double.h>.
(roundeven): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_setpayload.c (setpayload): Define using
libm_alias_double.
* sysdeps/ieee754/dbl-64/s_setpayload_main.c: Include
<libm-alias-double.h>.
* sysdeps/ieee754/dbl-64/s_setpayloadsig.c (setpayloadsig): Define
using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_sin.c: Include <libm-alias-double.h>.
(cos): Define using libm_alias_double.
(sin): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c: Include
<libm-alias-double.h>.
(sincos): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_tanh.c: Include <libm-alias-double.h>.
(tanh): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_totalorder.c: Include
<libm-alias-double.h>.
(totalorder): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_totalordermag.c: Include
<libm-alias-double.h>.
(totalordermag): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/s_ufromfp.c (ufromfp): Define using
libm_alias_double.
* sysdeps/ieee754/dbl-64/s_ufromfpx.c (ufromfpx): Define using
libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_getpayload.c: Include
<libm-alias-double.h>.
(getpayload): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c: Include
<libm-alias-double.h>.
(roundeven): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_setpayload_main.c: Include
<libm-alias-double.h>.
* sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include
<libm-alias-double.h>.
(totalorder): Define using libm_alias_double.
* sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include
<libm-alias-double.h>.
(totalordermag): Define using libm_alias_double.
* sysdeps/ieee754/ldbl-opt/s_copysign.c (copysignl): Only define
libc compat symbol here.
* sysdeps/ieee754/ldbl-opt/s_asinh.c: Remove file.
* sysdeps/ieee754/ldbl-opt/s_cbrt.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_erf.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_expm1.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_fabs.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_llrint.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_lrint.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_sin.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_sincos.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_tanh.c: Likewise.
When --enable-static-pie is used to configure glibc, we need to use
_dl_relocate_static_pie to compute load address in static PIE.
* sysdeps/arm/dl-machine.h (elf_machine_load_address): Use
_dl_relocate_static_pie instead of _dl_start to compute load
address in static PIE. Return 0 if _DYNAMIC is undefined for
static executable.
mips uses a local label to compute load address, which works with static
PIE. We just need to return 0 if _DYNAMIC is undefined for static
executable.
* sysdeps/mips/dl-machine.h (elf_machine_dynamic): Return 0 if
_DYNAMIC is undefined for static executable.
A few math functions still use __fabs(f/l) rather than fabs, which
means they won't be inlined. Rename them so they are inlined.
Also add -fno-builtin-fabsl to nofpu powerpc makefile to work around
BZ #29253.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c
(__ieee754_lgamma_r): Use fabs rather than __fabs.
* sysdeps/ieee754/dbl-64/e_log10.c (__ieee754_log10): Likewise.
* sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Likewise.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c
(__ieee754_lgammaf_r): Use fabsf rather than __fabsf.
* sysdeps/ieee754/flt-32/e_log10f.c (__ieee754_log10f): Likewise.
* sysdeps/ieee754/flt-32/e_log2f.c (__ieee754_log2f): Likewise.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c
(__ieee754_lgammal_r): Use fabsl rather than __fabsl.
* sysdeps/ieee754/ldbl-128/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c
(__ieee754_lgammal_r): Use fabsl rather than __fabsl.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l): Likewise.
* sysdeps/powerpc/nofpu/Makefile: Add -fno-builtin-fabsl for BZ #29253.
without wrapper on aarch64:
powf reciprocal-throughput: 4.2x faster
powf latency: 2.6x faster
old worst-case error: 1.11 ulp
new worst-case error: 0.82 ulp
aarch64 .text size: -780 bytes
aarch64 .rodata size: +144 bytes
powf(x,y) is implemented as exp2(y*log2(x)) with the same algorithms
that are used in exp2f and log2f, except that the log2f polynomial is
larger for extra precision and its output (and exp2f input) may be
scaled by a power of 2 (POWF_SCALE) to simplify the argument reduction
step of exp2 (possible when an efficient round-and-convert-to-int
operation is available).
The special case handling tries to minimize the checks in the hot path.
When the input of exp2_inline is checked, integer arithmetic is used as
that was faster on the tested aarch64 cores.
* math/Makefile (type-float-routines): Add e_powf_log2_data.
* sysdeps/ieee754/flt-32/e_powf.c: New implementation.
* sysdeps/ieee754/flt-32/e_powf_log2_data.c: New file.
* sysdeps/ieee754/flt-32/math_config.h (__powf_log2_data): Define.
(issignalingf_inline): Likewise.
(POWF_LOG2_TABLE_BITS): Likewise.
(POWF_LOG2_POLY_ORDER): Likewise.
(POWF_SCALE_BITS): Likewise.
(POWF_SCALE): Likewise.
* sysdeps/i386/fpu/e_powf_log2_data.c: New file.
* sysdeps/ia64/fpu/e_powf_log2_data.c: New file.
* sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c: New file.
Similar to the new logf: double precision arithmetics and a small
lookup table are used. The argument reduction step is the same as in
the new logf.
without wrapper on aarch64:
log2f reciprocal-throughput: 2.3x faster
log2f latency: 2.1x faster
old worst case error: 1.72 ulp
new worst case error: 0.75 ulp
aarch64 .text size: -252 bytes
aarch64 .rodata size: +244 bytes
* math/Makefile (type-float-routines): Add e_log2f_data.
* sysdeps/ieee754/flt-32/e_log2f.c: New implementation.
* sysdeps/ieee754/flt-32/e_log2f_data.c: New file.
* sysdeps/ieee754/flt-32/math_config.h (__log2f_data): Define.
(LOG2F_TABLE_BITS, LOG2F_POLY_ORDER): Define.
* sysdeps/i386/fpu/e_log2f_data.c: New file.
* sysdeps/ia64/fpu/e_log2f_data.c: New file.
* sysdeps/m68k/m680x0/fpu/e_log2f_data.c: New file.
without wrapper on aarch64:
logf reciprocal-throughput: 2.2x faster
logf latency: 1.9x faster
old worst case error: 0.89 ulp
new worst case error: 0.82 ulp
aarch64 .text size: -356 bytes
aarch64 .rodata size: +240 bytes
Uses double precision arithmetics and a lookup table to allow smaller
polynomial and avoid the use of division.
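The table-plus-polynomial idea can be sketched as follows. This is a minimal illustration, not the committed code: the 4-entry table, the names logf_table_demo and log_tab_demo, and the degree-4 polynomial are assumptions chosen for brevity, and only positive normal inputs are handled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-entry table: 1/c and log(c) for the centre c of each
   quarter of [1, 2).  The reciprocals are folded at compile time, so no
   division is executed at run time.  */
static const struct { double invc, logc; } log_tab_demo[4] = {
  { 1.0 / 1.125, 0.11778303565638346 },
  { 1.0 / 1.375, 0.3184537311185346 },
  { 1.0 / 1.625, 0.4855078157817008 },
  { 1.0 / 1.875, 0.6286086594223741 },
};

/* Natural log of a positive, normal float.  Special cases are ignored.  */
double
logf_table_demo (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);          /* Raw IEEE 754 encoding.  */
  int k = (int) (bits >> 23) - 127;         /* Unbiased exponent.  */
  uint32_t mant = bits & 0x007fffff;        /* 23 fraction bits.  */
  uint32_t zbits = mant | 0x3f800000;       /* Encoding of z in [1, 2).  */
  float z;
  memcpy (&z, &zbits, sizeof z);

  unsigned int i = mant >> 21;              /* Top 2 fraction bits pick c.  */
  double r = (double) z * log_tab_demo[i].invc - 1.0;   /* |r| < 0.12.  */

  /* log1p (r) ~ r - r^2/2 + r^3/3 - r^4/4, evaluated in double.  */
  double p = r * (1.0 + r * (-0.5 + r * (1.0 / 3 + r * -0.25)));
  return k * 0.6931471805599453 + log_tab_demo[i].logc + p;
}
```

With only four intervals the sketch is accurate to a few 1e-6; a larger table shrinks r and lets a short polynomial reach float precision.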
Data is in a separate translation unit with fixed layout to prevent the
compiler generating suboptimal literal access.
Errors are handled inline according to POSIX rules, but this patch
keeps the wrapper with SVID compatible error handling.
Needs libm-test-ulps adjustment for clogf in non-nearest rounding mode.
* math/Makefile (type-float-routines): Add e_logf_data.
* sysdeps/ieee754/flt-32/e_logf.c: New implementation.
* sysdeps/ieee754/flt-32/e_logf_data.c: New file.
* sysdeps/ieee754/flt-32/math_config.h (__logf_data): Define.
(LOGF_TABLE_BITS, LOGF_POLY_ORDER): Define.
* sysdeps/i386/fpu/e_logf_data.c: New file.
* sysdeps/ia64/fpu/e_logf_data.c: New file.
* sysdeps/m68k/m680x0/fpu/e_logf_data.c: New file.
When --enable-static-pie is used to build static PIE, _DYNAMIC is used
to compute the load address of static PIE. But _DYNAMIC is undefined
when creating static executable. This patch makes _DYNAMIC weak in PIE
libc.a so that it can be undefined.
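The general mechanism of a weak reference that may stay undefined can be sketched as below. The symbol optional_symbol_demo and the helper are hypothetical; the sketch assumes GNU C on an ELF target and does not reproduce the dl-machine.h code.

```c
/* Hypothetical symbol that no object file defines.  Declaring it weak lets
   the link succeed; its address then evaluates to a null pointer.  */
extern const char optional_symbol_demo[] __attribute__ ((weak));

/* Return 1 when the linker provided the symbol, 0 when it is undefined.  */
int
symbol_is_defined_demo (void)
{
  return optional_symbol_demo != 0;
}
```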
* sysdeps/i386/dl-machine.h (elf_machine_load_address): Allow
undefined _DYNAMIC in PIE libc.a.
* sysdeps/x86_64/dl-machine.h (elf_machine_load_address):
Likewise.
Simplify the C99 isgreater macros. Although some support was added
in GCC 2.97, not all targets added support until GCC 3.1. Therefore
only use the builtins in math.h from GCC 3.1 onwards, and defer to
generic macros otherwise. Improve the generic isunordered macro
to use compares rather than call fpclassify twice - this is not only
faster but also correct for signaling NaNs.
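A compare-based unordered test can be sketched as a plain function. The names are hypothetical and the math.h macro itself may be written differently; the point is only that two comparisons replace two fpclassify calls.

```c
/* Two values are unordered when they differ and at least one of them is
   not equal to itself, which only holds for a NaN.  */
int
unordered_by_compare_demo (double u, double v)
{
  return u != v && (u != u || v != v);
}

/* Build a quiet NaN at run time without calling into libm.  */
double
make_nan_demo (void)
{
  volatile double zero = 0.0;
  return zero / zero;
}
```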
* math/math.h: Improve handling of C99 isgreater macros.
* sysdeps/alpha/fpu/bits/mathinline.h: Remove isgreater macros.
* sysdeps/m68k/m680x0/fpu/bits/mathinline.h: Likewise.
* sysdeps/powerpc/bits/mathinline.h: Likewise.
* sysdeps/sparc/fpu/bits/mathinline.h: Likewise.
* sysdeps/x86/fpu/bits/mathinline.h: Likewise.
In <https://sourceware.org/ml/libc-alpha/2013-05/msg00722.html> I
remarked on the possibility of arithmetic in various nearbyint
implementations being scheduled before feholdexcept calls, resulting
in spurious "inexact" exceptions.
I'm now actually observing this occurring in glibc built for ARM with
GCC 7 (in fact, both copies of the same addition/subtraction sequence
being combined and moved out before the conditionals and
feholdexcept/fesetenv pairs), resulting in test failures.
This patch makes the nearbyint implementations with this particular
feholdexcept / arithmetic / fesetenv pattern consistently use
math_opt_barrier on the function argument when first used in
arithmetic, and also consistently use math_force_eval before fesetenv
(the latter was generally already done, but the dbl-64/wordsize-64
implementation used math_opt_barrier instead, and as
math_opt_barrier's intended effect is through its output value being
used, such a use that doesn't use the return value is suspect).
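The barrier placement can be sketched as follows. The macros are generic GNU C stand-ins with hypothetical names (the real math_opt_barrier and math_force_eval are target specific), and the feholdexcept / fesetenv calls are only marked by comments so that the sketch does not depend on libm.

```c
/* Stand-in for math_opt_barrier: the result must be used by the caller.  */
#define opt_barrier_demo(x) \
  __extension__ ({ __typeof__ (x) v_ = (x); __asm__ ("" : "+m" (v_)); v_; })

/* Stand-in for math_force_eval: forces the value to be computed here.  */
#define force_eval_demo(x) \
  do { __typeof__ (x) v_ = (x); __asm__ __volatile__ ("" : : "m" (v_)); } \
  while (0)

/* Round 0 <= x < 2^52 to an integer in the current rounding mode.  */
double
round_to_int_demo (double x)
{
  const double two52 = 0x1p52;
  /* feholdexcept would be called here.  */
  double t = opt_barrier_demo (x);   /* Arithmetic on x starts after this.  */
  double w = two52 + t;
  t = w - two52;
  force_eval_demo (t);               /* Arithmetic is complete before ...  */
  /* ... fesetenv would be called here.  */
  return t;
}
```

The barrier on the argument keeps the addition from being hoisted above the first comment; the forced evaluation keeps the subtraction from sinking below the second.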
Tested for x86_64 (--disable-multi-arch so more of these
implementations get used), and for ARM in a configuration where I saw
the problem scheduling.
[BZ #22225]
* sysdeps/ieee754/dbl-64/s_nearbyint.c (__nearbyint): Use
math_opt_barrier on argument when doing arithmetic on it.
* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint):
Likewise. Use math_force_eval not math_opt_barrier after
arithmetic.
* sysdeps/ieee754/flt-32/s_nearbyintf.c (__nearbyintf): Use
math_opt_barrier on argument when doing arithmetic on it.
* sysdeps/ieee754/ldbl-128/s_nearbyintl.c (__nearbyintl):
Likewise.
from `freeaddrinfo'.
`getifaddrs' and `freeifaddrs' are not in POSIX, so they should not be
exposed along with `freeaddrinfo' (through `__check_pf'), which is POSIX.
* include/ifaddrs.h (__getifaddrs, __freeifaddrs): New declarations,
and use libc_hidden_def on them.
* inet/ifaddrs.c (__getifaddrs, __freeifaddrs): Use libc_hidden_def on
them.
* sysdeps/gnu/ifaddrs.c (__getifaddrs, __freeifaddrs): Likewise.
* inet/check_pf.c (__check_pf): Use __getifaddrs and __freeifaddrs
instead of getifaddrs and freeifaddrs.
`seekdir' is MISC || XOPEN, so it should not be exposed along with
`rewinddir', which is POSIX.
* include/dirent.h (__seekdir): New declaration.
* sysdeps/mach/hurd/seekdir.c (seekdir): Rename to __seekdir and
redefine as weak alias.
* sysdeps/mach/hurd/rewinddir.c (__rewinddir): Use __seekdir instead
of seekdir.
`revoke' is MISC only, so it should not be exposed along with `unlockpt',
which is XOPEN.
* include/unistd.h (__revoke): New declaration.
* misc/revoke.c (revoke): Rename to __revoke, and redefine as weak
alias.
* sysdeps/mach/hurd/revoke.c (revoke): Likewise.
* sysdeps/unix/bsd/unlockpt.c (unlockpt): Use __revoke instead of
revoke.
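The rename-plus-weak-alias pattern used in the entries above can be sketched as below. Both identifiers and the function body are hypothetical stand-ins; the sketch assumes GNU C on an ELF target.

```c
/* Internal name: callers inside the library use this one.  */
int
internal_revoke_demo (const char *file)
{
  return file == 0 ? -1 : 0;
}

/* Public name: a weak alias for the same code.  A program that defines its
   own revoke_demo neither clashes with the library nor gets called by it,
   because library code only references the internal name.  */
extern __typeof__ (internal_revoke_demo) revoke_demo
  __attribute__ ((weak, alias ("internal_revoke_demo")));
```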
dirfd is XOPEN2K8 only, so it should not be exposed along with ftw, which is
earlier.
* include/dirent.h (__dirfd): New declaration.
* dirent/dirfd.c (dirfd): Rename to __dirfd, and redefine as weak
alias.
* sysdeps/posix/dirfd/dirfd.c (dirfd): Likewise.
* sysdeps/mach/hurd/dirfd.c (dirfd): Likewise.
* io/ftw.c (open_dir_stream, ftw_dir): Use __dirfd instead of dirfd.
sysdeps/unix/make-syscalls.sh has support, used only by x32, for
generating IFUNCs for kernel VDSO symbols. This support creates
IFUNCs by setting symbol types manually, which is bad for debug info
and does not work with current GCC mainline because it results in
errors from the checks on types of function aliases.
This patch fixes it to use the common __ifunc macro, which uses the
ifunc attribute when available and so works with GCC mainline. Note
however that the original error resulted from an indirect inclusion of
a header declaring __gettimeofday from the generated sources, and
using __ifunc now relies on such an indirect inclusion remaining as it
means use of __typeof to determine the correct types. If glibc's
headers change in such a way as to remove that indirect inclusion, it
will become necessary to change the syscalls.list syntax for VDSO
syscalls so the name of the header to include can be specified.
Tested (compilation only) with build-many-glibcs.py that this fixes
the build for x32 with GCC mainline.
* sysdeps/unix/make-syscalls.sh: Use __ifunc to define symbols
using VDSO.
glibc fails to build with GCC mainline for SPARC because of the use of
manually-created IFUNCs, which fail the tests of compatibility of
function alias types. This patch changes sparc-ifunc.h to use the
generic __ifunc in defining sparc_libm_ifunc. The generic __ifunc can
use the GCC ifunc attribute when available, so ensuring
type-correctness as well as better debug info than when setting symbol
types in asm statements.
Note that for this to fix the build with GCC mainline the GCC patch
<https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01779.html>, or
building GCC with --enable-gnu-indirect-function, is also needed.
Tested (compilation only) with build-many-glibcs.py (sparc64-linux-gnu
and sparcv9-linux-gnu, with GCC 8 with the above patch, and also with
GCC 7).
* sysdeps/sparc/sparc-ifunc.h [!__ASSEMBLER__] (sparc_libm_ifunc):
Define using __ifunc.
As per https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01220.html ia64
defaults to non-executable stacks in the Linux kernel (furthermore,
the use of function descriptors means that trampolines for nested
function pointers never need an executable stack). glibc however
defines DEFAULT_STACK_PERMS to include PF_X for that architecture,
meaning (a) elf/check-execstack fails and (b) (from code inspection,
not tested, but this is why I think this is a user-visible bug) thread
stacks are unnecessarily mapped with execute permission. This patch
fixes the DEFAULT_STACK_PERMS definition in question.
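The effect of dropping PF_X can be sketched with the flag values from <elf.h>. The macro names are hypothetical, and the exact old and new definitions are an assumption (read, write, and execute before; read and write after).

```c
#include <elf.h>

/* Assumed permission sets before and after the fix.  */
#define STACK_PERMS_BEFORE_DEMO (PF_R | PF_W | PF_X)
#define STACK_PERMS_AFTER_DEMO  (PF_R | PF_W)

/* Nonzero when a segment-flags word asks for an executable stack.  */
int
stack_is_executable_demo (unsigned int flags)
{
  return (flags & PF_X) != 0;
}
```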
Tested (compilation only) with build-many-glibcs.py for ia64. This
fixes the check-execstack failure.
[BZ #22156]
* sysdeps/ia64/stackinfo.h (DEFAULT_STACK_PERMS): Do not include PF_X.
This patch follows commit 5554304f0 (posix: Allow glob to match dangling
symlinks [BZ #866]) by adding a compat symbol that follows the previous
semantics of not following dangling symlinks, and thus avoids calling
gl_lstat with GLOB_ALTDIRFUNC.
It avoids failures with old binaries that do not set the alternate function
pointer for lstat (GNUmake for instance). The following scenario, for
instance, fails with current GNUmake because glibc will access uninitialized
memory when calling gl_lstat:
$ cat src/t/t.c
int main ()
{
return 0;
}
$ cat Makefile
SRC = $(wildcard src/*/t.c)
OBJ = $(patsubst src/%.c, obj/%.o, $(SRC))
prog: $(OBJ)
$(CC) $(CFLAGS) $(LDFLAGS) $(LIBS) $(OBJ) -o prog
obj/%.o: src/%.c
$(CC) $(CFLAGS) -c $< -o $@
$ make
This works as expected with the patch applied. Since it is for the generic
ABI, default compat symbols are added, with an override for Linux due to LFS.
Now we have two compat symbols for glob on Linux:
1. sysdeps/unix/sysv/linux/oldglob.c which implements glob64 with
the old dirent layout. For this implementation I also set it to
not follow dangling symlinks (which is the safest path).
2. sysdeps/unix/sysv/linux/glob{64}-lstat-compat.c which implements
the compat symbol for dangling symlinks. As for generic glob,
the implementation uses XSTAT_IS_XSTAT64 to define whether
both __glob_lstat_compat and __glob64_lstat_compat should be
different implementations. For architectures that define
XSTAT_IS_XSTAT64, __glob_lstat_compat is aliased to
__glob64_lstat_compat.
3. sysdeps/unix/sysv/linux/alpha/oldglob.c with a different glob_t
layout. As for 1. this patch changes it to not follow dangling
symlinks.
The patch also bumps _GNU_GLOB_INTERFACE_VERSION to 2 to advertise the
new semantics. GNUmake, for instance, will then be forced to use its
internal glob implementation instead, which avoids triggering the same
failure on builds against newer glibc versions.
Checked on x86_64-linux-gnu and i686-linux-gnu. I also checked
with a build against the major ABIs required to check for the abilist.
The changes should also work on gnulib (I ran gnulib-tool.py check glob
and it showed no regressions).
[BZ #22183]
* include/gnu-versions.h (_GNU_GLOB_INTERFACE_VERSION): Increase
version to 2.
* posix/Makefile (routines): Add glob-lstat-compat and
glob64-lstat-compat.
* posix/Versions (GLIBC_2.27, glob, glob64): Add symbol version.
* posix/glob-lstat-compat.c: New file.
* posix/glob64-lstat-compat.c: Likewise.
* posix/tst-glob_lstat_compat.c: Likewise.
* sysdeps/unix/sysv/linux/glob-lstat-compat.c: Likewise.
* sysdeps/unix/sysv/linux/alpha/glob-lstat-compat.c: Likewise.
* sysdeps/unix/sysv/linux/glob64-lstat-compat.c: Likewise.
* sysdeps/unix/sysv/linux/alpha/glob.c: Remove file.
* posix/glob.c (glob_lstat): New function.
(glob): Rename to __glob and add versioned symbol to 2.27.
(glob_in_dir): Use glob_lstat.
* posix/glob64.c (glob64): Add GLOB_ATTRIBUTE.
* sysdeps/unix/sysv/linux/arm/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/glob.c (glob): Add versioned symbol for
2.27.
* sysdeps/unix/sysv/linux/glob64.c (glob64): Likewise.
* sysdeps/unix/sysv/linux/oldglob.c (GLOB_NO_LSTAT): Define.
* sysdeps/unix/sysv/linux/alpha/oldglob.c (__old_glob): Do not use
gl_lstat on glob call.
* sysdeps/unix/sysv/linux/aarch64/libc.abilist: Add GLIBC_2.27 glob
and glob64 symbols.
* sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
This patch fixes a typo in the inclusion guard in sincos32.h.
ChangeLog:
* sysdeps/ieee754/dbl-64/sincos32.h
[SINCCOS32_H]: Remove define.
[SINCOS32_H]: Define.
This patch changes the expf and exp2f error handling semantics to only
set errno according to POSIX rules. A new symbol version is introduced at
GLIBC_2.27.
The old wrappers are kept for compat symbols.
Internal calls to __expf now get the new error semantics, this seems to
only affect sysdeps/i386/fpu/s_expm1f.S where the errno-only behaviour
should be correct.
ia64 needed assembly change to have the new and compat versioned symbol
map to the same function.
All linux libm abilists are updated.
* math/Versions (expf): New libm symbol at GLIBC_2.27.
(exp2f): Likewise.
* math/w_exp2f.c: New file.
* math/w_expf.c: New file.
* math/w_exp2f_compat.c (__exp2f_compat): For compat symbol only.
* math/w_expf_compat.c (__expf_compat): Likewise.
* sysdeps/ia64/fpu/e_exp2f.S: Add versioned symbols.
* sysdeps/ia64/fpu/e_expf.S: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
Based on new expf and exp2f code from
https://github.com/ARM-software/optimized-routines/
with wrapper on aarch64:
expf reciprocal-throughput: 2.3x faster
expf latency: 1.7x faster
without wrapper on aarch64:
expf reciprocal-throughput: 3.3x faster
expf latency: 1.7x faster
without wrapper on aarch64:
exp2f reciprocal-throughput: 2.8x faster
exp2f latency: 1.3x faster
libm.so size on aarch64:
.text size: -152 bytes
.rodata size: -1740 bytes
expf/exp2f worst case nearest rounding error: 0.502 ulp
worst case non-nearest rounding error: 1 ulp
Error checks are inline and errno setting is in separate tail called
functions, but the wrappers are kept in this patch to handle the
_LIB_VERSION==_SVID_ case. (So e.g. errno is set twice for expf calls
and once for __expf_finite calls on targets where the new code is used.)
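The split between an inline range check and an out-of-line errno helper can be sketched as below. All names, the 88.8f threshold, and the pass-through body are placeholders; the real helpers live in math_errf.c and may differ in detail.

```c
#include <errno.h>

/* Out-of-line helper: set errno and pass the result through.  */
static float
with_errno_demo (float y, int e)
{
  errno = e;
  return y;
}

/* Overflow path: the multiplication overflows to infinity (raising the
   overflow exception as a side effect) and errno becomes ERANGE.  */
float
overflow_demo (int negative)
{
  volatile float big = 0x1p97f;
  float y = (negative ? -big : big) * big;
  return with_errno_demo (y, ERANGE);
}

/* Caller with an inline range check.  */
float
checked_demo (float x)
{
  if (x > 88.8f)
    return overflow_demo (0);    /* Rare path, kept out of the hot path.  */
  return x;                      /* Stand-in for the real computation.  */
}
```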
Double precision arithmetics is used which is expected to be faster on
most targets (including soft-float) than using single precision and it
is easier to get good precision result with it.
Const data is kept in a separate translation unit which complicates
maintenance a bit, but is expected to give good code for literal loads
on most targets and allows sharing data across expf, exp2f and powf.
(This data is disabled on i386, m68k and ia64 which have their own
expf, exp2f and powf code.)
Some details may need target specific tweaks:
- best convert and round to int operation in the arg reduction may be
different across targets.
- code was optimized on fma target, optimal polynomial eval may be
different without fma.
- gcc does not always generate good code for fp bit representation
access via unions or it may be inherently slow on some targets.
The libm-test-ulps will need adjustment because:
- The argument reduction ideally uses nearest rounded rint, but that is
not efficient on most targets, so the polynomial can get evaluated on a
wider interval in non-nearest rounding mode making 1 ulp errors common
in that case.
- The polynomial is evaluated such that it may have 1 ulp error on
negative tiny inputs with upward rounding.
* math/Makefile (type-float-routines): Add math_errf and e_exp2f_data.
* sysdeps/aarch64/fpu/math_private.h (TOINT_INTRINSICS): Define.
(roundtoint, converttoint): Likewise.
* sysdeps/ieee754/flt-32/e_expf.c: New implementation.
* sysdeps/ieee754/flt-32/e_exp2f.c: New implementation.
* sysdeps/ieee754/flt-32/e_exp2f_data.c: New file.
* sysdeps/ieee754/flt-32/math_config.h: New file.
* sysdeps/ieee754/flt-32/math_errf.c: New file.
* sysdeps/ieee754/flt-32/t_exp2f.h: Remove.
* sysdeps/i386/fpu/e_exp2f_data.c: New file.
* sysdeps/i386/fpu/math_errf.c: New file.
* sysdeps/ia64/fpu/e_exp2f_data.c: New file.
* sysdeps/ia64/fpu/math_errf.c: New file.
* sysdeps/m68k/m680x0/fpu/e_exp2f_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_errf.c: New file.
conform/ISO11/time.h/linknamespace complains that using timespec_get exposes
gettimeofday.
conform/POSIX/time.h/linknamespace complains that using clock_settime
exposes settimeofday.
* sysdeps/unix/clock_gettime.c (realtime_gettime, __clock_gettime): Use
__gettimeofday instead of gettimeofday.
* sysdeps/unix/clock_settime.c (__clock_settime): Use __settimeofday
instead of settimeofday.
* sysdeps/mach/hurd/bits/socket.h: Include <bits/wordsize.h> instead
of <limits.h>
(__need_NULL): Do not define.
(__ss_aligntype): Use __WORDSIZE instead of ULONG_MAX to determine
alignment.
[!__USE_MISC] (pseudo_AF_XTP, pseudo_AF_RTIP, pseudo_AF_PIP,
CMGROUP_MAX, cmsgcred): Do not define.
(CMSG_FIRSTHDR, __cmsg_nxthdr): Use (struct cmsghdr *) 0 instead of
NULL.
* bits/socket.h: Likewise.
* sysdeps/mach/hurd/dl-sysdep.c (check_no_hidden): New macro.
(__open, __close, __libc_read, __libc_write, __writev, __libc_lseek64,
__mmap, __fxstat64, __xstat64, __access, __access_noerrno, __getpid,
__getcwd, __sbrk, __strtoul_internal, _exit, abort): Use check_no_hidden
to make sure that these symbols are defined.
This patch makes flt-32 libm functions use libm_alias_float to define
public interfaces (in cases where _Float32 aliases of those interfaces
would be appropriate, so not for finitef / isinff / isnanf).
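How one macro invocation can define several public names for an internal float function can be sketched as below. The macros and function names are hypothetical and only mimic the idea; the set of aliases that libm_alias_float really emits is not taken from this patch. GNU C on an ELF target is assumed.

```c
/* Hypothetical weak-alias helper.  */
#define demo_weak_alias(name, aliasname) \
  extern __typeof__ (name) aliasname \
    __attribute__ ((weak, alias (#name)));

/* Hypothetical counterpart of libm_alias_float: given base names, define
   the "f" public name and, as an assumed extra, an "f32" name.  */
#define demo_alias_float(from, to) \
  demo_weak_alias (from ## f, to ## f) \
  demo_weak_alias (from ## f, to ## f32)

/* Internal implementation.  */
float
internal_halff (float x)
{
  return x * 0.5f;
}

/* Defines public_halff and public_halff32 as aliases of internal_halff.  */
demo_alias_float (internal_half, public_half)
```

Adding or removing an alias type then means editing the macro once instead of touching every source file.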
Tested for x86_64. Also tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/flt-32/s_asinhf.c: Include <libm-alias-float.h>.
(asinhf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_atanf.c: Include <libm-alias-float.h>.
(atanf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_cbrtf.c: Include <libm-alias-float.h>.
(cbrtf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_ceilf.c: Include <libm-alias-float.h>.
(ceilf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_copysignf.c: Include
<libm-alias-float.h>.
(copysignf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_cosf.c: Include <libm-alias-float.h>.
(cosf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_erff.c: Include <libm-alias-float.h>.
(erff): Define using libm_alias_float.
(erfcf): Likewise.
* sysdeps/ieee754/flt-32/s_expm1f.c: Include <libm-alias-float.h>.
(expm1f): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_fabsf.c: Include <libm-alias-float.h>.
(fabsf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_floorf.c: Include <libm-alias-float.h>.
(floorf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_frexpf.c: Include <libm-alias-float.h>.
(frexpf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_fromfpf.c (fromfpf): Define using
libm_alias_float.
* sysdeps/ieee754/flt-32/s_fromfpf_main.c: Include
<libm-alias-float.h>.
* sysdeps/ieee754/flt-32/s_fromfpxf.c (fromfpxf): Define using
libm_alias_float.
* sysdeps/ieee754/flt-32/s_getpayloadf.c: Include
<libm-alias-float.h>.
(getpayloadf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_llrintf.c: Include
<libm-alias-float.h>.
(llrintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_llroundf.c: Include
<libm-alias-float.h>.
(llroundf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_logbf.c: Include <libm-alias-float.h>.
(logbf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_lrintf.c: Include <libm-alias-float.h>.
(lrintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_lroundf.c: Include <libm-alias-float.h>.
(lroundf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_modff.c: Include <libm-alias-float.h>.
(modff): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_nearbyintf.c: Include
<libm-alias-float.h>.
(nearbyintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_nextafterf.c: Include
<libm-alias-float.h>.
(nextafterf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_nextupf.c: Include
<libm-alias-float.h>.
(nextupf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_remquof.c: Include
<libm-alias-float.h>.
(remquof): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_rintf.c: Include <libm-alias-float.h>.
(rintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_roundevenf.c: Include
<libm-alias-float.h>.
(roundevenf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_roundf.c: Include <libm-alias-float.h>.
(roundf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_setpayloadf.c (setpayloadf): Define
using libm_alias_float.
* sysdeps/ieee754/flt-32/s_setpayloadf_main.c: Include
<libm-alias-float.h>.
* sysdeps/ieee754/flt-32/s_setpayloadsigf.c (setpayloadsigf):
Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_sincosf.c: Include
<libm-alias-float.h>.
(sincosf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_sinf.c: Include <libm-alias-float.h>.
(sinf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_tanf.c: Include <libm-alias-float.h>.
(tanf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_tanhf.c: Include <libm-alias-float.h>.
(tanhf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_totalorderf.c: Include
<libm-alias-float.h>.
(totalorderf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_totalordermagf.c: Include
<libm-alias-float.h>.
(totalordermagf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_truncf.c: Include <libm-alias-float.h>.
(truncf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_ufromfpf.c (ufromfpf): Define using
libm_alias_float.
* sysdeps/ieee754/flt-32/s_ufromfpxf.c (ufromfpxf): Define using
libm_alias_float.
The IEEE 754 implementation of lgammal in sysdeps/ieee754/ldbl-128/ used
to be shared by IBM's implementation in sysdeps/ieee754/ldbl-128ibm/ (by
an inclusion of the source file). In order for the algorithm to work
for IBM's implementation, a check for LDBL_MANT_DIG was required. Since
the source file is no longer shared, the requirement for the check is
gone. This patch removes the conditionals.
Tested for powerpc64le and s390x.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Remove conditionals on LDBL_MANT_DIG.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c
(__ieee754_lgammal_r): Likewise.
The ldbl-128ibm implementation of j0l, j1l, lgammal_r, and cbrtl, as
well as the tables used by expl were copied from ldbl-128. However, the
original files used _Float128 for the type and L() for the literal
suffix. This patch uses the following sed command to rewrite _Float128
as long double and L(x) as xL (for e_expl.c, e_j0l.c, e_j1l.c,
e_lgammal_r.c, and t_expl.h):
sed -i <filename> \
-e "/^#define _Float128 long double/d" \
-e "/^#define L(x) x ## L/d" \
-e "/L(/s/)/L/" \
-e "/L(/s/L(//" \
-e "s/_Float128/long double/g"
For sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c, this sed command incorrectly
replaces a few occurrences of L(), so the following command is used
instead:
sed -i sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c \
-e "/^#define _Float128 long double/d" \
-e "/^#define L(x) x ## L/d" \
-e "s/L(0\.3\{40\})/0.3333333333333333333333333333333333333333L/" \
-e "s/L(3\.7568280825958912391243e-1)/3.7568280825958912391243e-1L/" \
-e "/L(/s/)/L/" \
-e "/L(/s/L(//" \
-e "s/_Float128/long double/g"
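The effect of the rewrite on a single literal can be sketched in C. The demo macro names are hypothetical stand-ins for the _Float128 and L() definitions that the sed commands delete.

```c
/* Before: the type and the literal suffix go through two definitions.  */
#define FLOAT128_DEMO long double
#define L_DEMO(x) x ## L

static const FLOAT128_DEMO third_before_demo
  = L_DEMO (0.3333333333333333333333333333333333333333);

/* After: plain long double and an explicit L suffix.  */
static const long double third_after_demo
  = 0.3333333333333333333333333333333333333333L;

/* Both spellings denote the same constant.  */
int
same_value_demo (void)
{
  return third_before_demo == third_after_demo;
}
```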
Tested for powerpc64le with patched [1] and unpatched gcc.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Remove definitions of
_Float128 and L().
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Remove definitions of
_Float128 and L(). Replace _Float128 with long double and L(x)
with xL, throughout the file.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise.
Some files under sysdeps/ieee754/ldbl-128ibm/ are able to reuse the
implementation in sysdeps/ieee754/ldbl-128/ by defining _Float128 to
long double. This relied on compiler support for _Float128 being
disabled. On powerpc, such support was disabled by default, however, it
got enabled by default [1] in GCC 8.
This patch copies the implementations from ldbl-128 to ldbl-128ibm. The
uses of _Float128 and L() are kept intact in this patch and are replaced
with a script in a subsequent patch.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
Tested for powerpc64 and powerpc64le.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Include tables from
sysdeps/ieee754/ldbl-128ibm.
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Copy contents from the
equivalent implementation in sysdeps/ieee754/ldbl-128/ instead
of including it. Keep _Float128 and L() intact. These will be
reviewed by a separate patch.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise.
On powerpc64le, compiler support for float128 is not enabled by default
on gcc. To enable it, the flag -mfloat128 must be passed as a command
line option to the compiler. This means that only the few files that
actively have -mfloat128 passed as an argument get compiler support for
float128, whereas all other files don't.
When -mfloat128 becomes enabled by default on powerpc [1], all the files
that do not currently have compiler support for float128 enabled during
their compilation, will start to have it. This will lead to build
errors in s_finite.c, s_isinf.c, and s_isnan.c.
The errors are due to the unintended macro expansion of __finitef128 to
__redirect___finitef128 in math/bits/mathcalls-helper-functions.h. In
that header, __MATHDECL_1 takes '__finite' and 'f128' as arguments and
concatenates them. However, since '__finite' has been redefined in
s_finite.c, the function declaration becomes __redirect___finitef128:
extern int __redirect___finitef128 (_Float128 __value) __attribute__ ((__nothrow__ )) __attribute__ ((__const__));
This declaration itself is OK. The problem arises when include/math.h
creates the hidden prototype ('hidden_proto (__finitef128)'), which
expands to:
extern __typeof (__finitef128) __finitef128 __attribute__ ((visibility ("hidden")));
Since __finitef128 is not declared, __typeof fails. This effect was
already true for the 'float' and 'long double' versions and is now true
for float128. Likewise for isinff128 and isnanf128.
This patch defines __finitef128 as __redirect___finitef128 in
sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c, similarly to what's
done for the float and long double versions of these functions, to get
rid of the build error. Likewise for isinff128 and isnanf128.
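The interaction between the redirect and the token pasting can be reproduced in a few lines. All names are hypothetical; the two-level paste macro stands in for the nested declaration macros, which expand their arguments before concatenating them.

```c
/* Two-level paste: arguments are macro-expanded before concatenation.  */
#define PASTE2_DEMO(a, b) a ## b
#define PASTE_DEMO(a, b) PASTE2_DEMO (a, b)

/* The source file redirects the base name ...  */
#define finite_demo redirect_finite_demo
/* ... so the fix redirects the f128 name in the same way.  Without this
   line, finite_demof128 below would be an undeclared identifier.  */
#define finite_demof128 redirect_finite_demof128

/* Declares and defines redirect_finite_demof128, not finite_demof128.  */
int
PASTE_DEMO (finite_demo, f128) (double v)
{
  return v - v == 0.0;    /* False for infinities and NaNs.  */
}

/* Resolves through the second redirect to the function above.  */
int
call_finite_demo (double v)
{
  return finite_demof128 (v);
}
```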
[1] https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01028.html
Tested for powerpc64 and powerpc64le.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c
(__finitef128): Define to __redirect___finitef128.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c
(__isinff128): Define to __redirect___isinff128.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c
(__isnanf128): Define to __redirect___isnanf128.
On powerpc64le, not all files can have the flag -mfloat128 passed as an
option on the compile command, since that could conflict with other
flags, such as -mno-vsx. Each file that needs the flag gets it through
a CFLAGS-filename variable in sysdeps/powerpc/powerpc64le/Makefile.
The test cases tst-strtod-nan-locale and tst-wcstod-nan-locale are
missing this flag.
Tested for powerpc64le.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-tst-strtod-nan-locale.c): New variable.
(CFLAGS-tst-wcstod-nan-locale.c): New variable.
This patch adds SSE4.1 versions of trunc and truncf, using the roundsd
/ roundss instructions, similar to the versions of ceil, floor, rint
and nearbyint functions we already have. In my testing with the glibc
benchtests these are about 30% faster than the C versions for double,
20% faster for float.
Tested for x86_64.
[BZ #20142]
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1.
* sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file.
* sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.
The recent fexecve changes broke the build on (at least) alpha (maybe
other configurations, that was the first breakage I saw in my
build-many-glibcs.py run):
In file included from ../sysdeps/unix/sysv/linux/alpha/sysdep.h:29:0,
from ../sysdeps/alpha/nptl/tls.h:31,
from ../include/errno.h:25,
from ../sysdeps/unix/sysv/linux/fexecve.c:18:
../sysdeps/unix/sysv/linux/fexecve.c: In function 'fexecve':
../sysdeps/unix/alpha/sysdep.h:203:10: error: 'sizeof' on array function parameter 'argv' will return size of 'char * const*' [-Werror=sizeof-array-argument]
(sizeof(arg) == 4 ? (long)(int)(long)(arg) : (long)(arg))
^
../sysdeps/unix/alpha/sysdep.h:302:26: note: in expansion of macro 'syscall_promote'
register long _tmp_18 = syscall_promote (arg3); \
^~~~~~~~~~~~~~~
../sysdeps/unix/alpha/sysdep.h:173:2: note: in expansion of macro 'inline_syscall5'
inline_syscall##nr(__NR_##name, args); \
^~~~~~~~~~~~~~
../sysdeps/unix/sysv/linux/alpha/sysdep.h:85:2: note: in expansion of macro 'INLINE_SYSCALL1'
INLINE_SYSCALL1(name, nr, args); \
^~~~~~~~~~~~~~~
../sysdeps/unix/sysv/linux/fexecve.c:42:3: note: in expansion of macro 'INLINE_SYSCALL'
INLINE_SYSCALL (execveat, 5, fd, "", argv, envp, AT_EMPTY_PATH);
^~~~~~~~~~~~~~
../sysdeps/unix/sysv/linux/fexecve.c:33:30: note: declared here
fexecve (int fd, char *const argv[], char *const envp[])
^~~~
This patch fixes this similarly to previous fixes for such issues: use
&argv[0] and &envp[0] as the syscall macro arguments. Tested
(compilation only) for alpha-linux-gnu with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/fexecve.c (fexecve) [__NR_execveat]:
Explicitly take address of first element of array arguments in
call to INLINE_SYSCALL.
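The effect of the fix can be reproduced outside glibc with a small
sketch. The macro below has the same shape as the one quoted in the
diagnostic; promote_argv is a hypothetical stand-in for the fexecve
call site, not glibc code.

```c
/* Same shape as the alpha syscall_promote macro in the error above.  */
#define syscall_promote(arg) \
  (sizeof (arg) == 4 ? (long) (int) (long) (arg) : (long) (arg))

/* Hypothetical stand-in for the fexecve call site.  */
static long
promote_argv (char *const argv[])
{
  /* syscall_promote (argv) would apply sizeof to an array-typed
     parameter, which GCC reports under -Wsizeof-array-argument.
     &argv[0] has the plain pointer type char *const * and the same
     value, so the macro expands without the diagnostic.  */
  return syscall_promote (&argv[0]);
}
```

Passing &argv[0] changes only the type the macro sees, not the value
handed to the syscall.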
Add unwind info to __libc_start_main so that unwinding continues one
extra level to _start. Similarly add unwind info to backtrace.
Given many targets require this, do this in a general way.
* csu/Makefile: Add -funwind-tables to libc-start.c.
* debug/Makefile: Add -funwind-tables to backtrace.c.
* sysdeps/aarch64/Makefile: Remove CFLAGS-backtrace.c.
* sysdeps/arm/Makefile: Likewise.
* sysdeps/i386/Makefile: Likewise.
* sysdeps/m68k/Makefile: Likewise.
* sysdeps/mips/Makefile: Likewise.
* sysdeps/nios2/Makefile: Likewise.
* sysdeps/sh/Makefile: Likewise.
* sysdeps/sparc/Makefile: Likewise.
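A quick way to observe how far unwinding gets is to count the frames
that backtrace returns. The helper below is a hypothetical sketch for
glibc's <execinfo.h>; the number it returns depends on the target, the
compiler options and whether the frames above main carry unwind info.

```c
#include <execinfo.h>

/* Hypothetical helper: number of return addresses backtrace can
   collect from here, limited to max_frames (at most 64).  */
static int
count_frames (int max_frames)
{
  void *frames[64];

  if (max_frames > 64)
    max_frames = 64;
  return backtrace (frames, max_frames);
}
```

Passing the collected addresses to backtrace_symbols would show which
functions the walk reached.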
As per section "3.1.4.2 Alignment Interrupts" of the "POWER8 Processor
User's Manual for the Single-Chip Module", an alignment interrupt is
reported for misaligned stores to caching-inhibited storage. As memset
is used in some drivers for DMA (like xorg), this patch avoids misaligned
stores for sizes less than 8 in memset.
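One way to picture the constraint is a C sketch in which short lengths
are filled with single-byte stores, which have no alignment
requirement. This is an assumption made for illustration, not the
POWER8 assembly from the patch, and store_bytes_small is a hypothetical
name. A compiler is free to merge such a loop into wider stores, which
is one reason real implementations do this in assembly.

```c
#include <stddef.h>

/* Hypothetical helper: fill n bytes at dst with c using byte stores.  */
static void *
store_bytes_small (void *dst, int c, size_t n)
{
  unsigned char *p = dst;

  while (n-- > 0)
    *p++ = (unsigned char) c;
  return dst;
}
```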