The fallback code of Linux wrapper for preadv2/pwritev2 executes
regardless of the errno code for preadv2, instead of the case where
the syscall is not supported.
This fixes it by calling the fallback code iff errno is ENOSYS. The
patch also adds tests for both invalid file descriptor and invalid
iov_len and vector count.
The only discrepancy between preadv2 and fallback code regarding
error reporting is when an invalid flags are used. The fallback code
bails out earlier with ENOTSUP instead of EINVAL/EBADF when the syscall
is used.
Checked on x86_64-linux-gnu on a 4.4.0 and 4.15.0 kernel.
[BZ #23579]
* misc/tst-preadvwritev2-common.c (do_test_with_invalid_fd): New
test.
* misc/tst-preadvwritev2.c, misc/tst-preadvwritev64v2.c (do_test):
Call do_test_with_invalid_fd.
* sysdeps/unix/sysv/linux/preadv2.c (preadv2): Use fallback code iff
errno is ENOSYS.
* sysdeps/unix/sysv/linux/preadv64v2.c (preadv64v2): Likewise.
* sysdeps/unix/sysv/linux/pwritev2.c (pwritev2): Likewise.
* sysdeps/unix/sysv/linux/pwritev64v2.c (pwritev64v2): Likewise.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __round functions to call the
corresponding round names instead, with asm redirection to __round
when the calls are not inlined.
An additional complication arises in
sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the
result converted to int, gets converted by the compiler to call
lroundl in the case of 32-bit long, so resulting in localplt test
failures. It's logically correct to let the compiler make such an
optimization; an appropriate asm redirection of lroundl to __lroundl
is thus added to that file (it's not needed anywhere else).
Tested for x86_64, and with build-many-glibcs.py.
* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect
using MATH_REDIRECT.
* sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before
header inclusion.
* sysdeps/aarch64/fpu/s_roundf.c: Likewise.
* sysdeps/ieee754/dbl-64/s_round.c: Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise.
* sysdeps/ieee754/float128/s_roundf128.c: Likewise.
* sysdeps/ieee754/flt-32/s_roundf.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
* sysdeps/riscv/rvf/s_roundf.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise.
(round): Redirect to __round.
(__roundl): Call round instead of __round.
* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round):
Remove macro.
[_ARCH_PWR5X] (__roundf): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round
functions instead of __round variants.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive):
Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive):
Likewise.
* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to
__lroundl.
(__ieee754_expl): Call roundl instead of __roundl.
The function f1a, executed on a stack of size 32k, allocates an object of
size 32k on the stack. Make the stack variables static to reduce
excessive stack usage.
Continuing bits/mman.h unification between architectures using the
Linux kernel, this patch arranges for the common set of MAP_* flags to
be used by two more architectures. That common set is moved to
bits/mman-map-flags-generic.h, which is included by bits/mman.h, to
allow architectures to use that common set even if they also have
architecture-specific additions to it. As well as the generic
bits/mman.h, the versions for x86 and ia64 are also then made to
include bits/mman-map-flags-generic.h, so while they still need
architecture-specific bits/mman.h (for MAP_32BIT and MAP_GROWSUP
respectively), they do not need to duplicate the generic flag
definitions in there.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/mman-map-flags-generic.h: New
file. Most contents moved from ....
* sysdeps/unix/sysv/linux/bits/mman.h: ... here. Move contents to
and include <bits/mman-map-flags-generic.h>.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/mman-map-flags-generic.h.
* sysdeps/unix/sysv/linux/ia64/bits/mman.h: Include
<bits/mman-map-flags-generic.h>.
[__USE_MISC] (MAP_GROWSUP): Only define this macro, not other
macros defined in <bits/mman-map-flags-generic.h>.
* sysdeps/unix/sysv/linux/x86/bits/mman.h: Include
<bits/mman-map-flags-generic.h>.
[__USE_MISC] (MAP_32BIT): Only define this macro, not other macros
defined in <bits/mman-map-flags-generic.h>.
Currently, DT_TEXTREL is incompatible with IFUNC. When DT_TEXTREL or
DF_TEXTREL is seen, the dynamic linker calls __mprotect on the segments
with PROT_READ|PROT_WRITE before applying dynamic relocations. It leads
to segfault when performing IFUNC resolution (which requires PROT_EXEC
as well for the IFUNC resolver).
This patch makes it call __mprotect with extra PROT_WRITE bit, which
will keep the PROT_EXEC bit if exists, and thus fixes the segfault.
FreeBSD rtld libexec/rtld-elf/rtld.c (reloc_textrel_prot) does the same.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
sparc64-linux-gnu, sparcv9-linux-gnu, and armv8-linux-gnueabihf.
Adam J. Richte <adam_richter2004@yahoo.com>
Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fangrui Song <maskray@google.com>
[BZ #20480]
* config.h.in (CAN_TEXTREL_IFUNC): New define.
* configure.ac: Add check if linker supports textrel relocation with
ifunc.
* elf/dl-reloc.c (_dl_relocate_object): Use all required flags on
DT_TEXTREL segments, not only PROT_READ and PROT_WRITE.
* elf/Makefile (ifunc-pie-tests): Add tst-ifunc-textrel.
(CFLAGS-tst-ifunc-textrel.c): New rule.
* elf/tst-ifunc-textrel.c: New file.
This patch completes the process of unifying sys/procfs.h headers for
architectures using the Linux kernel by making alpha use the generic
version.
That was previously deferred because alpha has different definitions
of prgregset_t and prfpregset_t from other architectures, so changing
to the common definitions would change C++ name mangling. To avoid
such a change, a header bits/procfs-prregset.h is added, and alpha
gets its own version of that header.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/sys/procfs.h: Include
<bits/procfs-prregset.h>.
(prgregset_t): Define using __prgregset_t.
(prfpregset_t): Define using __prfpregset_t.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/procfs-prregset.h.
* sysdeps/unix/sysv/linux/bits/procfs-prregset.h: New file.
* sysdeps/unix/sysv/linux/alpha/bits/procfs-prregset.h: Likewise.
* sysdeps/unix/sysv/linux/alpha/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/alpha/sys/procfs.h: Remove file.
This patch continues the process of unifying sys/procfs.h headers for
architectures using the Linux kernel.
A bits/procfs-id.h header is added to define __pr_uid_t and __pr_gid_t
for the types of pr_uid and pr_gid; the default version of this header
uses unsigned int. On some architectures, sys/procfs.h has copies of
32-bit structures for 64-bit builds; those move into a
bits/procfs-extra.h header (they can't go in bits/procfs.h because
they have to come *after* other declarations from sys/procfs.h).
Given appropriate versions of these headers, six more architectures
can then move to providing only bits/procfs*.h without duplicating the
rest of the contents of sys/procfs.h. Only alpha needs a further
bits/ header to be added before it can stop having its own
sys/procfs.h.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/sys/procfs.h: Include
<bits/procfs-id.h> and <bits/procfs-extra.h>.
(struct elf_prpsinfo): Use __pr_uid_t and __pr_gid_t as types of
pr_uid and pr_gid.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/procfs-id.h and bits/procfs-extra.h.
* sysdeps/unix/sysv/linux/bits/procfs-extra.h: New file.
* sysdeps/unix/sysv/linux/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/arm/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/arm/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/procfs-extra.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/procfs-extra.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/procfs-id.h: Likewise.
* sysdeps/unix/sysv/linux/x86/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/arm/sys/procfs.h: Remove file.
* sysdeps/unix/sysv/linux/m68k/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/s390/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sh/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/x86/sys/procfs.h: Likewise.
As per recent discussions, this patch unifies some of the sys/procfs.h
headers for architectures using the Linux kernel, producing a generic
version that can hopefully be used by all new architectures as well.
The new generic version is based on the AArch64 one. The register
definitions, the only part that generally needs to vary by
architecture, go in a new bits/procfs.h header (which each
architecture using the generic version needs to provide); that header
also has any #includes that were in the architecture-specific
sys/procfs.h, where those includes went beyond the generic set.
The generic version is used for eight architectures where the generic
definitions were the same as the architecture-specific ones. (Some of
those architectures had #if 0 fields, now removed; some defined types
or fields using different type names which were typedefs for the same
underlying types.)
Six of the remaining architectures with their own sys/procfs.h use
unsigned short for pr_uid / pr_gid in some cases; moving those to the
generic header will require a bits/ header to define a typedef for the
type of those fields. In the case of alpha, the generic sys/procfs.h
uses elf_gregset_t (= unsigned long int[33]) to define prgregset_t and
elf_fpregset_t (= double[32]) to define prfpregset_t, but the alpha
version uses gregset_t (= long int[33]) and fpregset_t (= long
int[32]), so avoiding unnecessarily changing the underlying types (and
thus C++ name mangling) again means a bits/ header will need to be
able to define a different choice for those typedefs.
bits/procfs.h is included outside the __BEGIN_DECLS / __END_DECLS pair
(whereas the definitions it contains were previously inside that pair
in various sys/procfs.h headers), because it sometimes includes other
headers and putting those other #includes inside that pair seems
risky. Because none of the declarations in bits/procfs.h are of
functions or variables or involve function types, I don't think it
makes any difference whether they are inside or outside an extern "C"
context.
Tested with build-many-glibcs.py (again, that does not provide much
validation for the correctness of this patch).
* sysdeps/unix/sysv/linux/sys/procfs.h: Replace with file based on
AArch64 version. Include <bits/procfs.h>.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) = misc]
(sysdep_headers): Add bits/procfs.h.
* sysdeps/unix/sysv/linux/bits/procfs.h: New file.
* sysdeps/unix/sysv/linux/aarch64/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/mips/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/bits/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/aarch64/sys/procfs.h: Remove file.
* sysdeps/unix/sysv/linux/hppa/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/ia64/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/mips/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/sys/procfs.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/sys/procfs.h: Likewise.
The variables __gconv_path_elem, __gconv_max_path_elem_len and function
__gconv_get_path declared in, as well as the type path_elem and macro
GCONV_NCHAR_GOAL defined in gconv_int.h are all used in only one iconv
compilation unit each. In addition, the extern declaration of the variable
__gconv_nmodules refers to a variable that does not exist any more.
Considering this, these symbols do not need to be exposed via a header file.
This patch removes the extern declarations from the header file and moves
the definitions to the compilation units where they are used.
For architectures and ABIs that are added in version 2.29 or later the
option --enable-obsolete-nsl is no longer available, and no libnsl
compatibility library is built.
Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
instead it suspend and resume it when leaving the kernel. The
side-effects of the syscall will always remain visible, even if the
transaction is aborted. This is an issue when transaction is used along
with futex syscall, on pthread_cond_wait for instance, where the futex
call might succeed but the transaction is rolled back leading the
pthread_cond object in an inconsistent state.
Glibc used to prevent it by always aborting a transaction before issuing
a syscall. Linux 4.2 also decided to abort active transaction in
syscalls which makes the glibc workaround superfluous. Worse, glibc
transaction abortion leads to a performance issue on recent kernels
where the HTM state is saved/restore lazily (v4.9). By aborting a
transaction on every syscalls, regardless whether a transaction has being
initiated before, GLIBS makes the kernel always save/restore HTM state
(it can not even lazily disable it after a certain number of syscall
iterations).
Because of this shortcoming, Transactional Lock Elision is just enabled
when it has been explicitly set (either by tunables of by a configure
switch) and if kernel aborts HTM transactions on syscalls
(PPC_FEATURE2_HTM_NOSC). It is reported that using simple benchmark [1],
the context-switch is about 5% faster by not issuing a tabort in every
syscall in newer kernels.
Checked on powerpc64le-linux-gnu with 4.4.0 kernel (Ubuntu 16.04).
* NEWS: Add note about new TLE support on powerpc64le.
* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
__ununsed1.
(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
* sysdeps/powerpc/powerpc32/sysdep.h,
sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
ABORT_TRANSACTION): Remove macros.
* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
usage.
* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.
Reported-by: Breno Leitão <leitao@debian.org>
Synchronize some values with CLDR and apply some suggestions from Bugzilla.
[BZ #10425]
* localedata/locales/it_IT (d_t_fmt): Use "%a %-d %b %Y, %T".
(date_fmt): Use "%a %-d %b %Y, %T, %Z".
* localedata/locales/it_CH (d_t_fmt): Use "%a %-d %b %Y, %T"
which is the same as in it_IT.
(d_fmt): Use "%d.%m.%Y" which is the same as in de_CH.
(date_fmt): Use "%a %-d %b %Y, %T, %Z" which is the same as in it_IT.
I noticed that sysdeps/x86/cpu-features.h had conditionals on whether
to define HAS_CPUID, HAS_I586 and HAS_I686 with a long list of
preprocessor macros for i686-and-later processors which however was
out of date. This patch avoids the problem of the list getting out of
date by instead having conditionals on all the (few, old) pre-i686
processors for which GCC has preprocessor macros, rather than the
(many, expanding list) i686-and-later processors. It seems HAS_I586
and HAS_I686 are unused so the only effect of these macros being
missing is that 32-bit glibc built for one of these processors would
end up doing runtime detection of CPUID availability.
i386 builds are prevented by a configure test so there is no need to
allow for them here. __geode__ (no long nops?) and __k6__ (no CMOV,
at least according to GCC) are conservatively handled as i586, not
i686, here (as noted above, this is a theoretical distinction at
present in that only HAS_CPUID appears to be used).
Tested for x86.
* sysdeps/x86/cpu-features.h [__geode__ || __k6__]: Handle like
[__i586__ || __pentium__].
[__i486__]: Handle explicitly.
(HAS_CPUID): Define to 1 if above macros are undefined.
(HAS_I586): Likewise.
(HAS_I686): Likewise.
If the compiler reduces the stack usage in function f1 before calling
into function f2, then when we swapcontext back to f1 and continue
execution we may overwrite registers that were spilled to the stack
while f2 was executing. Later when we return to f2 the corrupt
registers will be reloaded from the stack and the test will crash. This
was most commonly observed on i686 with __x86.get_pc_thunk.dx and
needing to save and restore $edx. Overall i686 has few registers and
the spilling to the stack is bound to happen, therefore the solution to
making this test robust is to split function f1 into two parts f1a and
f1b, and allocate f1b it's own stack such that subsequent execution does
not overwrite the stack in use by function f2.
Tested on i686 and x86_64.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
[BZ #23603]
* include/time.h (__mktime_internal): The localtime offset is now
of type long int instead of time_t. This is the longstanding type
in glibc, and it is more than enough to represent difference
between localtime and gmtime even if it is 32 bits and time_t is
64. Changing it now will let us avoid an unnecessary change when
time_t is widened to 64 bits on 32-bit platforms.
* time/mktime-internal.h (mktime_offset_t): Now long int.
[BZ #23603][BZ #16346]
This fixes some obscure problems with integer overflow.
Although it looks scary, it is almost all a byte-for-byte copy
from Gnulib, and the Gnulib code has been tested reasonably well.
* include/intprops.h: New file, copied from Gnulib.
* include/verify.h, time/mktime-internal.h:
New tiny files, simplified from Gnulib.
* time/mktime.c: Copy from Gnulib. This has the following changes:
Do not include config.h if DEBUG_MKTIME is nonzero.
Include stdbool.h, intprops.h, verify.h.
Include string.h only if needed.
Include stdlib.h on MS-Windows.
Include mktime-internal.h.
(DEBUG_MKTIME): Default to 0, and simplify later uses.
(NEED_MKTIME_INTERNAL, NEED_MKTIME_WINDOWS)
(NEED_MKTIME_WORKING): Give default values to pacify -Wundef,
which glibc uses. Default NEED_MKTIME_WORKING to DEBUG_MKTIME, to
simplify later conditionals; default the others to zero. Use
these conditionals to express only the code needed on the current
platform. In uses of these conditionals, explicitly spell out how
_LIBC affects things, so it’s easier to review from a glibc
viewpoint.
(WRAPV): Remove; no longer needed now that we have
systematic overflow checking.
(my_tzset, __tzset) [!_LIBC]: New function and macro, to better
compartmentalize tzset issues. Move system-dependent tzsettish
code here from mktime.
(verify): Remove; now done by verify.h. All uses changed.
(long_int): Use a more-conservative definition, to avoid
integer overflow.
(SHR): Remove, replacing with ...
(shr): New function, which means we needn’t worry about side
effects in args, and conversion analysis is simpler.
(TYPE_IS_INTEGER, TYPE_TWOS_COMPLEMENT, TYPE_SIGNED, TYPE_MINIMUM)
(TYPE_MAXIMUM, TIME_T_MIN, TIME_T_MAX, TIME_T_MIDPOINT)
(time_t_avg, time_t_add_ok): Remove.
(mktime_min, mktime_max): New constants.
(leapyear, isdst_differ): Use bool for booleans.
(ydhms_diff, guess_time_tm, ranged_convert, __mktime_internal):
Use long_int, not time_t, for mktime differences.
(long_int_avg): New function, replacing time_t_avg.
INT_ADD_WRAPV replaces time_t_add_ok.
(guess_time_tm): 6th arg is now long_int, not time_t const *.
All uses changed.
(convert_time): New function.
(ranged_convert): Use it.
(__mktime_internal): Last arg now points to mktime_offset_t, not
time_t. All uses changed. This is a no-op on glibc, where
mktime_offset_t is always time_t. Use int, not time_t, for UTC
offset guess. Directly check for integer overflow instead of
using a heuristic that works only 99.9...% of the time.
Access *OFFSET only once, to avoid an unlikely race if the
compiler delays a load and if this cascades into a signed integer
overflow.
(mktime): Move tzsettish code to my_tzset, and move
localtime_offset to within mktime so that it doesn’t
need a separate ifdef.
(main) [DEBUG_MKTIME]: Speed up by using localtime_r
instead of localtime.
* time/timegm.c: Copy from Gnulib. This has the following changes:
Include mktime-internal.h.
[!_LIBC]: Include config.h and time.h. Do not include
timegm.h or time_r.h. Make __mktime_internal a macro,
and include mktime-internal.h to get its declaration.
(timegm): Temporary is now mktime_offset_t, not time_t.
This affects only Gnulib.
The generic strstr in GLIBC 2.28 fails to match huge needles. The optimized
AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
repeatedly checking for the end of the string. However if the needle length
is larger than this, two_way_long_needle may confuse this as meaning the end
of the string and return NULL. This is fixed by adding the needle length to
the amount to read ahead.
[BZ #23637]
* string/test-strstr.c (pr23637): New function.
(test_main): Add tests with longer needles.
* string/strcasestr.c (AVAILABLE): Fix readahead distance.
* string/strstr.c (AVAILABLE): Likewise.
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent). The __exp1 internal symbol
is no longer necessary.
There is separate code path when fma is not available but the worst case
error is about 0.54 ULP in both cases. The lookup table and consts for
log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on
aarch64. The non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.
* NEWS: Mention pow improvements.
* math/Makefile (type-double-routines): Add e_pow_log_data.
* sysdeps/generic/math_private.h (__exp1): Remove.
* sysdeps/i386/fpu/e_pow_log_data.c: New file.
* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
contraction.
* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
(exp_inline): Remove.
(__ieee754_exp): Only single double input is handled.
* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
(__pow_log_data): Define.
* sysdeps/ieee754/dbl-64/upow.h: Remove.
* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
contraction.
(CFLAGS-e_pow-fma4.c): Likewise.
Many bits/mman.h headers for Linux architectures have exactly the same
contents, up to whitespace, comments and the number of leading 0s on
constants. Specifically, this applies to architectures that, in the
Linux kernel, either have no uapi/asm/mman.h, or have one that
includes asm-generic/mman.h without any changes or additions relevant
to glibc (this last case is the one that applies to Arm).
It's not useful to have to duplicate the set of MAP_* constants in
glibc for all such architectures and any new architectures with that
property. Thus, this patch creates a generic
sysdeps/unix/sysv/linux/bits/mman.h and removes all the
architecture-specific versions that become unnecessary.
Further unification remains possible after this patch. For example,
the new bits/mman.h could become bits/mman-map-flags-generic.h so that
it could also be used by architecture-specific bits/mman.h headers on
architectures that use the generic flags but add architecture-specific
ones to them. That would allow this common set of MAP_* definitions
to be used on ia64 and x86 as well (architectures that include
asm-generic/mman.h from their own uapi/asm/mman.h but define
additional MAP_* values of their own).
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/bits/mman.h: New file.
* sysdeps/unix/sysv/linux/aarch64/bits/mman.h: Remove.
* sysdeps/unix/sysv/linux/arm/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/riscv/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/s390/bits/mman.h: Likewise.
* sysdeps/unix/sysv/linux/sh/bits/mman.h: Likewise.
The ldbl-128ibm implementations of ceill and floorl call the
corresponding double functions. This patch fixes those
implementations to call those functions as ceil and floor rather than
as __ceil and __floor, so that the proper inlining takes place when
possible, while including local asm redirections for when the
functions are not inlined since NO_MATH_REDIRECT applies to the double
functions as well as to the long double ones.
Tested with build-many-glibcs.py for all its powerpc configurations.
* sysdeps/ieee754/ldbl-128ibm/s_ceill.c (ceil): Redirect to
__ceil.
(__ceill): Call ceil instead of __ceil.
* sysdeps/ieee754/ldbl-128ibm/s_floorl.c (floor): Redirect to
__floor.
(__floorl): Call floor instead of __floor.
As of Linux 4.17, siginfo headers in the Linux kernel have been
largely unified across architectures (so various constants are defined
with common values in include/uapi/asm-generic/siginfo.h even if not
all architectures can generate those particular constants).
This patch makes glibc reflect that unification and the current set of
constants in that header as of Linux 4.18. Various constants are
added to bits/siginfo-consts.h (under the same feature test macro
conditions as the other constants with the same prefix), and removed
from the ia64 bits/siginfo-consts-arch.h where they were previously
there - this is not limited to constants added by the unification.
Nothing is done about macros that are defined in
include/uapi/asm-generic/siginfo.h with names with leading '__' (some
of those are ia64-specific ones that remain in the ia64
bits/siginfo-consts-arch.h without the leading '__' there).
A consequence of these changes is that TRAP_HWBKPT becomes available
on AArch64 and all other architectures as requested in bug 21286.
Tested for x86_64; tested with build-many-glibcs.py for ia64.
[BZ #21286]
* sysdeps/unix/sysv/linux/bits/siginfo-consts.h (SI_DETHREAD): New
constant.
[__USE_XOPEN_EXTENDED || __USE_XOPEN2K8] (ILL_BADIADDR): Likewise.
[__USE_XOPEN_EXTENDED || __USE_XOPEN2K8] (FPE_FLTUNK): Likewise.
[__USE_XOPEN_EXTENDED || __USE_XOPEN2K8] (FPE_CONDTRAP): Likewise.
[__USE_XOPEN_EXTENDED || __USE_XOPEN2K8] (SEGV_ACCADI): Likewise.
[__USE_XOPEN_EXTENDED || __USE_XOPEN2K8] (SEGV_ADIDERR): Likewise.
[__USE_XOPEN_EXTENDED || __USE_XOPEN2K8] (SEGV_ADIPERR): Likewise.
[__USE_XOPEN_EXTENDED] (TRAP_BRANCH): Likewise.
[__USE_XOPEN_EXTENDED] (TRAP_HWBKPT): Likewise.
[__USE_XOPEN_EXTENDED] (TRAP_UNK): Likweise.
* sysdeps/unix/sysv/linux/ia64/bits/siginfo-consts-arch.h
(ILL_BADIADDR): Remove constant.
(TRAP_BRANCH): Likewise.
(TRAP_HWBKPT): Likewise.
As discussed at
<https://sourceware.org/ml/libc-alpha/2018-09/msg00191.html> and
followup discussions, the MIPS n32 definitions of pr_sigpend and
pr_sighold in struct elf_prstatus, and pr_flag in struct elf_prpsinfo,
are wrong to use unsigned long long int; actual n32 core dumps use a
32-bit type there, so userspace unsigned long int is correct for all
MIPS ABIs. This patch removes the conditionals (also thereby aligning
the structures with other architectures and so facilitating future
unification of different versions of this header).
Tested with build-many-glibcs.py for its MIPS configurations.
[BZ #23656]
* sysdeps/unix/sysv/linux/mips/sys/procfs.h (struct elf_prstatus):
Remove [_MIPS_SIM = _ABIN32] conditional case.
(struct elf_prpsinfo): Likewise.
As noted in
<https://sourceware.org/ml/libc-alpha/2018-09/msg00178.html>, glibc's
sys/procfs.h headers for microblaze, mips (n64), nios2 and riscv have
incorrect types for the pr_uid and pr_gid members of struct
elf_prpsinfo (as does the generic Linux version, but nothing uses
that).
This patch fixes those headers to use unsigned int. The generic Linux
version is also fixed, but I do *not* recommend making new
architectures use it yet. Rather, I think it should be reworked to
look more like a copy of the AArch64 version, but with a new
<bits/procfs.h> header included to provide register set definitions;
<bits/procfs.h> would then be architecture-specific while many
architectures could use the generic <sys/procfs.h>. This fix is
deliberately separate from any reworking to use a generic header more,
since it's possible there could be uses for backporting this fix but
not for backporting a subsequent cleanup.
Tested with build-many-glibcs.py. This of course doesn't provide much
validation of the structure layout; if the Linux kernel is fixed so
that "#include <linux/elfcore.h>" actually compiles with the headers
from "make headers_install" (and if the layout in both headers is
meant to be the same, whatever ABI we are building for), I have a test
that can be added to glibc to check the layout against that from the
Linux kernel.
[BZ #23649]
* sysdeps/unix/sysv/linux/microblaze/sys/procfs.h (struct
elf_prpsinfo): Use unsigned int for pr_uid and pr_gid.
* sysdeps/unix/sysv/linux/mips/sys/procfs.h (struct elf_prpsinfo):
Likewise.
* sysdeps/unix/sysv/linux/nios2/sys/procfs.h (struct
elf_prpsinfo): Likewise.
* sysdeps/unix/sysv/linux/riscv/sys/procfs.h (struct
elf_prpsinfo): Likewise.
* sysdeps/unix/sysv/linux/sys/procfs.h (struct elf_prpsinfo):
Likewise.
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __rint functions to call the
corresponding rint names instead, with asm redirection to __rint when
the calls are not inlined. The x86_64 math_private.h is removed as no
longer useful after this patch.
This patch is relative to a tree with my floor patch
<https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied,
and much the same considerations arise regarding possibly replacing an
IFUNC call with a direct inline expansion.
Tested for x86_64, and with build-many-glibcs.py.
* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect
using MATH_REDIRECT.
* sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before
header inclusion.
* sysdeps/aarch64/fpu/s_rintf.c: Likewise.
* sysdeps/alpha/fpu/s_rint.c: Likewise.
* sysdeps/alpha/fpu/s_rintf.c: Likewise.
* sysdeps/i386/fpu/s_rintl.c: Likewise.
* sysdeps/ieee754/dbl-64/s_rint.c: Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise.
* sysdeps/ieee754/float128/s_rintf128.c: Likewise.
* sysdeps/ieee754/flt-32/s_rintf.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
* sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise.
* sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise.
* sysdeps/powerpc/fpu/s_rint.c: Likewise.
* sysdeps/powerpc/fpu/s_rintf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_rint.c: Likewise.
* sysdeps/riscv/rvf/s_rintf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/x86_64/fpu/math_private.h: Remove file.
* math/e_scalb.c (invalid_fn): Use rint functions instead of
__rint variants.
* math/e_scalbf.c (invalid_fn): Likewise.
* math/e_scalbl.c (invalid_fn): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
Likewise.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
Likewise.
* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
* sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
Similar to the changes that were made to call sqrt functions directly
in glibc, instead of __ieee754_sqrt variants, so that the compiler
could inline them automatically without needing special inline
definitions in lots of math_private.h headers, this patch makes libm
code call floor functions directly instead of __floor variants,
removing the inlines / macros for x86_64 (SSE4.1) and powerpc
(POWER5).
The redirection used to ensure that __ieee754_sqrt does still get
called when the compiler doesn't inline a built-in function expansion
is refactored so it can be applied to other functions; the refactoring
is arranged so it's not limited to unary functions either (it would be
reasonable to use this mechanism for copysign - removing the inline in
math_private_calls.h but also eliminating unnecessary local PLT entry
use in the cases (powerpc soft-float and e500v1, for IBM long double)
where copysign calls don't get inlined).
The point of this change is that more architectures can get floor
calls inlined where they weren't previously (AArch64, for example),
without needing special inline definitions in their math_private.h,
and existing such definitions in math_private.h headers can be
removed.
Note that it's possible that in some cases an inline may be used where
an IFUNC call was previously used - this is the case on x86_64, for
example. I think the direct calls to floor are still appropriate; if
there's any significant performance cost from inline SSE2 floor
instead of an IFUNC call ending up with SSE4.1 floor, that indicates
that either the function should be doing something else that's faster
than using floor at all, or it should itself have IFUNC variants, or
that the compiler choice of inlining for generic tuning should change
to allow for the possibility that, by not inlining, an SSE4.1 IFUNC
might be called at runtime - but not that glibc should avoid calling
floor internally. (After all, all the same considerations would apply
to any user program calling floor, where it might either be inlined or
left as an out-of-line call allowing for a possible IFUNC.)
Tested for x86_64, and with build-many-glibcs.py.
* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT):
New macro.
[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
&& !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise.
[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
&& !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise.
[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
&& !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise.
[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
&& !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT.
[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
&& !NO_MATH_REDIRECT] (floor): Likewise.
* sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before
header inclusion.
* sysdeps/aarch64/fpu/s_floorf.c: Likewise.
* sysdeps/ieee754/dbl-64/s_floor.c: Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise.
* sysdeps/ieee754/float128/s_floorf128.c: Likewise.
* sysdeps/ieee754/flt-32/s_floorf.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_floor.c: Likewise.
* sysdeps/riscv/rvf/s_floorf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise.
* sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor):
Remove macro.
[_ARCH_PWR5X] (__floorf): Likewise.
* sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove
inline function.
[__SSE4_1__] (__floorf): Likewise.
* math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions
instead of __floor variants.
* math/w_lgamma_r_compat.c (__lgamma_r): Likewise.
* math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise.
* math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise.
* math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise.
* math/w_lgammal_r_compat.c (__lgammal_r): Likewise.
* math/w_tgamma_compat.c (__tgamma): Likewise.
* math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise.
* math/w_tgammaf_compat.c (__tgammaf): Likewise.
* math/w_tgammal_compat.c (__tgammal): Likewise.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise.
* sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2):
Likewise.
* sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise.
* sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise.
* sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise.
* sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl):
Likewise.
* sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise.
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise.
* sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise.
* sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise.
* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
I'm testing a patch to let the compiler expand calls to floor in libm
as built-in function calls as much as possible, instead of calling
__floor, so that no architecture-specific __floor inlines are needed,
and then to arrange for non-inlined calls to end up calling __floor,
as done with sqrt and __ieee754_sqrt.
This shows up elf/tst-relsort1mod2.c calling floor, which must not be
converted to a call to __floor. Now, while an IS_IN (libm)
conditional could be added to the existing conditionals on such
redirections in include/math.h, the _ISOMAC conditional ought to
suffice (code in other glibc libraries shouldn't be calling floor or
sqrt anyway, as they aren't provided in libc and the other libraries
don't link with libm). But while tests are mostly now built with
_ISOMAC defined, test modules in modules-names aren't unless also
listed in modules-names-tests.
As far as I can see, all the modules in modules-names in elf/ are in
fact parts of tests and so listing them in modules-names-tests is
appropriate, so they get built with something closer to the headers
used for user code, except in a few cases that actually rely on
something from internal headers. This patch duly sets
modules-names-tests there accordingly (filtering out those tests that
fail to build without internal headers).
Tested for x86_64, and with build-many-glibcs.py.
* elf/Makefile (modules-names-tests): New variable.
Similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c)
where the last term is approximated by a polynomial of x/c - 1, the first
order coefficient is about 1/ln2 in this case.
There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, for which the table size is doubled.
The worst case error is 0.547 ULP (0.55 without fma), the read only
global data size is 1168 bytes (2192 without fma) on aarch64. The
non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
log2 thruput: 2.00x in [0.01 11.1]
log2 latency: 2.04x in [0.01 11.1]
log2 thruput: 2.17x in [0.999 1.001]
log2 latency: 2.88x in [0.999 1.001]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linxu-gnu (defined __FP_FAST_FMA)
targets.
* NEWS: Mention log2 improvements.
* math/Makefile (type-double-routines): Add e_log2_data.
* sysdeps/i386/fpu/e_log2_data.c: New file.
* sysdeps/ia64/fpu/e_log2_data.c: New file.
* sysdeps/ieee754/dbl-64/e_log2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_log2_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add.
* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove.
* sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.
Optimized log using carefully generated lookup table with 1/c and log(c)
values for small intervalls around 1. The log(c) is very near a double
precision value, it has about 62 bits precision. The algorithm is
log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is
approximated by a polynomial of x/c - 1. Near 1 a single polynomial of
x - 1 is used.
There is separate code path when fma instruction is not available for
computing x/c - 1 precisely, in which case the table size is doubled.
The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined
as an instruction.
With the default configuration settings the worst case error is 0.519 ULP
(and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma).
The non-nearest rounding error is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
log thruput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log thruput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]
Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)
targets.
* NEWS: Mention log improvement.
* math/Makefile (type-double-routines): Add e_log_data.
* sysdeps/i386/fpu/e_log_data.c: New file.
* sysdeps/ia64/fpu/e_log_data.c: New file.
* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
* sysdeps/ieee754/dbl-64/ulog.h: Remove.
* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
Wrapping the _start function with ENTRY and END to insert ENDBR32 at
function entry when CET is enabled. Since _start now includes CFI,
without "cfi_undefined (eip)", unwinder may not terminate at _start
and we will get
Program received signal SIGSEGV, Segmentation fault.
0xf7dc661e in ?? () from /lib/libgcc_s.so.1
Missing separate debuginfos, use: dnf debuginfo-install libgcc-8.2.1-3.0.fc28.i686
(gdb) bt
#0 0xf7dc661e in ?? () from /lib/libgcc_s.so.1
#1 0xf7dc7c18 in _Unwind_Backtrace () from /lib/libgcc_s.so.1
#2 0xf7f0d809 in __GI___backtrace (array=array@entry=0xffffc7d0,
size=size@entry=20) at ../sysdeps/i386/backtrace.c:127
#3 0x08049254 in compare (p1=p1@entry=0xffffcad0, p2=p2@entry=0xffffcad4)
at backtrace-tst.c:12
#4 0xf7e2a28c in msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0,
n=n@entry=2) at msort.c:65
#5 0xf7e29f64 in msort_with_tmp (n=2, b=0xffffcad0, p=0xffffca5c)
at msort.c:53
#6 msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0, n=n@entry=5)
at msort.c:53
#7 0xf7e29f64 in msort_with_tmp (n=5, b=0xffffcad0, p=0xffffca5c)
at msort.c:53
#8 msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0, n=n@entry=10)
at msort.c:53
#9 0xf7e29f64 in msort_with_tmp (n=10, b=0xffffcad0, p=0xffffca5c)
at msort.c:53
#10 msort_with_tmp (p=p@entry=0xffffca5c, b=b@entry=0xffffcad0, n=n@entry=20)
at msort.c:53
#11 0xf7e2a5b6 in msort_with_tmp (n=20, b=0xffffcad0, p=0xffffca5c)
at msort.c:297
#12 __GI___qsort_r (b=b@entry=0xffffcad0, n=n@entry=20, s=s@entry=4,
cmp=cmp@entry=0x8049230 <compare>, arg=arg@entry=0x0) at msort.c:297
#13 0xf7e2a84d in __GI_qsort (b=b@entry=0xffffcad0, n=n@entry=20, s=s@entry=4,
cmp=cmp@entry=0x8049230 <compare>) at msort.c:308
#14 0x080490f6 in main (argc=2, argv=0xffffcbd4) at backtrace-tst.c:39
FAIL: debug/backtrace-tst
[BZ #23606]
* sysdeps/i386/start.S: Include <sysdep.h>
(_start): Use ENTRY/END to insert ENDBR32 at entry when CET is
enabled. Add cfi_undefined (eip).
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
The x86_64 math_private.h has asm versions of the macros to
reinterpret between floating-point and integer types.
This is the sort of thing we now strongly discourage; the expectation
in such cases, where the generic C code gives the compiler all the
information needed about the required semantics, is that you should
get the compiler to do the right thing for the generic C code rather
than writing an asm version.
Trivial tests showed GCC generates the expected single instructions
for reinterpretation from floating point to integer. In the other
direction, it goes via memory when the asms don't; I asked about this
in GCC bug 87236 and was advised this was deliberate for generic
tuning because it was faster that way on some AMD processors (but
-mtune=intel, and -Os with the latest GCC, avoid going via memory).
The asms don't and can't know about those tuning details, so that's
evidence that they are actually making the code worse.
This patch removes the asms accordingly. Tested for x86_64.
* sysdeps/x86_64/fpu/math_private.h (MOVD): Remove macro.
(MOVQ): Likewise.
(EXTRACT_WORDS64): Likewise.
(INSERT_WORDS64): Likewise.
(GET_FLOAT_WORD): Likewise.
(SET_FLOAT_WORD): Likewise.
Every so often we get libsanitizer or libgo builds breaking with new
glibc because of some change in the glibc headers.
glibc's build-many-glibcs.py deliberately disables libsanitizer and
GCC languages other than C and C++ because the point is to test glibc
and find glibc problems (including problems shown up by new compiler
warnings in new GCC), not to test libsanitizer or libgo; if the
compiler build fails because of libsanitizer or libgo failing to
build, that could hide the existence of new problems in glibc.
However, it seems reasonable to have a non-default mode where
build-many-glibcs.py does build those additional pieces, which this
patch adds.
Note that I do not intend to run a build-many-glibcs.py bot with this
new option. If people concerned with libsanitizer, libgo or other
potentially affected GCC libraries wish to find out about such
problems more quickly, they may wish to run such a bot or bots (and to
monitor the results and fix issues found - obviously there will be
some overlap with issues found by my bots not using that option).
Note also that building a non-native Ada compiler requires a
sufficiently recent native (or build-x-host, in general) Ada compiler
to be used, possibly more or less the same version as being built.
That needs to be in the PATH when build-many-glibcs.py --full-gcc is
run; the script does not deal with setting up such a compiler (or any
of the other host tools needed for building GCC and glibc, beyond the
GMP / MPFR / MPC libraries), but perhaps it should, to avoid the need
to keep updating such a compiler manually when running a bot.
Tested by running build-many-glibcs.py with the new option, with
mainline GCC. There are build failures for various configurations,
which may be of interest to Go / Ada people even if you're not
interested in running such a bot:
* mips64 / mips64el (all configuration): ICE building libstdc++, as
seen without using the new option
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87156>.
* aarch64_be: error building libgo (little-endian aarch64 works fine):
version.go:67:13: error: expected ';' or ')' or newline
67 | BigEndian =
| ^
version.go:67:3: error: reference to undefined name 'BigEndian'
67 | BigEndian =
| ^
* arm (all configurations): error building libgo:
/scratch/jmyers/glibc/many9/src/gcc/libgo/go/internal/syscall/unix/getrandom_linux.go:29:5: error: reference to undefined name 'randomTrap'
29 | if randomTrap == 0 {
| ^
/scratch/jmyers/glibc/many9/src/gcc/libgo/go/internal/syscall/unix/getrandom_linux.go:38:34: error: reference to undefined name 'randomTrap'
38 | r1, _, errno := syscall.Syscall(randomTrap,
| ^
What's happening there is, I think, that the arm*b*-*-* case in
libgo/configure.ac is wrongly matching arm-glibc-linux-gnueabi with
the 'b' in the vendor part, and then something else is failing to
handle GOARCH=armbe. Given that you can have configurations with
multilibs of both endiannesses, endianness should always be detected
by configure.ac, for all architectures, using a compile test of
whether __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__, not based on textual
matches to the host (= target at top-level) triplet.
* armeb (all configurations): error building libada (for some reason
the Arm libada configuration seems to do different things for EH for
big-endian, which makes no sense to me and doesn't actually work):
a-exexpr.adb:87:06: "System.Exceptions.Machine" is not a predefined library unit
a-exexpr.adb:87:06: "Ada.Exceptions (body)" depends on "Ada.Exceptions.Exception_Propagation (body)"
a-exexpr.adb:87:06: "Ada.Exceptions.Exception_Propagation (body)" depends on "System.Exceptions.Machine (spec)"
* hppa: error building libgo (same error as for aarch64_be).
* ia64: ICE building libgo. I've filed
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87281> for this.
* m68k: ICE in the Go front end building libgo
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84948>.
* microblaze, microblazeel, nios2, sh3, sh3eb: build failure in libada
for lack of a libada port to those systems (I'm not sure sh3 would
actually need anything different from sh4):
a-cbdlli.ads:38:14: violation of restriction "No_Finalization" at system.ads:47
* i686-gnu: build failure in libada, might be fixed by the patch
attached to <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81103>
(not tested):
terminals.c:1115:13: fatal error: termio.h: No such file or directory
* scripts/build-many-glibcs.py (Context.__init__): Add full_gcc
argument.
(Config.build_gcc): Use --disable-libsanitizer for first GCC
build, but not for second build if --full-gcc. Use
--enable-languages=all for second build if --full-gcc.
(get_parser): Add --full-gcc option.
(main): Update call to Context.
CLDR and many other sources say that it_IT (Italian) should use a dot
(".") as a thousands separator and a comma (",") as a decimal separator.
For it_CH and de_CH CLDR says that they should use the Right Single
Quotation Mark ("’") as a thousands separator and a dot (".") as a
decimal separator. Consequently, the same rules are copied to all other
locales in Switzerland.
These rules apply to both LC_MONETARY and LC_NUMERIC.
[BZ #10797]
* localedata/locales/de_CH (mon_thousands_sep): Use "<U2019>" (Right
Single Quotation Mark).
(thousands_sep): Likewise.
* localedata/locales/it_CH (LC_NUMERIC): Use “copy "de_CH"”.
* localedata/locales/it_IT (thousands_sep): Use ".".
(grouping): Use "3;3".
We've had issues before with build failures (with new GCC) in code
only built with --enable-obsolete-rpc or --enable-obsolete-nsl not
being reported for a while because build-many-glibcs.py does not test
those configure options. This patch adds configurations (32-bit and
64-bit) using those options so that in future we can notice quickly if
they start failing to build.
Tested the new configurations do build with GCC 8.
* scripts/build-many-glibcs.py (Context.add_all_configs): Add
x86_64 and i686 configs using --enable-obsolete-rpc
--enable-obsolete-nsl.
If glibc is built with gcc 8 and -march=z900,
the testcase posix/tst-spawn4-compat crashes with a segfault.
In function maybe_script_execute, the new_argv array is dynamically
initialized on stack with (argc + 1) elements.
The function wants to add _PATH_BSHELL as the first argument
and writes out of bounds of new_argv.
There is an off-by-one because maybe_script_execute fails to count
the terminating NULL when sizing new_argv.
ChangeLog:
* sysdeps/unix/sysv/linux/spawni.c (maybe_script_execute):
Increment size of new_argv by one.
This commit also fixes d_fmt in bn_BD which is identical to bn_IN,
in ne_NP which is identical to ne_IN (not supported by Glibc but supported
by CLDR), and in ta_LK which is identical to ta_IN.
For those locales which are supported by CLDR data is imported from
CLDR v33. For others it is copied from those locales which were identical
before this commit.
[BZ #17426]
* localedata/locales/anp_IN (d_fmt): Use "%-d//%-m//%y".
* localedata/locales/ar_IN (d_fmt): Likewise.
* localedata/locales/bhb_IN (d_fmt): Likewise.
* localedata/locales/bho_IN (d_fmt): Likewise.
* localedata/locales/bn_BD (d_fmt): Likewise.
* localedata/locales/bn_IN (d_fmt): Likewise.
* localedata/locales/doi_IN (d_fmt): Likewise.
* localedata/locales/gu_IN (d_fmt): Likewise.
* localedata/locales/hi_IN (d_fmt): Likewise.
* localedata/locales/hne_IN (d_fmt): Likewise.
* localedata/locales/kn_IN (d_fmt): Likewise.
* localedata/locales/mag_IN (d_fmt): Likewise.
* localedata/locales/mai_IN (d_fmt): Likewise.
* localedata/locales/mjw_IN (d_fmt): Likewise.
* localedata/locales/ml_IN (d_fmt): Likewise.
* localedata/locales/mni_IN (d_fmt): Likewise.
* localedata/locales/mr_IN (d_fmt): Likewise.
* localedata/locales/pa_IN (d_fmt): Likewise.
* localedata/locales/raj_IN (d_fmt): Likewise.
* localedata/locales/sat_IN (d_fmt): Likewise.
* localedata/locales/sd_IN (d_fmt): Likewise.
* localedata/locales/sd_IN@devanagari (d_fmt): Likewise.
* localedata/locales/ta_IN (d_fmt): Likewise.
* localedata/locales/ta_LK (d_fmt): Likewise.
* localedata/locales/tcy_IN (d_fmt): Likewise.
* localedata/locales/ur_IN (d_fmt): Likewise.
* localedata/locales/brx_IN (d_fmt): Use "%-m//%-d//%y".
* localedata/locales/ks_IN (d_fmt): Likewise.
* localedata/locales/ks_IN@devanagari (d_fmt): Likewise.
* localedata/locales/kok_IN (d_fmt): Use "%-d-%-m-%y".
* localedata/locales/ne_NP (d_fmt): Use "%y//%-m//%-d".
* localedata/locales/sa_IN (d_fmt): Use "%-d-%m-%y".
* localedata/locales/te_IN (d_fmt): Use "%d-%m-%y".
After some math_private.h cleanups (in particulat math-barriers.h
being split out), the only thing left in the alpha math_private.h was
macro definitions of __isnan and __isnanf, apparently (based on the
comments) intended to avoid problems with inline definitions in other
math_private.h files. Those inline definitions were removed in commit
fe8c2b33ae, and the alpha math_private.h
is no longer needed; this patch removes it.
Tested with build-many-glibcs.py that installed stripped shared
libraries for alpha are unchanged by the patch.
* sysdeps/alpha/fpu/math_private.h: Remove.
Continuing the cleanup of math_private.h, with a view to it becoming
the header for the APIs defined therein and not also a header with
inline variants of math.h APIs, this patch moves inline definitions of
__isinff128 and fabsf128 to include/math.h, so that any users of
math.h in glibc automatically get the optimized functions rather than
quietly missing them if they do not also include math_private.h.
Tested for x86_64 and x86, and with build-many-glibcs.py with GCC 6.
There are changes to installed stripped libc.so on configurations with
distinct _Float128, because of __printf_fp_l code that now gets the
__isinff128 inline where previously it called the out-of-line
function because of the lack of a math_private.h call. It seems
appropriate that this code does get the inline (as it would
automatically with GCC 7 and later when the built-in function is used)
rather than being the only place in glibc that does not.
* sysdeps/generic/math_private.h
[__HAVE_DISTINCT_FLOAT128 && !__GNUC_PREREQ (7, 0)] (__isinff128):
Move this inline function ....
[__HAVE_DISTINCT_FLOAT128] (fabsf128): And this one ....
* include/math.h [!_ISOMAC]: To here....