Locking overhead can be significant in some stdio operations
that are common in single threaded applications.
This patch adds the _IO_FLAGS2_NEED_LOCK flag to indicate if
an _IO_FILE object needs to be locked and some of the stdio
functions just jump to their _unlocked variant when not. The
flag is set on all _IO_FILE objects when the first thread is
created. A new GLIBC_PRIVATE libc symbol, _IO_enable_locks,
was added to do this from libpthread.
The optimization can be applied to more stdio functions,
currently it is only applied to single flag check or single
non-wide-char standard operations. The flag should probably
be never set for files with _IO_USER_LOCK, but that's just a
further optimization, not a correctness requirement.
The optimization is valid in a single thread because stdio
operations are non-as-safe (so lock state is not observable
from a signal handler) and stdio locks are recursive (so lock
state is not observable via deadlock). The optimization is not
valid if a thread may be created while an stdio lock is taken
and thus it should be disabled if any user code may run during
an stdio operation (interposed malloc, printf hooks, etc).
This makes the optimization more complicated for some stdio
operations (e.g. printf), but those are bigger and thus less
important to optimize so this patch does not try to do that.
* libio/libio.h (_IO_FLAGS2_NEED_LOCK, _IO_need_lock): Define.
* libio/libioP.h (_IO_enable_locks): Declare.
* libio/Versions (_IO_enable_locks): New symbol.
* libio/genops.c (_IO_enable_locks): Define.
(_IO_old_init): Initialize flags2.
* libio/feof.c.c (_IO_feof): Avoid locking when not needed.
* libio/ferror.c (_IO_ferror): Likewise.
* libio/fputc.c (fputc): Likewise.
* libio/putc.c (_IO_putc): Likewise.
* libio/getc.c (_IO_getc): Likewise.
* libio/getchar.c (getchar): Likewise.
* libio/ioungetc.c (_IO_ungetc): Likewise.
* nptl/pthread_create.c (__pthread_create_2_1): Enable stdio locks.
* libio/iofopncook.c (_IO_fopencookie): Enable locking for the file.
* sysdeps/pthread/flockfile.c (__flockfile): Likewise.
A dot-less host name without an /etc/resolv.conf file caused an
assertion failure in update_from_conf because the function would not
deal correctly with the empty search list case.
Thanks to Andreas Schwab for debugging assistence.
This patch updates build-many-glibcs.py to use the current release
branch of binutils and current releases of GMP and the Linux kernel.
* scripts/build-many-glibcs.py (Context.checkout): Default
binutils version to 2.29 branch, GMP version to 6.1.2 and Linux
kernel version to 4.12.
This commit enhances the stub resolver to reload the configuration
in the per-thread _res object if the /etc/resolv.conf file has
changed. The resolver checks whether the application has modified
_res and will not overwrite the _res object in that case.
The struct resolv_context mechanism is used to check the
configuration file only once per name lookup.
This commit adds the remaining unchanging members (which are loaded
from /etc/resolv.conf) to struct resolv_conf.
The extended name server list is currently not used by the stub
resolver. The switch depends on a cleanup: The _u._ext.nssocks
array stores just a single socket, and needs to be replaced with
a single socket value.
(The compatibility gethostname implementation does not use the
extended addres sort list, either. Updating the compat code is
not worthwhile.)
This change uses the extended resolver state in struct resolv_conf to
store the search list. If applications have not patched the _res
object directly, this extended search list will be used by the stub
resolver during name resolution.
This change provides additional resolver configuration state which
is not exposed through the _res ABI. It reuses the existing
initstamp field in the supposedly-private part of _res. Some effort
is undertaken to avoid memory safety issues introduced by applications
which directly patch the _res object.
With this commit, only the initstamp field is moved into struct
resolv_conf. Additional members will be added later, eventually
migrating the entire resolver configuration.
struct resolv_context objects provide a temporary resolver context
which does not change during a name lookup operation. Only when the
outmost context is created, the stub resolver configuration is
verified to be current (at present, only against previous res_init
calls). Subsequent attempts to obtain the context will reuse the
result of the initial verification operation.
struct resolv_context can also be extended in the future to store
data which needs to be deallocated during thread cancellation.
posix/sched_cpucount.c assumes that size of __cpu_mask == size of long,
which is incorrect for x32. This patch uses __builtin_popcount, which
is availabe in GCC 4.9, in posix/sched_cpucount.c.
Tested on i686, x86-64 and x32 with multi-arch disabled.
[BZ #21696]
* posix/sched_cpucount.c: Don't include <limits.h>.
(__sched_cpucount): Use __builtin_popcount.
In math/math.h, __MATH_TG will expand signbit to __builtin_signbit*,
e.g.: __builtin_signbitf128, before GCC 6. However, there has never
been a __builtin_signbitf128 in GCC and the type-generic builtin is
only available since GCC 6. For older GCC, this patch defines
__builtin_signbitf128 to __signbitf128, so that the internal function
is used instead of the non-existent builtin.
This patch also changes the implementation of __signbitf128, because
it was reusing the implementation of __signbitl from ldbl-128, which
calls __builtin_signbitl. Using the long double version of the
builtin is not correct on machines where _Float128 is ABI-distinct
from long double (i.e.: ia64, powerpc64le, x86, x86_84). The new
implementation does not rely on builtins when being built with GCC
versions older than 6.0.
The new code does not currently affect powerpc64le builds, because
only GCC 6.2 fulfills the requirements from configure. It might
affect powerpc64le builds if those requirements are backported to
older versions of the compiler. The new code affects x86_64 builds,
since glibc is supposed to build correctly with older versions of GCC.
Tested for powerpc64le and x86_64.
* include/math.h (__signbitf128): Define as hidden.
* sysdeps/ieee754/float128/s_signbitf128.c (__signbitf128):
Reimplement without builtins.
* sysdeps/ia64/bits/floatn.h [!__GNUC_PREREQ (6, 0)]
(__builtin_signbitf128): Define to __signbitf128.
* sysdeps/powerpc/bits/floatn.h: Likewise.
* sysdeps/x86/bits/floatn.h: Likewise.
Add a new tunable (glibc.tune.cpu) to override CPU identification on
aarch64. This is useful in two cases: one where it is desirable to
pretend to be another CPU for purposes of testing or because routines
written for that CPU are beneficial for specific workloads and second
where the underlying kernel does not support emulation of MRS to get
the MIDR of the CPU.
* elf/dl-tunables.h (tunable_is_name): Move from...
* elf/dl-tunables.c (is_name): ... here.
(parse_tunables, __tunables_init): Adjust.
* manual/tunables.texi: Document glibc.tune.cpu.
* sysdeps/aarch64/dl-tunables.list: New file.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (struct
cpu_list): New type.
(cpu_list): New list of CPU names and their MIDR.
(get_midr_from_mcpu): New function.
(init_cpu_features): Override MIDR if necessary.
The string function implementations implemented so far do not use any
instructions that may deviate from standard aarch64, so it is possible
for all routines to run on all armv8 hardware. Select all
implementations in the benchmarks and tests.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Unconditionally select thunderx
routine for testing.
GCC 7 changed the definition of max_align_t on i386:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9b5c49ef97e63cc63f1ffa13baf771368105ebe2
As a result, glibc malloc no longer returns memory blocks which are as
aligned as max_align_t requires.
This causes malloc/tst-malloc-thread-fail to fail with an error like this
one:
error: allocation function 0, size 144 not aligned to 16
This patch moves the MALLOC_ALIGNMENT definition to <malloc-alignment.h>
and increases the malloc alignment to 16 for i386.
[BZ #21120]
* malloc/malloc-internal.h (MALLOC_ALIGNMENT): Moved to ...
* sysdeps/generic/malloc-alignment.h: Here. New file.
* sysdeps/i386/malloc-alignment.h: Likewise.
* sysdeps/generic/malloc-machine.h: Include <malloc-alignment.h>.
This patch improves the default posix implementation of posix_spawn{p}
and align with Linux one. The main idea is to fix some issues already
fixed in Linux code, and deprecated vfork internal usage (source of
various bug reports). In a short:
- It moves POSIX_SPAWN_USEVFORK usage and sets it a no-op. Since
the process that actually spawn the new process do not share
memory with parent (with vfork), it fixes BZ#14750 for this
implementation.
- It uses a pipe to correctly obtain the return upon failure
of execution (BZ#18433).
- It correctly enable/disable asynchronous cancellation (checked
on ptl/tst-exec5.c).
- It correctly disable/enable signal handling.
Using this version instead of Linux shows only one regression,
posix/tst-spawn3, because of pipe2 usage which increase total
number of file descriptor.
* sysdeps/posix/spawni.c (__spawni_child): New function.
(__spawni): Rename to __spawnix.
This patch adds tgmath.h support for _Float128, so eliminating the
awkward caveat in NEWS about the type not being supported there. This
does inevitably increase the size of macro expansions (which grows
particularly fast when you have nested calls to tgmath.h macros), but
only when _Float128 is supported and the declarations of _Float128
interfaces are visible; otherwise the expansions are unchanged.
Tested for x86_64 and arm.
* math/tgmath.h: Include <bits/libc-header-start.h> and
<bits/floatn.h>.
(__TGMATH_F128): New macro.
(__TGMATH_CF128): Likewise.
(__TGMATH_UNARY_REAL_ONLY): Use __TGMATH_F128.
(__TGMATH_UNARY_REAL_RET_ONLY): Likewise.
(__TGMATH_BINARY_FIRST_REAL_ONLY): Likewise.
(__TGMATH_BINARY_FIRST_REAL_STD_ONLY): New macro.
(__TGMATH_BINARY_REAL_ONLY): Use __TGMATH_F128.
(__TGMATH_BINARY_REAL_STD_ONLY): New macro.
(__TGMATH_BINARY_REAL_RET_ONLY): Use __TGMATH_F128.
(__TGMATH_TERNARY_FIRST_SECOND_REAL_ONLY): Likewise.
(__TGMATH_TERNARY_REAL_ONLY): Likewise.
(__TGMATH_TERNARY_FIRST_REAL_RET_ONLY): Likewise.
(__TGMATH_UNARY_REAL_IMAG): Use __TGMATH_CF128.
(__TGMATH_UNARY_IMAG): Use __TGMATH_F128.
(__TGMATH_UNARY_REAL_IMAG_RET_REAL): Use __TGMATH_CF128.
(__TGMATH_BINARY_REAL_IMAG): Likewise.
(nexttoward): Use __TGMATH_BINARY_FIRST_REAL_STD_ONLY.
[__USE_MISC] (scalb): Use __TGMATH_BINARY_REAL_STD_ONLY.
* math/gen-tgmath-tests.py (Type.init_types): Enable _FloatN and
_FloatNx types if the corresponding HUGE_VAL macros are defined.
As a GNU extension, for _GNU_SOURCE glibc's complex.h provides a
clog10 function and tgmath.h supports complex arguments to the log10
macro. However, tgmath.h uses __clog10 not clog10 in defining the
macro.
There is no namespace reason (ignoring the block-scope namespace
issues that would apply equally to *every* function called by tgmath.h
macros) for using __clog10 here, since this is only for _GNU_SOURCE so
clog10 is always visible when this macro definition is used.
Furthermore, __clog10f128 is not exported, so supporting _Float128 in
tgmath.h implies using clog10 not __clog10 there. (__clog10 and
clog10 aren't used in libstdc++ either, although that library would
have a good case for using the __clog10 reserved-namespace export: the
standard C++ library includes log10 of a complex number.) This patch
duly changes the header to use clog10, and enables tests of the macro
for complex arguments.
Tested for x86_64.
* math/tgmath.h [__USE_GNU] (log10): Use clog10 not __clog10.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Test log10 for
complex arguments.
The tgmath.h totalorder and totalordermag macros wrongly return a
floating-point type. They should return int, like the underlying
functions. This patch fixes them accordingly, updating tests
including enabling tests of those functions from gen-tgmath-tests.py.
Tested for x86_64.
[BZ #21687]
* math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): New macro.
(totalorder): Use it.
(totalordermag): Likewise.
* math/gen-tgmath-tests.py (Tests.add_all_tests): Enable tests of
totalorder and totalordermag.
* math/test-tgmath.c (F(compile_test)): Do not call totalorder or
totalordermag in arguments of calls to those functions.
(NCALLS): Change to 134.