copy_file_range can't be used to copy a file from glibc source directory
to glibc build directory since they may be on different filesystems.
This patch adds xcopy_file_range for cross-device copy.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #23597]
* support/Makefile (libsupport-routines): Add
support_copy_file_range and xcopy_file_range.
* support/support.h: Include <sys/types.h>.
(support_copy_file_range): New prototype.
* support/support_copy_file_range.c: New file. Copied and
modified from io/copy_file_range-compat.c.
* support/test-container.c (copy_one_file): Call xcopy_file_rang
instead of copy_file_range.
* support/xcopy_file_range.c: New file.
* support/xunistd.h (xcopy_file_range): New prototype.
The elf/tst-dlopen-aout.c test uses asserts to verify properties of the
test execution. Instead of using assert it should use xpthread_create
and xpthread_join to catch errors starting the threads and fail the
test. This shows up in Fedora 28 when building for i686-pc-linux-gnu
and using gcc 8.1.1.
Tested on i686, and fixes a check failure with -DNDEBUG.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Initially, this function was restricted to _GNU_SOURCE, but experience
shows that compatibility with existing build systems is improved if we
declare it under _DEFAULT_SOURCE as well.
The test tries to allocate more than 2^31 bytes which will always fail on s390
as it has maximum 2^31bit of memory.
Before commit 6c3a8a9d86, this test returned
unsupported if malloc fails. This patch re enables this behaviour.
Furthermore support_delete_temp_files() failed to remove the temp directory
in this case as it is not empty due to the created symlink.
Thus the creation of the symlink is moved behind malloc.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
ChangeLog:
* stdlib/test-bz22786.c (do_test): Return EXIT_UNSUPPORTED
if malloc fails.
When converting gen-libm-test to Python, in one place I noted a bug in
the old Perl version that I preserved in the Python version so that
the generated output files were the same with both versions, as such
comparisons help give confidence in the correctness of such a rewrite
of a script. Now that the conversion has been done, this patch fixes
that bug, by arranging for tests with plus_oflow or minus_oflow
results (manually written tests in libm-test-*.inc that have
overflowing results that thus depend on the rounding mode) to be
properly treated as having non-finite results, and thus not run for
the __FINITE_MATH_ONLY__ tests. (As the affected tests in fact did
pass for __FINITE_MATH_ONLY__ testing, this is just a matter of
logical correctness in the choice of which tests run for that case,
rather than fixing any actual test failures.)
Tested for x86_64.
* math/gen-libm-test.py (gen_test_args_res): Also treat plus_oflow
and minus_oflow as non-finite.
On some architectures, the parts of math_private.h relating to the
floating-point environment are in a separate file fenv_private.h
included from math_private.h. As this is purely an
architecture-specific convention used by several architectures,
however, all such architectures still need their own math_private.h,
even if it has nothing to do beyond #include <fenv_private.h> and
peculiarity of including the i386 file directly instead of having a
shared file in sysdeps/x86.
This patch makes the fenv_private.h name an architecture-independent
convention in glibc. The include of fenv_private.h from
math_private.h becomes architecture-independent (until callers are
updated to include fenv_private.h directly so the include from
math_private.h is no longer needed). Some architecture math_private.h
headers are removed if no longer needed, or renamed to fenv_private.h
if all they define belongs in that header; architecture fenv_private.h
headers now do require #include_next <fenv_private.h>. The i386
fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is
actually shared with x86_64. The generic math_private.h gets a new
include of <stdbool.h>, as needed for bool in some prototypes in that
header (previously that was indirectly included via include/fenv.h,
which now only gets included too late in math_private.h, after those
prototypes).
Tested for x86_64 and x86, and tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/aarch64/fpu/fenv_private.h: New file. Based on ....
* sysdeps/aarch64/fpu/math_private.h: ... this file. All contents
moved to fenv_private.h except for ...
(TOINT_INTRINSICS): Kept in math_private.h.
(roundtoint): Likewise.
(converttoint): Likewise.
* sysdeps/arm/fenv_private.h: Change multiple-include guard to
[ARM_FENV_PRIVATE_H]. Include next <fenv_private.h>.
* sysdeps/arm/math_private.h: Remove.
* sysdeps/generic/fenv_private.h: New file. Contents moved from
....
* sysdeps/generic/math_private.h: ... this file. Include
<stdbool.h>. Do not include <fenv.h> or <get-rounding-mode.h>.
Include <fenv_private.h>. Remove functions and macros moved to
fenv_private.h.
* sysdeps/i386/fpu/math_private.h: Remove.
* sysdeps/mips/math_private.h: Move to ....
* sysdeps/mips/fpu/fenv_private.h: ... here. Change
multiple-include guard to [MIPS_FENV_PRIVATE_H]. Remove
[__mips_hard_float] conditional. Include next <fenv_private.h>.
* sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include
guard to [POWERPC_FENV_PRIVATE_H]. Include next <fenv_private.h>.
* sysdeps/powerpc/fpu/math_private.h: Do not include
<fenv_private.h>.
* sysdeps/riscv/rvf/math_private.h: Move to ....
* sysdeps/riscv/rvf/fenv_private.h: ... here. Change
multiple-include guard to [RISCV_FENV_PRIVATE_H]. Include next
<fenv_private.h>.
* sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard
to [SPARC_FENV_PRIVATE_H]. Include next <fenv_private.h>.
* sysdeps/sparc/fpu/math_private.h: Remove.
* sysdeps/i386/fpu/fenv_private.h: Move to ....
* sysdeps/x86/fpu/fenv_private.h: ... here. Change
multiple-include guard to [X86_FENV_PRIVATE_H]. Include next
<fenv_private.h>.
* sysdeps/x86_64/fpu/math_private.h: Do not include
<sysdeps/i386/fpu/fenv_private.h>.
As done in commit 284f42bc77, memcmp
can be used after memchr to avoid the initialization overhead of the
two-way algorithm for the first match. This has shown improvement
>40% for first match.
Completing the move of macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the EXCEPTION_SET_FORCES_TRAP macro out to its own
math-tests-trap-force.h header.
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-trap-force.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-trap-force.h>.
(EXCEPTION_SET_FORCES_TRAP): Do not define here.
* sysdeps/powerpc/math-tests.h: Remove file.
* sysdeps/powerpc/fpu/math-tests-trap-force.h: New file.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the EXCEPTION_ENABLE_SUPPORTED macro out to its own
math-tests-trap.h header.
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-trap.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-trap.h>.
(EXCEPTION_ENABLE_SUPPORTED): Do not define here.
* sysdeps/aarch64/math-tests.h: Remove file.
* sysdeps/arm/math-tests.h: Likewise.
* sysdeps/riscv/math-tests.h: Likewise.
* sysdeps/aarch64/math-tests-trap.h: New file.
* sysdeps/arm/math-tests-trap.h: Likewise.
* sysdeps/riscv/math-tests-trap.h: Likewise.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the EXCEPTION_TESTS_* macros for individual types out to their
own sysdeps header.
As with ROUNDING_TESTS_*, there is no need to define these macros if
FE_ALL_EXCEPT == 0 and the individual exception macros are undefined;
thus, math-tests-exceptions.h headers are only needed for soft-float
ARM and RISC-V, while the other cases that defined these macros do not
need to do so (and the associated math-tests.h headers are thus
removed without needing replacement by math-tests-exceptions.h
headers).
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-exceptions.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-exceptions.h>.
(EXCEPTION_TESTS_float): Do not define here.
(EXCEPTION_TESTS_double): Likewise.
(EXCEPTION_TESTS_long_double): Likewise.
(EXCEPTION_TESTS_float128): Likewise.
* sysdeps/arm/math-tests.h [__SOFTFP__] (EXCEPTION_TESTS_float):
Likewise.
[__SOFTFP__] (EXCEPTION_TESTS_double): Likewise.
[__SOFTFP__] (EXCEPTION_TESTS_long_double): Likewise.
* sysdeps/arm/nofpu/math-tests-exceptions.h: New file.
* sysdeps/m68k/coldfire/math-tests.h: Remove file.
* sysdeps/mips/math-tests.h: Likewise.
* sysdeps/nios2/math-tests.h: Likewise.
* sysdeps/riscv/math-tests.h [!__riscv_flen]
(EXCEPTION_TESTS_float): Do not define here.
[!__riscv_flen] (EXCEPTION_TESTS_double): Likewise.
[!__riscv_flen] (EXCEPTION_TESTS_long_double): Likewise.
* sysdeps/riscv/nofpu/math-tests-exceptions.h: New file.
The NEWS entry for sinf improvements is listed for 2.28, while it was
committed in 2.29, so move it there and mention tanf.
Committed as obvious.
* NEWS: Move optimized sinf entry to 2.29.
Speedup tanf range reduction by using the new sincosf range
reduction algorithm. Overall code quality is improved due to
inlining, so there is a speedup even if no range reduction is
required.
tanf throughput gains on Cortex-A72:
* |x| < M_PI_4 : 1.1x
* |x| < M_PI_2 : 1.2x
* |x| < 2 * M_PI: 1.5x
* |x| < 120.0 : 1.6x
* |x| < Inf : 12.1x
* sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.
This patch completes the move of ROUNDING_TESTS_* macros to typo-proof
conventions by stopping redefining them in test-*-vlen*.h. Instead,
libm-test-driver.c is made to check TEST_MATHVEC when setting
non-to-nearest rounding modes.
Tested for x86_64.
* math/test-double-vlen2.h: Don't include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Remove.
* math/test-double-vlen4.h: Don't include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Remove.
* math/test-double-vlen8.h: Don't include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Remove.
* math/test-float-vlen16.h: Don't include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Remove.
* math/test-float-vlen4.h: Don't include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Remove.
* math/test-float-vlen8.h: Don't include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Remove.
* math/libm-test-driver.c (IF_ROUND_INIT_FE_DOWNWARD): Check
!TEST_MATHVEC here.
(IF_ROUND_INIT_FE_TOWARDZERO): Likewise.
(IF_ROUND_INIT_FE_UPWARD): Likewise.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the ROUNDING_TESTS_* macros for individual types out to their
own sysdeps header.
In the soft-float case where FE_TONEAREST is the only rounding mode
macro defined, there is no need to define ROUNDING_TESTS_*; it is only
necessary when rounding modes macros are defined that may not be
supported at runtime. Thus, the ROUNDING_TESTS_* definitions for some
configurations are just removed, not moved to new
math-tests-rounding.h headers; the only architectures needing
math-tests-rounding.h are those where the macros are defined in
bits/fenv.h because of the possibility of a soft-float compilation
using a hard-float glibc with the same ABI (i.e., ARM and RISC-V).
The test-*-vlen*.h headers, by using #undef, do not yet follow
typo-proof conventions (but they no longer implicitly rely on being
included before math-tests.h, and this area can always be cleaned up
further in future).
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-rounding.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Do not define here.
(ROUNDING_TESTS_double): Likewise.
(ROUNDING_TESTS_long_double): Likewise.
(ROUNDING_TESTS_float128): Likewise.
* math/test-double-vlen2.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Undefine before defining.
* math/test-double-vlen4.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Undefine before defining.
* math/test-double-vlen8.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Undefine before defining.
* math/test-float-vlen16.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Undefine before defining.
* math/test-float-vlen4.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Undefine before defining.
* math/test-float-vlen8.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Undefine before defining.
* sysdeps/arm/nofpu/math-tests-rounding.h: New file.
* sysdeps/arm/math-tests.h [__SOFTFP__] (ROUNDING_TESTS_float): Do
not define here.
[__SOFTFP__] (ROUNDING_TESTS_double): Likewise.
[__SOFTFP__] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/riscv/nofpu/math-tests-rounding.h: New file.
* sysdeps/riscv/math-tests.h [!__riscv_flen]
(ROUNDING_TESTS_float): Do not define here.
[!__riscv_flen] (ROUNDING_TESTS_double): Likewise.
[!__risv_flen] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/m68k/coldfire/math-tests.h [!__mcffpu__]
(ROUNDING_TESTS_float): Likewise.
[!__mcffpu__] (ROUNDING_TESTS_double): Likewise.
[!__mcffpu__] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/mips/math-tests.h [__mips_soft_float]
(ROUNDING_TESTS_float): Likewise.
[__mips_soft_float] (ROUNDING_TESTS_double): Likewise.
[__mips_soft_float] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/nios2/math-tests.h (ROUNDING_TESTS_float): Likewise.
(ROUNDING_TESTS_double): Likewise.
(ROUNDING_TESTS_long_double): Likewise.
This patch adds the PF_XDP, AF_XDP and SOL_XDP macros from Linux 4.18 to
sysdeps/unix/sysv/linux/bits/socket.h.
* sysdeps/unix/sysv/linux/bits/socket.h (PF_MAX): Set to 45.
(PF_XDP): New macro.
(AF_XDP): New macro.
(SOL_XDP): New macro.
This patch adds constants from netinet/tcp.h in Linux 4.18, and an
associated struct tcp_zerocopy_receive, to sysdeps/gnu/netinet/tcp.h.
The new TCP_REPAIR_* constants seemed sufficiently related to those
already present to include them.
Note that this patch does not include additions to struct tcp_info;
there are many other elements in this structure in the Linux kernel
that are not included in the glibc version (which was last extended in
2007, it seems). Such additions to the end of the structure may be OK
with the expected way it is used (size passed explicitly to the kernel
with getsockopt), but in principle any change to the size of a type
provided by glibc is an ABI change for external applications /
libraries using that type in their ABIs, and has the associated risks
of such a change.
Tested for x86_64.
* sysdeps/gnu/netinet/tcp.h (TCP_ZEROCOPY_RECEIVE): New macro.
(TCP_INQ): Likewise.
(TCP_CM_INQ): Likewise.
(TCP_REPAIR_ON): Likewise.
(TCP_REPAIR_OFF): Likewise.
(TCP_REPAIR_OFF_NO_WP): Likewise.
(struct tcp_zerocopy_receive): New type.
This patch updates struct signalfd_siginfo in sys/signalfd.h with new
members from Linux 4.18 (plus ssi_addr_lsb, added to the kernel in
2.6.37 without being added to sys/signalfd.h at that time). The
__pad2 member name follows the kernel and the existing __pad name.
Tested for x86_64.
* sysdeps/unix/sysv/linux/sys/signalfd.h (struct
signalfd_siginfo): Add ssi_addr_lsb, ssi_syscall, ssi_call_addr
and ssi_arch members.
This patch adds two new constants from Linux 4.18 to elf.h,
NT_VMCOREDD and AT_MINSIGSTKSZ.
Tested for x86_64.
* elf/elf.c (NT_VMCOREDD): New macro.
(AT_MINSIGSTKSZ): Likewise.
New generic optimization of sinf and cosf introduced by commit
599cf39766 shows improvement
compared to powerpc specific assembly version. Hence removing
the powerpc assembly versions to make use of generic code.
The House of Force is a well-known technique to exploit heap
overflow. In essence, this exploit takes three steps:
1. Overwrite the size of top chunk with very large value (e.g. -1).
2. Request x bytes from top chunk. As the size of top chunk
is corrupted, x can be arbitrarily large and top chunk will
still be offset by x.
3. The next allocation from top chunk will thus be controllable.
If we verify the size of top chunk at step 2, we can stop such attack.
This patch moves little endian specific POWER9 optimization files to
sysdeps/powerpc/powerpc64/le and creates POWER9 ifunc functions
only for little endian.
This variant of strlen uses vector loads and operations to reduce the
size of the code and also eliminate the non-ascii fallback. This
works very well for falkor because of its two vector units and
efficient vector ops. In the best case it reduces latency of cases in
bench-strlen by 48%, with gains throughout the benchmark.
strlen-walk also sees uniform gains in the 5%-15% range.
Overall the routine appears to work better than the stock one for falkor
regardless of the benchmark, length of string or cache state.
The same cannot be said of a53 and a72 though. a53 performance was
greatly reduced and for a72 it was a bit of a mixed bag, slightly on the
negative side but I reckon it might be fast in some situations.
* sysdeps/aarch64/strlen.S (__strlen): Rename to STRLEN.
[!STRLEN](STRLEN): Set to __strlen.
* sysdeps/aarch64/multiarch/strlen.c: New file.
* sysdeps/aarch64/multiarch/strlen_generic.S: Likewise.
* sysdeps/aarch64/multiarch/strlen_asimd.S: Likewise.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add strlen.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add
strlen_generic and strlen_asimd.
Reviewed-By: szabolcs.nagy@arm.com
CC: pinskia@gmail.com
The internal functions __kernel_sinf and __kernel_cosf are used only by
lgammaf_r. Removing the internal functions and using the generic sinf
and cosf is better overall. Benchmarking on Cortex-A72 shows the generic
sinf and cosf are 1.4x and 2.3x faster in the range |x| < PI/4, and 0.66x
and 1.1x for |x| < PI/2, so it should make lgammaf_r faster on average.
GLIBC regression tests pass on AArch64.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Use __sinf/__cosf.
* sysdeps/ieee754/flt-32/k_cosf.c (__kernel_cosf): Remove all code.
* sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise.
Fix a few missing spaces, it's now identical to the regenerated version.
Passes GLIBC tests on x64.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerate to fix spaces.
The second patch improves performance of sinf and cosf using the same
algorithms and polynomials. The returned values are identical to sincosf
for the same input. ULP definitions for AArch64 and x64 are updated.
sinf/cosf througput gains on Cortex-A72:
* |x| < 0x1p-12 : 1.2x
* |x| < M_PI_4 : 1.8x
* |x| < 2 * M_PI: 1.7x
* |x| < 120.0 : 2.3x
* |x| < Inf : 3.0x
* NEWS: Mention sinf, cosf, sincosf.
* sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf.
* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf.
* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of
constants rather than including generic sincosf.h.
* sysdeps/x86_64/fpu/s_sincosf_data.c: Remove.
* sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite.
* sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove.
(reduced_cos): Remove.
(sinf_poly): New function.
* sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.
This patch updates sysdeps/unix/sysv/linux/syscall-names.list for
Linux 4.18. The io_pgetevents and rseq syscalls are added to the
kernel on various architectures, so need to be mentioned in this file.
Tested with build-many-glibcs.py.
* sysdeps/unix/sysv/linux/syscall-names.list: Update kernel
version to 4.18.
(io_pgetevents): New syscall.
(rseq): Likewise.
The install.texi documentation of uses of Perl and Python is
substantially out of date.
The description of Perl is "to test the installation" (which I
interpret as referring to test-installation.pl), but it's used for
more tests than that, and to build the manual, and to regenerate one
file in the source tree.
The description of Python is only for pretty-printer tests, but it's
used for other tests / benchmarks as well (and for other internal uses
such as updating Unicode data, for which we already require Python 3,
but I think install.texi only needs to describe uses from the main
glibc Makefiles).
This patch updates the descriptions of what those tools are used for.
The Python information (and information about other tools for testing
pretty printers) was awkwardly in the middle of the general
description of building and testing glibc, rather than with the rest
of information about tools used in glibc build and test; this patch
moves the information about those tools into the main list.
Tested with regeneration of INSTALL as well as "make info" and "make
pdf".
* manual/install.texi (Configuring and compiling): Do not list
tools used for testing pretty printers here.
(Tools for Compilation): List Python, PExpect and GDB here.
Update descriptions of uses of Perl and Python.
* INSTALL: Regenerate.
Add the workload test properties (max-throughput, latency, etc.) to
the schema to prevent benchmark output validation from failing.
* benchtests/scripts/benchout.schema.json (properties): Add
new properties.
Add the duration and iterations attributes to the workloads tests to
make the json schema parser happy
* benchtests/bench-skeleton.c (main): Add duration and
iterations attributes.
Adjust the non-glibc code to agree with what Gawk needs for
rational range interpretation (RRI) for regular expression ranges.
In unibyte locales, Gawk wants ranges to use the underlying byte
rather than the character code point. This change does not affect
glibc proper.
* posix/regcomp.c (parse_byte) [!LIBC && RE_ENABLE_I18N]:
In unibyte locales, use the byte value rather than
running it through btowc.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the SNAN_TESTS_* macros for individual types out to their own
sysdeps header (while the type-generic SNAN_TESTS wrapper for those
macros remains in math-tests.h).
Tested for x86_64 and x86, and with build-many-glibcs.py.
* sysdeps/generic/math-tests-snan.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-snan.h>.
(SNAN_TESTS_float): Do not define here.
(SNAN_TESTS_double): Likewise.
(SNAN_TESTS_long_double): Likewise.
(SNAN_TESTS_float128): Likewise.
* sysdeps/i386/fpu/math-tests-snan.h: New file.
* sysdeps/i386/fpu/math-tests.h: Remove file.
* sysdeps/ia64/math-tests-snan.h: New file.
* sysdeps/ia64/math-tests.h: Remove file.
* sysdeps/x86/math-tests.h: Likewise.
* sysdeps/x86_64/fpu/math-tests-snan.h: New file.
This patch is a complete rewrite of sincosf. The new version is
significantly faster, as well as simple and accurate.
The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over
all 4 billion inputs. In non-nearest rounding modes the error is 1ULP.
The algorithm uses 3 main cases: small inputs which don't need argument
reduction, small inputs which need a simple range reduction and large inputs
requiring complex range reduction. The code uses approximate integer
comparisons to quickly decide between these cases.
The small range reducer uses a single reduction step to handle values up to
120.0. It is fastest on targets which support inlined round instructions.
The large range reducer uses integer arithmetic for simplicity. It does a
32x96 bit multiply to compute a 64-bit modulo result. This is more than
accurate enough to handle the worst-case cancellation for values close to
an integer multiple of PI/4. It could be further optimized, however it is
already much faster than necessary.
sincosf throughput gains on Cortex-A72:
* |x| < 0x1p-12 : 1.6x
* |x| < M_PI_4 : 1.7x
* |x| < 2 * M_PI: 1.5x
* |x| < 120.0 : 1.8x
* |x| < Inf : 2.3x
* math/Makefile: Add s_sincosf_data.c.
* sysdeps/ia64/fpu/s_sincosf_data.c: New file.
* sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function.
(sincosf_poly): Likewise.
(reduce_small): Likewise.
(reduce_large): Likewise.
* sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite.
* sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data.
* sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file.
* sysdeps/x86_64/fpu/s_sincosf_data.c: New file.
This patch currently only affects aarch64.
The roundtoint and converttoint internal functions are only called with small
values, so 32 bit result is enough for converttoint and it is a signed int
conversion so the return type is changed to int32_t.
The original idea was to help the compiler keeping the result in uint64_t,
then it's clear that no sign extension is needed and there is no accidental
undefined or implementation defined signed int arithmetics.
But it turns out gcc does a good job with inlining so changing the type has
no overhead and the semantics of the conversion is less surprising this way.
Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top
bits were never usable and the existing code ensures that only the bottom
32 bits of the conversion result are used.
On aarch64 the neon intrinsics (which round ties to even) are changed to
round and lround (which round ties away from zero) this does not affect the
results in a significant way, but more portable (relies on round and lround
being inlined which works with -fno-math-errno).
The TOINT_SHIFT and TOINT_RINT macros were removed, only keep separate code
paths for TOINT_INTRINSICS and !TOINT_INTRINSICS.
* sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round.
(converttoint): Use lround.
* sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and
document the semantics when TOINT_INTRINSICS is set.
(converttoint): Likewise.
(TOINT_RINT): Remove.
(TOINT_SHIFT): Remove.
* sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT code
path.
Commit 298d0e3129 ("Consolidate Linux
getdents{64} implementation") broke the implementation because it does
not take into account struct offset differences.
The new implementation is close to the old one, before the
consolidation, but has been cleaned up slightly.
* Since __fentry__ is almost the same as _mcount, reuse the code by
#including it twice with different #defines around.
* Remove LA usages - they are needed in 31-bit mode to clear the top
bit, but in 64-bit they appear to do nothing.
* Add CFI rule for the nonstandard return register. This rule applies
to the current function (binutils generates a new CIE - see
gas/dw2gencfi.c:select_cie_for_fde()), so it is not necessary to put
__fentry__ into a new file.
* Fix CFI offset for %r14.
* Add CFI rule for %r0.
* Fix unwound value of %r15 being off by 244 bytes.
* Unwinding in __fentry__@plt does not work, no plan to fix it - it
would require asking linker to generate CFI for return address in
%r0. From functional perspective keeping it broken is fine, since
the callee did not have a chance to do anything yet. From
convenience perspective it would be possible to enhance GDB in the
future to treat __fentry__@plt in a special way.
* Fix whitespace.
* Fix offsets in comments, which were copied from 32-bit code.
* 32-bit version will not be implemented, since it's not compatible
with the corresponding PLT stubs: they assume %r12 points to GOT,
which is not the case for gcc-emitted __fentry__ stub, which runs
before the prolog.
This patch adds the runtime support in glibc for the -mfentry
gcc feature introduced in [1] and [2].
[1] https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00784.html
[2] https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00912.html
ChangeLog:
* sysdeps/s390/s390-64/Versions (__fentry__): Add.
* sysdeps/s390/s390-64/s390x-mcount.S: Move the common
code to s390x-mcount.h and #include it.
* sysdeps/s390/s390-64/s390x-mcount.h: New file.
* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
(__fentry__): Add.
__fentry__ symbol is currently not defined for other architectures.
Attempts to introduce it cause abicheck to fail, because it will be
available since 2.29 earliest, and not 2.13, which is the case for
Intel. With the new code, abicheck passes for i686-linux-gnu,
x86_64-linux-gnu and x86_64-linux-gnu32 triples.
ChangeLog:
* stdlib/Versions: Remove __fentry__.
* sysdeps/i386/Versions: Add __fentry__.
* sysdeps/x86_64/Versions: Add __fentry__.
The following combinations need to be tested:
* 32- (g5, esa and zarch) and 64-bit
* linux32 glibc/configure CC='gcc -m31 -march=g5'
* linux32 glibc/configure CC='gcc -m31'
* linux32 glibc/configure CC='gcc -m31 -mzarch'
* With and without VX:
* glibc/configure libc_cv_asm_s390_vx=no
* With and without profiling (using LD_PROFILE)
* With and without pltexit (using LD_AUDIT)
ChangeLog:
* sysdeps/s390/Makefile: Register the new tests.
* sysdeps/s390/tst-dl-runtime-mod.S: New file.
* sysdeps/s390/tst-dl-runtime-profile-audit.c: New file.
* sysdeps/s390/tst-dl-runtime-profile-noaudit.c: New file.
* sysdeps/s390/tst-dl-runtime-resolve-audit.c: New file.
* sysdeps/s390/tst-dl-runtime-resolve-noaudit.c: New file.
* sysdeps/s390/tst-dl-runtime.c: New file.
Following the recent discussion of using Python instead of Perl and
Awk for glibc build / test, this patch replaces gen-libm-test.pl with
a new gen-libm-test.py script. This script should work with all
Python versions supported by glibc (tested by hand with Python 2.7,
tested in the build system with Python 3.5; configure prefers Python 3
if available).
This script is designed to give identical output to gen-libm-test.pl
for ease of verification of the change, except for generated comments
referring to .py instead of .pl. (That is, identical for actual
inputs passed to the script, not necessarily for all possible input;
for example, this version more precisely follows the C standard syntax
for floating-point constants when deciding when to add LIT macro
calls.) In one place a comment notes that the generation of
NON_FINITE flags is replicating a bug in the Perl script to assist in
such comparisons (with the expectation that this bug can then be
separately fixed in the Python script later).
Tested for x86_64, including comparison of generated files (and hand
testing of the case of generating a sorted libm-test-ulps file, which
isn't covered by normal "make check").
I'd expect to follow this up by extending the new script to produce
the ulps tables for the manual as well (replacing
manual/libm-err-tab.pl, so that then we just have one ulps file
parser) - at which point the manual build would depend on both Perl
and Python (eliminating the Perl dependency would require someone to
rewrite summary.pl in Python, and that would only eliminate the
*direct* Perl dependency; current makeinfo is written in Perl so there
would still be an indirect dependency).
I think install.texi is more or less equally out-of-date regarding
Perl and Python uses before and after this patch, so I don't think
this patch depends on my patch
<https://sourceware.org/ml/libc-alpha/2018-08/msg00133.html> to update
install.texi regarding such uses (pending review).
* math/gen-libm-test.py: New file.
* math/gen-libm-test.pl: Remove.
* math/Makefile [$(PERL) != no]: Change condition to [PYTHON].
($(objpfx)libm-test-ulps.h): Use gen-libm-test.py instead of
gen-libm-test.pl.
($(libm-test-c-noauto-obj)): Likewise.
($(libm-test-c-auto-obj)): Likewise.
($(libm-test-c-narrow-obj)): Likewise.
(regen-ulps): Likewise.
* math/README.libm-test: Update references to gen-libm-test.pl.
* math/libm-test-driver.c (struct test_fj_f_data): Update comment
referencing gen-libm-test.pl.
* math/libm-test-nexttoward.inc (nexttoward_test_data): Likewise.
* math/libm-test-support.c: Likewise.
* math/libm-test-support.h: Likewise.
* sysdeps/generic/libm-test-ulps: Likewise.
MIN_PAGE_SIZE is normally set to 4096 but for testing it can be set to
16 so that it exercises the page crossing code for every misaligned
access. The value was set to 15, which is obviously wrong, so fixed
as obvious and tested.
* sysdeps/aarch64/strlen.S [TEST_PAGE_CROSS](MIN_PAGE_SIZE):
Fix value.
When libm tests were split into separate per-function .inc files, a
comment relating to the nexttoward tests ended up at the end of
libm-test-nextdown.inc (because the split was based on starting each
function's tests with the <function>_test_data definition, which
failed to allow for comments before such definitions). This patch
moves that comment to the correct location.
Tested for x86_64.
* math/libm-test-nextdown.inc (do_test): Move comment to ....
* math/libm-test-nexttoward.inc (nexttoward_test_data): ... here.
Drop realloc_bufs in favour of making alloc_bufs transparently
reallocate the buffers if it had allocated before. Also consolidate
computation of buffer lengths so that they don't get repeated on every
reallocation.
* benchtests/bench-string.h (buf1_size, buf2_size): New
variables.
(init_sizes): New function.
(test_init): Use it.
(alloc_buf, exit_error): New functions.
(alloc_bufs): Use ALLOC_BUF.
(realloc_bufs): Remove.
* benchtests/bench-memcmp.c (do_test): Adjust.
* benchtests/bench-memset-large.c (do_test): Likewise.
* benchtests/bench-memset-walk.c (do_test): Likewise.
* benchtests/bench-memset.c (do_test): Likewise.
* benchtests/bench-strncmp.c (do_test): Likewise.
Since RISC-V stores the thread pointer in a general register libthread_db
can just ask the debugger for the register contents instead of trying to
call ps_get_thread_area. This enables thread debugging in gdb.
* sysdeps/riscv/nptl/tls.h (DB_THREAD_SELF): Use REGISTER instead
of CONST_THREAD_AREA.
Move STATE_SAVE_OFFSET and STATE_SAVE_MASK to sysdep.h to make
sysdeps/x86/cpu-features.h a C header file.
* sysdeps/x86/cpu-features.h (STATE_SAVE_OFFSET): Removed.
(STATE_SAVE_MASK): Likewise.
Don't check __ASSEMBLER__ to include <cpu-features-offsets.h>.
* sysdeps/x86/sysdep.h (STATE_SAVE_OFFSET): New.
(STATE_SAVE_MASK): Likewise.
* sysdeps/x86_64/dl-trampoline.S: Include <cpu-features-offsets.h>
instead of <cpu-features.h>.
* sysdeps/riscv/rv64/rvd/libm-test-ulps: Update.
Note: I regen'd these from scratch, but I'm only committing the
increases, as I only tested on hardware. There were a few 2->1
decreases that I omitted, possibly "for now".
* sysdeps/riscv/rvf/math_private.h (libc_feholdexcept_setround_riscv):
Fix rounding save-restore bug.
Fixes about a hundred off-by-ULP failures in the math testsuite.
Some TEST_* lines in libm-test-*.inc end with semicolons not commas.
This works at present because gen-libm-test.pl ignores whatever comes
after the TEST_* call, but is logically wrong (since the TEST_* calls
generate array elements, not statements) and the Python replacement
for gen-libm-test.pl that I'm working on is stricter about the syntax
here. This patch fixes the lines in question to use commas like most
such lines already do.
Tested for x86_64.
* math/libm-test-ilogb.inc (ilogb_test_data): Use ',' not ';'
after TEST_* calls.
* math/libm-test-llogb.inc (llogb_test_data): Likewise.
* math/libm-test-logb.inc (logb_test_data): Likewise.
Looking at the benchtests, both strstr and strcasestr spend a lot of time
in a slow initialization loop handling one character per iteration.
This can be simplified and use the much faster strlen/strnlen/strchr/memcmp.
Read ahead a few cachelines to reduce the number of strnlen calls, which
improves performance by ~3-4%. This patch improves the time taken for the
full strstr benchtest by >40%.
* string/strcasestr.c (STRCASESTR): Simplify and speedup first match.
* string/strstr.c (AVAILABLE): Likewise.
There is no need to include <init-arch.h> in assembly codes since all
x86 IFUNC selector functions are written in C. Tested on i686 and
x86-64. There is no code change in libc.so, ld.so and libmvec.so.
* sysdeps/i386/i686/multiarch/bzero-ia32.S: Don't include
<init-arch.h>.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core-avx2.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core-avx2.S: Likewise.
* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Likewise.
The convenience install target 'install-locale-files' is created
to allow distributions to install all of the SUPPORTED locales as
files instead of into the locale-archive.
You invoke the new convenience target like this:
make localedata/install-locale-files DESTDIR=<prefix>
Python 2 does not have a FileNotFoundError so drop it in favour of
simply printing out the last (and most informative) line of the
exception.
* benchtests/scripts/compare_strings.py: Import traceback.
(parse_file): Pretty-print error.
MIPS soft-float glibc does not support floating-point exceptions and
rounding modes, and uses a different ABI from hard-float so a
soft-float compilation cannot use a glibc that does support
floating-point exceptions and rounding modes. Thus, bits/fenv.h
should not, when compiling for soft-float, define macros for the
unsupported features.
This patch changes it accordingly to define those macros only for
hard-float. None of the exception macros are defined for soft-float,
with FE_ALL_EXCEPT defined to 0 in that case, and only FE_TONEAREST is
defined of the rounding-mode macros, and FE_NOMASK_ENV is not defined;
this is consistent with how architectures lacking exception and
rounding mode support generally define things in this header. As well
as making the header more correct for this case, this also means the
generic math_private.h optimizations for this case automatically apply
(inlining libm-internal fenv.h function calls that are trivial when
exceptions and rounding modes are not supported).
The mips64 sfp-machine.h then needs similar changes to disable more of
the exception and rounding mode handling for soft-float. (The mips32
sfp-machine.h is already used only for soft-float, has no integration
with hardware exceptions or rounding modes and so needs no changes.)
Existing binaries might use the old FE_NOMASK_ENV value as an argument
to fesetenv / feupdateenv and expect an error for it (given that it
was defined in a header that also defined FE_ALL_EXCEPT to a nonzero
value). To preserve that error, wrappers for the fallback fesetenv
and feupdateenv are created in sysdeps/mips/nofpu/.
Tested for mips64 (hard-float and soft-float, all three ABIs).
[BZ #23479]
* sysdeps/mips/bits/fenv.h (FE_INEXACT): Define only if
[__mips_hard_float].
(FE_UNDERFLOW): Likewise.
(FE_OVERFLOW): Likewise.
(FE_DIVBYZERO): Likewise.
(FE_INVALID): Likewise.
(FE_ALL_EXCEPT): Define to 0 if [!__mips_hard_float].
(FE_TOWARDZERO): Define only if [__mips_hard_float].
(FE_UPWARD): Likewise.
(FE_DOWNWARD): Likewise.
(__FE_UNDEFINED): Define if [!__mips_hard_float]
(FE_NOMASK_ENV): Define only if [__mips_hard_float].
* sysdeps/mips/mips64/sfp-machine.h (_FP_DECL_EX): Define only if
[__mips_hard_float].
(FP_ROUNDMODE): Likewise.
(FP_RND_NEAREST): Likewise.
(FP_RND_ZERO): Likewise.
(FP_RND_PINF): Likewise.
(FP_RND_MINF): Likewise.
(FP_EX_INVALID): Likewise.
(FP_EX_OVERFLOW): Likewise.
(FP_EX_UNDERFLOW): Likewise.
(FP_EX_DIVZERO): Likewise.
(FP_EX_INEXACT): Likewise.
(FP_INIT_ROUNDMODE): Likewise.
* sysdeps/mips/nofpu/fesetenv.c: New file.
* sysdeps/mips/nofpu/feupdateenv.c: Likewise.
math/test-misc.c contains some code that uses fenv.h macros
FE_UNDERFLOW, FE_OVERFLOW and FE_UPWARD without being conditional on
those macros being defined.
That would normally break the build for configurations (typically
soft-float) not defining those macros. However, the code in question
is inside LDBL_MANT_DIG > DBL_MANT_DIG conditionals. And, while we
have configurations lacking rounding mode and exception support where
LDBL_MANT_DIG > DBL_MANT_DIG (soft-float MIPS64 and RISC-V), those
configurations currently define the fenv.h macros in question even for
soft-float.
There may be some case for defining those macros in cases where a
soft-float compilation could use a hard-float libm (where both
soft-float and hard-float can use the same ABI, as on ARM and RISC-V,
for example). But MIPS is not such a case - the hard-float and
soft-float ABIs are incompatible - and thus I am testing a patch to
stop defining those macros for soft-float MIPS (motivated by reducing
the extent to which architectures need their own definitions of
math-tests.h macros - if lack of rounding mode / exception support can
be determined by the lack of macros in fenv.h, that avoids the need
for math-tests.h to declare that lack as well). Introducing a case of
LDBL_MANT_DIG > DBL_MANT_DIG without these macros defined shows up the
problem with math/test-misc.c. This patch then fixes that problem by
adding appropriate conditionals.
Tested for MIPS64 in conjunction with changes to stop defining the
macros in question in bits/fenv.h for soft-float.
* math/test-misc.c (do_test) [LDBL_MANT_DIG > DBL_MANT_DIG]: Make
code using FE_UNDERFLOW conditional on [FE_UNDERFLOW], code using
FE_OVERFLOW conditional on [FE_OVERFLOW] and code using FE_UPWARD
conditional on [FE_UPWARD].
Problem and fix reported by Assaf Gordon in:
https://lists.gnu.org/r/bug-gnulib/2018-07/txtqLKNwBdefE.txt
* posix/regcomp.c (free_charset) [!_LIBC]: Free range_starts and
range_ends members too, as they are defined in 'struct
re_charset_t' even if not _LIBC. This affects only Gnulib.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves SNAN_TESTS_PRESERVE_PAYLOAD out to its own sysdeps header.
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-snan-payload.h: New file.
* sysdeps/hppa/math-tests-snan-payload.h: Likewise.
* sysdeps/mips/math-tests-snan-payload.h: Likewise.
* sysdeps/riscv/math-tests-snan-payload.h: Likewise.
* sysdeps/generic/math-tests.h: Include
<math-tests-snan-payload.h>.
(SNAN_TESTS_PRESERVE_PAYLOAD): Do not define macro here.
* sysdeps/hppa/math-tests.h: Remove file.
* sysdeps/mips/math-tests.h [!__mips_nan2008]
(SNAN_TESTS_PRESERVE_PAYLOAD): Do not define macro here.
* sysdeps/riscv/math-tests.h (SNAN_TESTS_PRESERVE_PAYLOAD):
Likewise.
The math-tests.h header has many different macros and groups of
macros, defined using #ifndef in the generic version which is included
by architecture versions with #include_next after possibly defining
non-default versions of some of those macros.
This use of #ifndef is contrary to our normal typo-proof conventions
for macro definitions. This patch moves one of the macros,
SNAN_TESTS_TYPE_CAST, out to its own sysdeps header, to follow those
typo-proof conventions more closely.
Tested with build-many-glibcs.py.
2018-08-01 Joseph Myers <joseph@codesourcery.com>
* sysdeps/generic/math-tests-snan-cast.h: New file.
* sysdeps/powerpc/math-tests-snan-cast.h: Likewise.
* sysdeps/generic/math-tests.h: Include <math-tests-snan-cast.h>.
(SNAN_TESTS_TYPE_CAST): Do not define macro here.
* sysdeps/powerpc/math-tests.h (SNAN_TESTS_TYPE_CAST): Likewise.
Exec needs that mach_setup_thread does *not* set up TLS since it works on
another task, so we have to split this into mach_setup_tls.
* mach/mach.h (__mach_setup_tls, mach_setup_tls): Add prototypes.
* mach/setup-thread.c (__mach_setup_thread): Move TLS setup to...
(__mach_setup_tls): ... new function.
(mach_setup_tls): New alias.
* hurd/hurdsig.c (_hurdsig_init): Call __mach_setup_tls after
__mach_setup_thread.
* sysdeps/mach/hurd/profil.c (update_waiter): Likewise.
* sysdeps/mach/hurd/setitimer.c (setitimer_locked): Likewise.
* mach/Versions [libc] (mach_setup_tls): Add symbol.
* sysdeps/mach/hurd/i386/libc.abilist (mach_setup_tls): Likewise.
GNU_PROPERTY_X86_FEATURE_1_AND may not be the first property item. We
need to check each property item until we reach the end of the property
or find GNU_PROPERTY_X86_FEATURE_1_AND.
This patch adds 2 tests. The first test checks if IBT is enabled and
the second test reads the output from the first test to check if IBT
is is enabled. The second second test fails if IBT isn't enabled
properly.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #23467]
* sysdeps/unix/sysv/linux/x86/Makefile (tests): Add
tst-cet-property-1 and tst-cet-property-2 if CET is enabled.
(CFLAGS-tst-cet-property-1.o): New.
(ASFLAGS-tst-cet-property-dep-2.o): Likewise.
($(objpfx)tst-cet-property-2): Likewise.
($(objpfx)tst-cet-property-2.out): Likewise.
* sysdeps/unix/sysv/linux/x86/tst-cet-property-1.c: New file.
* sysdeps/unix/sysv/linux/x86/tst-cet-property-2.c: Likewise.
* sysdeps/unix/sysv/linux/x86/tst-cet-property-dep-2.S: Likewise.
* sysdeps/x86/dl-prop.h (_dl_process_cet_property_note): Parse
each property item until GNU_PROPERTY_X86_FEATURE_1_AND is found.
All tests should be added to $(tests).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #23458]
* sysdeps/x86/Makefile (tests): Add tst-get-cpu-features-static.
ld.so symbols to be overriden by libc need to be extern to really get
overriden.
* sysdeps/mach/hurd/not-errno.h: New file.
* sysdeps/mach/hurd/i386/localplt.data: Update accordingly.
ld.so symbols to be overriden by libc need to be extern to really get
overriden.
* sysdeps/mach/hurd/dl-unistd.h (__access, __brk, __lseek, __read,
__sbrk): Do not set attribute_hidden.
* sysdeps/mach/hurd/i386/ld.abilist: Update accordingly.
* sysdeps/mach/hurd/i386/localplt.data: Update accordingly.
Simply check if "ptr < ptr_end" since "ptr" is always incremented by 8.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/x86/dl-prop.h (_dl_process_cet_property_note): Don't
parse beyond the note end.
This patch make the OFD tests return unsupported if kernel does not
support OFD locks (it was added on 3.15).
Checked on a ia64-linux-gnu with Linux 3.14.
* sysdeps/unix/sysv/linux/tst-ofdlocks.c: Return unsupported if
kernel does not support OFD locks.
* sysdeps/unix/sysv/linux/tst-ofdlocks-compat.c: Likewise.
ld.so symbols to be overriden by libc need to be extern to really get
overriden.
More fixes are needed to avoid the hidden attribute.
* sysdeps/mach/hurd/Versions (libc): Make __access and
__access_noerrno external so they can override the ld symbols.
(ld): Make __access, __read, __sbrk, __strtoul_internal, __write,
__writev, __open64, __access_noerrno extern so they can be overrided.
* sysdeps/mach/hurd/i386/libc.abilist: Update accordingly.
* sysdeps/mach/hurd/i386/ld.abilist: Update accordingly.
cpu-features.h has
#define bit_cpu_LZCNT (1 << 5)
#define index_cpu_LZCNT COMMON_CPUID_INDEX_1
#define reg_LZCNT
But the LZCNT feature bit is in COMMON_CPUID_INDEX_80000001:
Initial EAX Value: 80000001H
ECX Extended Processor Signature and Feature Bits:
Bit 05: LZCNT available
index_cpu_LZCNT should be COMMON_CPUID_INDEX_80000001, not
COMMON_CPUID_INDEX_1. The VMX feature bit is in COMMON_CPUID_INDEX_1:
Initial EAX Value: 01H
Feature Information Returned in the ECX Register:
5 VMX
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ # 23456]
* sysdeps/x86/cpu-features.h (index_cpu_LZCNT): Set to
COMMON_CPUID_INDEX_80000001.
On s390x, the test string/tst-xbzero-opt is failing if build with gcc head:
FAIL: no clear/prepare: expected 32 got 0
FAIL: no clear/test: expected some got 0
FAIL: ordinary clear/prepare: expected 32 got 0
INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
PASS: explicit clear/prepare: expected 32 got 32
PASS: explicit clear/test: expected 0 got 0
In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop
in prepare_test_buffer. Thus count_test_patterns does not find any of the
test_pattern.
This patch calls use_test_buffer in order to force the compiler to really copy
the pattern to buf.
ChangeLog:
* string/tst-xbzero-opt.c (use_test_buffer): New function.
(prepare_test_buffer): Call use_test_buffer as compiler barrier.
In commit 9479b6d5e0 we updated all of
the collation data to harmonize with the new version of ISO 14651
which is derived from Unicode 9.0.0. This collation update brought
with it some changes to locales which were not desirable by some
users, in particular it altered the meaning of the
locale-dependent-range regular expression, namely [a-z] and [A-Z], and
for en_US it caused uppercase letters to be matched by [a-z] for the
first time. The matching of uppercase letters by [a-z] is something
which is already known to users of other locales which have this
property, but this change could cause significant problems to en_US
and other similar locales that had never had this change before.
Whether this behaviour is desirable or not is contentious and GNU Awk
has this to say on the topic:
https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
While the POSIX standard also has this further to say: "RE Bracket
Expression":
http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
"The current standard leaves unspecified the behavior of a range
expression outside the POSIX locale. ... As noted above, efforts were
made to resolve the differences, but no solution has been found that
would be specific enough to allow for portable software while not
invalidating existing implementations."
In glibc we implement the requirement of ISO POSIX-2:1993 and use
collation element order (CEO) to construct the range expression, the
API internally is __collseq_table_lookup(). The fact that we use CEO
and also have 4-level weights on each collation rule means that we can
in practice reorder the collation rules in iso14651_t1_common (the new
data) to provide consistent range expression resolution *and* the
weights should maintain the expected total order. Therefore this
patch does three things:
* Reorder the collation rules for the LATIN script in
iso14651_t1_common to deinterlace uppercase and lowercase letters in
the collation element orders.
* Adds new test data en_US.UTF-8.in for sort-test.sh which exercises
strcoll* and strxfrm* and ensures the ISO 14651 collation remains.
* Add back tests to tst-fnmatch.input and tst-regexloc.c which
exercise that [a-z] does not match A or Z.
The reordering of the ISO 14651 data is done in an entirely mechanical
fashion using the following program attached to the bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c28
It is up for discussion if the iso14651_t1_common data should be
refined further to have 3 very tight collation element ranges that
include only a-z, A-Z, and 0-9, which would implement the solution
sought after in:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c12
and implemented here:
https://www.sourceware.org/ml/libc-alpha/2018-07/msg00854.html
No regressions on x86_64.
Verified that removal of the iso14651_t1_common change causes tst-fnmatch
to regress with:
422: fnmatch ("[a-z]", "A", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
...
425: fnmatch ("[A-Z]", "z", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
Set installed NPTL header as the expected one (instead of an
internal one for glibc testsuite) and add a hurd specific
stdc-predef with __STDC_NO_THREADS__.
Checked on both i686-linux-gnu and i686-gnu that both threads.h
and stdc-predef.h are the expected ones.
* nptl/threads.h: Move to ...
* sysdeps/nptl/threads.h: ... here.
* sysdeps/hurd/stdc-predef.h: New file.
Verify that setcontext works with gaps above and below the newly
allocated shadow stack.
* sysdeps/unix/sysv/linux/x86/Makefile (tests): Add
tst-cet-setcontext-1 if CET is enabled.
(CFLAGS-tst-cet-setcontext-1.c): Add -mshstk.
* sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c: New file.
Remove conformace assumption of NPTL implementation for ISO C threads
and revert wrong libcrypt addition on linknamespace-libs-XPG4.
The i686-gnu target now shows two new conformance failures:
FAIL: conform/ISO11/threads.h/conform
FAIL: conform/ISO11/threads.h/linknamespace
It is expected due missing HTL ISO C threads support and both conformance
.out files indicates the reason ("#error "HTL does not implement ISO C
threads").
Checked on i686-linux-gnu and i686-gnu.
* include/threads.h: Move to ...
* sysdeps/nptl/threads.h: ... here.
* sysdeps/htl/threads.h: New file.
* conform/Makefile (linknamespace-libs-ISO11): Use
static-thread-library instead of linking libpthread.
(linknamespace-libs-XPG4): Revert wrong libcrypt.a addition.
This patch adds a field to ucontext_t to save shadow stack:
1. getcontext and swapcontext are updated to save the caller's shadow
stack pointer and return addresses.
2. setcontext and swapcontext are updated to restore shadow stack and
jump to new context directly.
3. makecontext is updated to allocate a new shadow stack and set the
caller's return address to __start_context.
Since makecontext allocates a new shadow stack when making a new
context and kernel allocates a new shadow stack for clone/fork/vfork
syscalls, we track the current shadow stack base. In setcontext and
swapcontext, if the target shadow stack base is the same as the current
shadow stack base, we unwind the shadow stack. Otherwise it is a stack
switch and we look for a restore token.
We enable shadow stack at run-time only if program and all used shared
objects, including dlopened ones, are shadow stack enabled, which means
that they must be compiled with GCC 8 or above and glibc 2.28 or above.
We need to save and restore shadow stack only if shadow stack is enabled.
When caller of getcontext, setcontext, swapcontext and makecontext is
compiled with smaller ucontext_t, shadow stack won't be enabled at
run-time. We check if shadow stack is enabled before accessing the
extended field in ucontext_t.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/unix/sysv/linux/x86/sys/ucontext.h (ucontext_t): Add
__ssp.
* sysdeps/unix/sysv/linux/x86_64/__start_context.S: Include
<asm/prctl.h> and "ucontext_i.h" when shadow stack is enabled.
(__push___start_context): New.
* sysdeps/unix/sysv/linux/x86_64/getcontext.S: Include
<asm/prctl.h>.
(__getcontext): Record the current shadow stack base. Save the
caller's shadow stack pointer and base.
* sysdeps/unix/sysv/linux/x86_64/makecontext.c: Include
<pthread.h>, <libc-pointer-arith.h> and <sys/prctl.h>.
(__push___start_context): New prototype.
(__makecontext): Call __push___start_context to allocate a new
shadow stack, push __start_context onto the new stack as well
as the new shadow stack.
* sysdeps/unix/sysv/linux/x86_64/setcontext.S: Include
<asm/prctl.h>.
(__setcontext): Restore the target shadow stack.
* sysdeps/unix/sysv/linux/x86_64/swapcontext.S: Include
<asm/prctl.h>.
(__swapcontext): Record the current shadow stack base. Save
the caller's shadow stack pointer and base. Restore the target
shadow stack.
* sysdeps/unix/sysv/linux/x86_64/sysdep.h
(STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT): New.
* sysdeps/unix/sysv/linux/x86_64/ucontext_i.sym (oSSP): New.
This will be used to record the current shadow stack base for shadow
stack switching by getcontext, makecontext, setcontext and swapcontext.
If the target shadow stack base is the same as the current shadow stack
base, we unwind the shadow stack. Otherwise it is a stack switch and
we look for a restore token to restore the target shadow stack.
* sysdeps/i386/nptl/tcb-offsets.sym (SSP_BASE_OFFSET): New.
* sysdeps/i386/nptl/tls.h (tcbhead_t): Replace __glibc_reserved2
with ssp_base.
* sysdeps/x86_64/nptl/tcb-offsets.sym (SSP_BASE_OFFSET): New.
* sysdeps/x86_64/nptl/tls.h (tcbhead_t): Replace __glibc_reserved2
with ssp_base.
CET arch_prctl bits should be defined in <asm/prctl.h> from Linux kernel
header files. Add x86 <include/asm/prctl.h> for pre-CET kernel header
files.
Note: sysdeps/unix/sysv/linux/x86/include/asm/prctl.h should be removed
if <asm/prctl.h> from the required kernel header files contains CET
arch_prctl bits.
/* CET features:
IBT: GNU_PROPERTY_X86_FEATURE_1_IBT
SHSTK: GNU_PROPERTY_X86_FEATURE_1_SHSTK
*/
/* Return CET features in unsigned long long *addr:
features: addr[0].
shadow stack base address: addr[1].
shadow stack size: addr[2].
*/
# define ARCH_CET_STATUS 0x3001
/* Disable CET features in unsigned int features. */
# define ARCH_CET_DISABLE 0x3002
/* Lock all CET features. */
# define ARCH_CET_LOCK 0x3003
/* Allocate a new shadow stack with unsigned long long *addr:
IN: requested shadow stack size: *addr.
OUT: allocated shadow stack address: *addr.
*/
# define ARCH_CET_ALLOC_SHSTK 0x3004
/* Return legacy region bitmap info in unsigned long long *addr:
address: addr[0].
size: addr[1].
*/
# define ARCH_CET_LEGACY_BITMAP 0x3005
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/unix/sysv/linux/x86/include/asm/prctl.h: New file.
* sysdeps/unix/sysv/linux/x86/cpu-features.c: Include
<sys/prctl.h> and <asm/prctl.h>.
(get_cet_status): Call arch_prctl with ARCH_CET_STATUS.
* sysdeps/unix/sysv/linux/x86/dl-cet.h: Include <sys/prctl.h>
and <asm/prctl.h>.
(dl_cet_allocate_legacy_bitmap): Call arch_prctl with
ARCH_CET_LEGACY_BITMAP.
(dl_cet_disable_cet): Call arch_prctl with ARCH_CET_DISABLE.
(dl_cet_lock_cet): Call arch_prctl with ARCH_CET_LOCK.
* sysdeps/x86/libc-start.c: Include <startup.h>.
This patch updates the manual and adds a new chapter to the manual,
explaining types macros, constants and functions defined by ISO C11
threads.h standard.
[BZ# 14092]
* manual/debug.texi: Update adjacent chapter name.
* manual/probes.texi: Likewise.
* manual/threads.texi (ISO C Threads): New section.
(POSIX Threads): Convert to a section.
This patch adds to testsuite new test cases to test all new introduced
C11 threads functions, types and macros are tested.
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
Adhemerval Zanella <adhemerval.zanella@linaro.org>
Juan Manuel Torres Palma <jmtorrespalma@gmail.com>
[BZ #14092]
* nptl/Makefile (tests): Add new test files.
* nptl/tst-call-once.c : New file. Tests C11 functions and types.
* nptl/tst-cnd-basic.c: Likewise.
* nptl/tst-cnd-broadcast.c: Likewise.
* nptl/tst-cnd-timedwait.c: Likewise.
* nptl/tst-mtx-basic.c: Likewise.
* nptl/tst-mtx-recursive.c: Likewise.
* nptl/tst-mtx-timedlock.c: Likewise.
* nptl/tst-mtx-trylock.c: Likewise.
* nptl/tst-thrd-basic.c: Likewise.
* nptl/tst-thrd-detach.c: Likewise.
* nptl/tst-thrd-sleep.c: Likewise.
* nptl/tst-tss-basic.c: Likewise.
This patch adds the tss_* definitions from C11 threads (ISO/IEC 9899:2011),
more specifically tss_create, tss_delete, tss_get, tss_set, and required
types.
Mostly of the definitions are composed based on POSIX conterparts, including
tss_t (pthread_key_t).
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
[BZ #14092]
* conform/data/threads.h-data (thread_local): New macro.
(TSS_DTOR_ITERATIONS): Likewise.
(tss_t): New type.
(tss_dtor_t): Likewise.
(tss_create): New function.
(tss_get): Likewise.
(tss_set): Likewise.
(tss_delete): Likewise.
* nptl/Makefile (libpthread-routines): Add tss_create, tss_delete,
tss_get, and tss_set objects.
* nptl/Versions (libpthread) [GLIBC_2.28]: Likewise.
* nptl/tss_create.c: New file.
* nptl/tss_delete.c: Likewise.
* nptl/tss_get.c: Likewise.
* nptl/tss_set.c: Likewise.
* sysdeps/nptl/threads.h (thread_local): New define.
(TSS_DTOR_ITERATIONS): Likewise.
(tss_t): New typedef.
(tss_dtor_t): Likewise.
(tss_create): New prototype.
(tss_get): Likewise.
(tss_set): Likewise.
(tss_delete): Likewise.
This patch adds the cnd_* definitions from C11 threads (ISO/IEC 9899:2011),
more specifically cnd_broadcast, cnd_destroy, cnd_init, cnd_signal,
cnd_timedwait, cnd_wait, and required types.
Mostly of the definitions are composed based on POSIX conterparts, and
cnd_t is also based on internal pthreads fields, but with distinct internal
layout to avoid possible issues with code interchange (such as trying to pass
POSIX structure on C11 functions and to avoid inclusion of pthread.h). The
idea is to make it possible to share POSIX internal implementation for mostly
of the code making adjust where only required.
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
[BZ #14092]
* conform/data/threads.h-data (cnd_t): New type.
(cnd_init): New function.
(cnd_signal): Likewise.
(cnd_broadcast): Likewise.
(cnd_wait): Likewise.
(cnd_timedwait): Likewise.
(cnd_destroy): Likewise.
* nptl/Makefile (libpthread-routines): Add cnd_broadcast,
cnd_destroy, cnd_init, cnd_signal, cnd_timedwait, and cnd_wait
object.
* nptl/Versions (libpthread) [GLIBC_2.28]: Likewise.
* nptl/cnd_broadcast.c: New file.
* nptl/cnd_destroy.c: Likewise.
* nptl/cnd_init.c: Likewise.
* nptl/cnd_signal.c: Likewise.
* nptl/cnd_timedwait.c: Likewise.
* nptl/cnd_wait.c: Likewise.
* sysdeps/nptl/threads.h (cnd_t): New type.
(cnd_init): New prototype.
(cnd_signa): Likewise.
(cnd_broadcast): Likewise.
(cnd_wait): Likewise.
(cnd_timedwait): Likewise.
(cnd_destroy): Likewise.
This patch adds the call_* definitions from C11 threads (ISO/IEC 9899:2011),
more specifically call_once and required types.
Mostly of the definitions are composed based on POSIX conterparts,including
once_flag (pthread_once_t). The idea is to make possible to share POSIX
internal implementations for mostly of the code (and making adjustment only
when required).
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
[BZ #14092]
* conform/data/threads.h-data (ONCE_FLAG_INIT): New macro.
(once_flag): New type.
(call_once): New function.
* nptl/Makefile (libpthread-routines): Add call_once object.
* nptl/Versions (libphread) [GLIBC_2.28]: Add call_once symbol.
* nptl/call_once.c: New file.
* sysdeps/nptl/threads.h (ONCE_FLAG_INIT): New define.
(once_flag): New type.
(call_once): New prototype.
This patch adds the mtx_* definitions from C11 threads (ISO/IEC 9899:2011),
more specifically mtx_init, mtx_destroy, mtx_lock, mtx_timedlock, mtx_trylock,
mtx_unlock, and required types.
Mostly of the definitions are composed based on POSIX conterparts, and mtx_t
is also based on internal pthread fields, but with a distinct internal layout
to avoid possible issues with code interchange (such as trying to pass POSIX
structure on C11 functions and to avoid inclusion of pthread.h). The idea
is to make possible to share POSIX internal implementations for mostly of
the code (and making adjustment only when required).
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
[BZ #14092]
* conform/data/threads.h-data (mtx_plain): New constant.
(mtx_recursive): Likewise.
(mtx_timed): Likewise.
(mtx_t): New type.
(mtx_init): New function.
(mtx_lock): Likewise.
(mtx_timedlock): Likewise.
(mtx_trylock): Likewise.
(mtx_unlock): Likewise.
(mtx_destroy): Likewise.
* nptl/Makefile (libpthread-routines): Add mtx_destroy, mtx_init,
mtx_lock, mtx_timedlock, mtx_trylock, and mtx_unlock object.
* nptl/Versions (libpthread) [GLIBC_2.28]): Add mtx_init, mtx_lock,
mtx_timedlock, mtx_trylock, mtx_unlock, and mtx_destroy.
* nptl/mtx_destroy.c: New file.
* nptl/mtx_init.c: Likewise.
* nptl/mtx_lock.c: Likewise.
* nptl/mtx_timedlock.c: Likewise.
* nptl/mtx_trylock.c: Likewise.
* nptl/mtx_unlock.c: Likewise.
* sysdeps/nptl/threads.h (mtx_plain): New enumeration.
(mtx_recursive): Likewise.
(mtx_timed): Likewise.
(mtx_t): New type.
(mtx_init): New prototype.
(mtx_lock): Likewise.
(mtx_timedlock): Likewise.
(mtx_trylock): Likewise.
(mtx_unlock): Likewise.
(mtx_destroy): Likewise.
This patch adds the thrd_* definitions from C11 threads (ISO/IEC 9899:2011),
more specifically thrd_create, thrd_curent, rhd_detach, thrd_equal,
thrd_exit, thrd_join, thrd_sleep, thrd_yield, and required types.
Mostly of the definitions are composed based on POSIX conterparts, such as
thrd_t (using pthread_t). For thrd_* function internally direct
POSIX pthread call are used with the exceptions:
1. thrd_start uses pthread_create internal implementation, but changes
how to actually calls the start routine. This is due the difference
in signature between POSIX and C11, where former return a 'void *'
and latter 'int'.
To avoid calling convention issues due 'void *' to int cast, routines
from C11 threads are started slight different than default pthread one.
Explicit cast to expected return are used internally on pthread_create
and the result is stored back to void also with an explicit cast.
2. thrd_sleep uses nanosleep internal direct syscall to avoid clobbering
errno and to handle expected standard return codes. It is a
cancellation entrypoint to be consistent with both thrd_join and
cnd_{timed}wait.
3. thrd_yield also uses internal direct syscall to avoid errno clobbering.
Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu,
arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu,
powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu,
and x86_64-linux-gnu).
Also ran a full check on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
arm-linux-gnueabhf, and powerpc64le-linux-gnu.
[BZ #14092]
* conform/Makefile (conformtest-headers-ISO11): Add threads.h.
(linknamespace-libs-ISO11): Add libpthread.a.
* conform/data/threads.h-data: New file: add C11 thrd_* types and
functions.
* include/stdc-predef.h (__STDC_NO_THREADS__): Remove definition.
* nptl/Makefile (headers): Add threads.h.
(libpthread-routines): Add new C11 thread thrd_create, thrd_current,
thrd_detach, thrd_equal, thrd_exit, thrd_join, thrd_sleep, and
thrd_yield.
* nptl/Versions (libpthread) [GLIBC_2.28]): Add new C11 thread
thrd_create, thrd_current, thrd_detach, thrd_equal, thrd_exit,
thrd_join, thrd_sleep, and thrd_yield symbols.
* nptl/descr.h (struct pthread): Add c11 field.
* nptl/pthreadP.h (ATTR_C11_THREAD): New define.
* nptl/pthread_create.c (START_THREAD_DEFN): Call C11 thread start
routine with expected function prototype.
(__pthread_create_2_1): Add C11 threads check based on attribute
value.
* sysdeps/unix/sysdep.h (INTERNAL_SYSCALL_CANCEL): New macro.
* nptl/thrd_create.c: New file.
* nptl/thrd_current.c: Likewise.
* nptl/thrd_detach.c: Likewise.
* nptl/thrd_equal.c: Likewise.
* nptl/thrd_exit.c: Likewise.
* nptl/thrd_join.c: Likewise.
* nptl/thrd_priv.h: Likewise.
* nptl/thrd_sleep.c: Likewise.
* nptl/thrd_yield.c: Likewise.
* include/threads.h: Likewise.
Add <bits/indirect-return.h> and include it in <ucontext.h>.
__INDIRECT_RETURN defined in <bits/indirect-return.h> indicates if
swapcontext requires special compiler treatment. The default
__INDIRECT_RETURN is empty.
On x86, when shadow stack is enabled, __INDIRECT_RETURN is defined
with indirect_return attribute, which has been added to GCC 9, to
indicate that swapcontext returns via indirect branch. Otherwise
__INDIRECT_RETURN is defined with returns_twice attribute.
When shadow stack is enabled, remove always_inline attribute from
prepare_test_buffer in string/tst-xbzero-opt.c to avoid:
tst-xbzero-opt.c: In function ‘prepare_test_buffer’:
tst-xbzero-opt.c:105:1: error: function ‘prepare_test_buffer’ can never be inlined because it uses setjmp
prepare_test_buffer (unsigned char *buf)
when indirect_return attribute isn't available.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* bits/indirect-return.h: New file.
* misc/sys/cdefs.h (__glibc_has_attribute): New.
* sysdeps/x86/bits/indirect-return.h: Likewise.
* stdlib/Makefile (headers): Add bits/indirect-return.h.
* stdlib/ucontext.h: Include <bits/indirect-return.h>.
(swapcontext): Add __INDIRECT_RETURN.
* string/tst-xbzero-opt.c (ALWAYS_INLINE): New.
(prepare_test_buffer): Use it.
The shadow stack prevents us from pushing the saved return PC onto
the stack and returning normally. Instead we pop the shadow stack
and return directly. This is the safest way to return and ensures
any stack manipulations done by the vfork'd child doesn't cause the
parent to terminate when CET is enabled.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/unix/sysv/linux/i386/vfork.S (SYSCALL_ERROR_HANDLER):
Redefine if shadow stack is enabled.
(SYSCALL_ERROR_LABEL): Likewise.
(__vfork): Pop shadow stack and jump back to to caller directly
when shadow stack is in use.
* sysdeps/unix/sysv/linux/x86_64/vfork.S (SYSCALL_ERROR_HANDLER):
Redefine if shadow stack is enabled.
(SYSCALL_ERROR_LABEL): Likewise.
(__vfork): Pop shadow stack and jump back to to caller directly
when shadow stack is in use.
Add endbr64 to tst-quadmod1.S and tst-quadmod2.S so that func and foo
can be called indirectly.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/x86_64/tst-quadmod1.S (func): Add endbr64 if IBT is
enabled.
(foo): Likewise.
* sysdeps/x86_64/tst-quadmod2.S (func) : Likewise.
(foo): Likewise.
This bug is very similar to bug 23036: The existing code assumed that
the length count included the length byte itself.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* scripts/check-execstack.awk: Consider `xfail' variable containing a
list
of libraries whose stack executability is expected.
* elf/Makefile ($(objpfx)check-execstack.out): Pass
$(check-execstack-xfail) to check-execstack.awk through `xfail'
variable.
* sysdeps/mach/hurd/i386/Makefile (check-execstack-xfail): Set to ld.so
libc.so libpthread.so.
* sysdeps/mach/hurd/pipe2.c: New file, copy from pipe.c. Evolve it to
implement __pipe2.
* sysdeps/mach/hurd/pipe.c (__pipe): Reimplement using __pipe2.
The argparse library is used on compare_bench script to improve command line
argument parsing. The 'schema validation file' is now optional, reducing by
one the number of required parameters.
* benchtests/scripts/compare_bench.py (__main__): use the argparse
library to improve command line parsing.
(__main__): make schema file as optional parameter (--schema),
defaulting to benchtests/scripts/benchout.schema.json.
(main): move out of the parsing stuff to __main_ and leave it
only as caller of main comparison functions.
Multiple updates for Occitan language including alternative month names,
update abday and abmon, fix typos in day, fix d_fmt, correct LC_NAME,
and use “copy "ca_ES"” as LC_COLLATE.
[BZ #23140]
* localedata/locales/oc_FR (mon): Rename to...
(alt_mon): This, then update October (typo fix).
(mon): New content (genitive case, month names preceded by
"de" or "d’").
[BZ #23422]
* localedata/locales/oc_FR (abday): Update all items.
(day): Update Wednesday and Saturday (typo fixes).
(abmon): Update all items, except May.
(d_fmt): Update "%d.%m.%Y" -> "%d/%m/%Y".
(LC_IDENTIFICATION): Bump the revision number and date.
Keep the "category" entries in alphabetic order.
(LC_ADDRESS): Remove no longer needed comment.
(LC_COLLATE): Use “copy "ca_ES"”.
(LC_NAME): Set the correct values of "name_fmt", "name_mr", and
"name_mrs".
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Various glibc testcases use tmpnam in ways subject to race conditions
(generate a temporary file name, then later open that file without
O_EXCL).
This patch fixes those tests to use mkstemp - generally a minimal
local fix to use mkstemp instead of tmpnam, rather than a larger fix
to use other testsuite infrastructure for temporary files. The
unchanged use of tmpnam in posix/wordexp-test.c would fail safe in the
event of a race (it's generating a name for use with mkdir rather than
for a file to be opened for writing).
Tested for x86_64.
* grp/tst_fgetgrent.c: Include <unistd.h>.
(main): Use mkstemp instead of tmpnam.
* io/test-utime.c (main): Likewise.
* posix/annexc.c (macrofile): Change to modifiable array.
(get_null_defines): Use mkstemp instead of tmpnam. Do not remove
macrofile here.
* posix/bug-getopt1.c: Include <stdlib.h>.
(do_test): Use mkstemp instead of tmpnam.
* posix/bug-getopt2.c: Include <stdlib.h>.
(do_test): Use mkstemp instead of tmpnam.
* posix/bug-getopt3.c: Include <stdlib.h>.
(do_test): Use mkstemp instead of tmpnam.
* posix/bug-getopt4.c: Include <stdlib.h>.
(do_test): Use mkstemp instead of tmpnam.
* posix/bug-getopt5.c: Include <stdlib.h>.
(do_test): Use mkstemp instead of tmpnam.
* stdio-common/bug7.c: Include <stdlib.h> and <unistd.h>.
(main): Use mkstemp instead of tmpnam.
* stdio-common/tst-fdopen.c: Include <stdlib.h>.
(main): Use mkstemp instead of tmpnam.
* stdio-common/tst-ungetc.c: Include <stdlib.h>.
(main): use mkstemp instead of tmpnam.
* stdlib/isomac.c (macrofile): Change to modifiable array.
(get_null_defines): Use mkstemp instead of tmpnam. Do not remove
macrofile here.
i386 add_n.S and sub_n.S use a trick to implment jump tables with LEA.
We can't use conditional branches nor normal jump tables since jump
table entries use EFLAGS set by jump table index. This patch adds
_CET_ENDBR to indirect jump targets and adjust destination for
_CET_ENDBR.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/i386/add_n.S: Include <sysdep.h>, instead of
"sysdep.h".
(__mpn_add_n): Save and restore %ebx if IBT is enabed. Add
_CET_ENDBR to indirect jump targets and adjust jump destination
for _CET_ENDBR.
* sysdeps/i386/i686/add_n.S: Include <sysdep.h>, instead of
"sysdep.h".
(__mpn_add_n): Save and restore %ebx if IBT is enabed. Add
_CET_ENDBR to indirect jump targets and adjust jump destination
for _CET_ENDBR.
* sysdeps/i386/sub_n.S: Include <sysdep.h>, instead of
"sysdep.h".
(__mpn_sub_n): Save and restore %ebx if IBT is enabed. Add
_CET_ENDBR to indirect jump targets and adjust jump destination
for _CET_ENDBR.
Add _CET_ENDBR to STRCMP_SSE42, which is called indirectly, to support
IBT.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/x86_64/multiarch/strcmp-sse42.S (STRCMP_SSE42): Add
_CET_ENDBR.
Add _CET_ENDBR to functions in crti.S, which are called indirectly, to
support IBT.
Tested on i686 and x86-64.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps/i386/crti.S (_init): Add _CET_ENDBR.
(_fini): Likewise.
* sysdeps/x86_64/crti.S (_init): Likewise.
(_fini): Likewise.
Always include <dl-cet.h> and cet-tunables.h> when CET is enabled.
Otherwise, configure glibc with --enable-cet --disable-tunables will
fail to build.
* sysdeps/x86/cpu-features.c: Always include <dl-cet.h> and
cet-tunables.h> when CET is enabled.
Intel Control-flow Enforcement Technology (CET) instructions:
https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-en
forcement-technology-preview.pdf
includes Indirect Branch Tracking (IBT) and Shadow Stack (SHSTK).
GNU_PROPERTY_X86_FEATURE_1_IBT is added to GNU program property to
indicate that all executable sections are compatible with IBT when
ENDBR instruction starts each valid target where an indirect branch
instruction can land. Linker sets GNU_PROPERTY_X86_FEATURE_1_IBT on
output only if it is set on all relocatable inputs.
On an IBT capable processor, the following steps should be taken:
1. When loading an executable without an interpreter, enable IBT and
lock IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the executable.
2. When loading an executable with an interpreter, enable IBT if
GNU_PROPERTY_X86_FEATURE_1_IBT is set on the interpreter.
a. If GNU_PROPERTY_X86_FEATURE_1_IBT isn't set on the executable,
disable IBT.
b. Lock IBT.
3. If IBT is enabled, when loading a shared object without
GNU_PROPERTY_X86_FEATURE_1_IBT:
a. If legacy interwork is allowed, then mark all pages in executable
PT_LOAD segments in legacy code page bitmap. Failure of legacy code
page bitmap allocation causes an error.
b. If legacy interwork isn't allowed, it causes an error.
GNU_PROPERTY_X86_FEATURE_1_SHSTK is added to GNU program property to
indicate that all executable sections are compatible with SHSTK where
return address popped from shadow stack always matches return address
popped from normal stack. Linker sets GNU_PROPERTY_X86_FEATURE_1_SHSTK
on output only if it is set on all relocatable inputs.
On a SHSTK capable processor, the following steps should be taken:
1. When loading an executable without an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the executable.
2. When loading an executable with an interpreter, enable SHSTK if
GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on interpreter.
a. If GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set on the executable
or any shared objects loaded via the DT_NEEDED tag, disable SHSTK.
b. Otherwise lock SHSTK.
3. After SHSTK is enabled, it is an error to load a shared object
without GNU_PROPERTY_X86_FEATURE_1_SHSTK.
To enable CET support in glibc, --enable-cet is required to configure
glibc. When CET is enabled, both compiler and assembler must support
CET. Otherwise, it is a configure-time error.
To support CET run-time control,
1. _dl_x86_feature_1 is added to the writable ld.so namespace to indicate
if IBT or SHSTK are enabled at run-time. It should be initialized by
init_cpu_features.
2. For dynamic executables:
a. A l_cet field is added to struct link_map to indicate if IBT or
SHSTK is enabled in an ELF module. _dl_process_pt_note or
_rtld_process_pt_note is called to process PT_NOTE segment for
GNU program property and set l_cet.
b. _dl_open_check is added to check IBT and SHSTK compatibilty when
dlopening a shared object.
3. Replace i386 _dl_runtime_resolve and _dl_runtime_profile with
_dl_runtime_resolve_shstk and _dl_runtime_profile_shstk, respectively if
SHSTK is enabled.
CET run-time control can be changed via GLIBC_TUNABLES with
$ export GLIBC_TUNABLES=glibc.tune.x86_shstk=[permissive|on|off]
$ export GLIBC_TUNABLES=glibc.tune.x86_ibt=[permissive|on|off]
1. permissive: SHSTK is disabled when dlopening a legacy ELF module.
2. on: IBT or SHSTK are always enabled, regardless if there are IBT or
SHSTK bits in GNU program property.
3. off: IBT or SHSTK are always disabled, regardless if there are IBT or
SHSTK bits in GNU program property.
<cet.h> from CET-enabled GCC is automatically included by assembly codes
to add GNU_PROPERTY_X86_FEATURE_1_IBT and GNU_PROPERTY_X86_FEATURE_1_SHSTK
to GNU program property. _CET_ENDBR is added at the entrance of all
assembly functions whose address may be taken. _CET_NOTRACK is used to
insert NOTRACK prefix with indirect jump table to support IBT. It is
defined as notrack when _CET_NOTRACK is defined in <cet.h>.
[BZ #21598]
* configure.ac: Add --enable-cet.
* configure: Regenerated.
* elf/Makefille (all-built-dso): Add a comment.
* elf/dl-load.c (filebuf): Moved before "dynamic-link.h".
Include <dl-prop.h>.
(_dl_map_object_from_fd): Call _dl_process_pt_note on PT_NOTE
segment.
* elf/dl-open.c: Include <dl-prop.h>.
(dl_open_worker): Call _dl_open_check.
* elf/rtld.c: Include <dl-prop.h>.
(dl_main): Call _rtld_process_pt_note on PT_NOTE segment. Call
_rtld_main_check.
* sysdeps/generic/dl-prop.h: New file.
* sysdeps/i386/dl-cet.c: Likewise.
* sysdeps/unix/sysv/linux/x86/cpu-features.c: Likewise.
* sysdeps/unix/sysv/linux/x86/dl-cet.h: Likewise.
* sysdeps/x86/cet-tunables.h: Likewise.
* sysdeps/x86/check-cet.awk: Likewise.
* sysdeps/x86/configure: Likewise.
* sysdeps/x86/configure.ac: Likewise.
* sysdeps/x86/dl-cet.c: Likewise.
* sysdeps/x86/dl-procruntime.c: Likewise.
* sysdeps/x86/dl-prop.h: Likewise.
* sysdeps/x86/libc-start.h: Likewise.
* sysdeps/x86/link_map.h: Likewise.
* sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve): Add
_CET_ENDBR.
(_dl_runtime_profile): Likewise.
(_dl_runtime_resolve_shstk): New.
(_dl_runtime_profile_shstk): Likewise.
* sysdeps/linux/x86/Makefile (sysdep-dl-routines): Add dl-cet
if CET is enabled.
(CFLAGS-.o): Add -fcf-protection if CET is enabled.
(CFLAGS-.os): Likewise.
(CFLAGS-.op): Likewise.
(CFLAGS-.oS): Likewise.
(asm-CPPFLAGS): Add -fcf-protection -include cet.h if CET
is enabled.
(tests-special): Add $(objpfx)check-cet.out.
(cet-built-dso): New.
(+$(cet-built-dso:=.note)): Likewise.
(common-generated): Add $(cet-built-dso:$(common-objpfx)%=%.note).
($(objpfx)check-cet.out): New.
(generated): Add check-cet.out.
* sysdeps/x86/cpu-features.c: Include <dl-cet.h> and
<cet-tunables.h>.
(TUNABLE_CALLBACK (set_x86_ibt)): New prototype.
(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
(init_cpu_features): Call get_cet_status to check CET status
and update dl_x86_feature_1 with CET status. Call
TUNABLE_CALLBACK (set_x86_ibt) and TUNABLE_CALLBACK
(set_x86_shstk). Disable and lock CET in libc.a.
* sysdeps/x86/cpu-tunables.c: Include <cet-tunables.h>.
(TUNABLE_CALLBACK (set_x86_ibt)): New function.
(TUNABLE_CALLBACK (set_x86_shstk)): Likewise.
* sysdeps/x86/sysdep.h (_CET_NOTRACK): New.
(_CET_ENDBR): Define if not defined.
(ENTRY): Add _CET_ENDBR.
* sysdeps/x86/dl-tunables.list (glibc.tune): Add x86_ibt and
x86_shstk.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve): Add
_CET_ENDBR.
(_dl_runtime_profile): Likewise.
This patch changes longjmp to always restore the TOC pointer (r2 register)
to the caller frame on powerpc64 and powerpc64le. This is related to bug
21895 that reports a situation where you have a static longjmp to a
shared object file.
[BZ #21895]
* sysdeps/powerpc/powerpc64/__longjmp-common.S: Remove condition code for
restoring r2 in longjmp.
* sysdeps/powerpc/powerpc64/Makefile: Added tst-setjmp-bug21895-static to
test list.
Added rules to build test tst-setjmp-bug21895-static.
Added module setjmp-bug21895 and rules to build a shared object from it.
* sysdeps/powerpc/powerpc64/setjmp-bug21895.c: New test file.
* sysdeps/powerpc/powerpc64/tst-setjmp-bug21895-static.c: New test file.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Improve strstr performance. Strstr tends to be slow because it uses
many calls to memchr and a slow byte loop to scan for the next match.
Performance is significantly improved by using strnlen on larger blocks
and using strchr to search for the next matching character. strcasestr
can also use strnlen to scan ahead, and memmem can use memchr to check
for the next match.
On the GLIBC bench tests the performance gains on Cortex-A72 are:
strstr: +25%
strcasestr: +4.3%
memmem: +18%
On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.
Reviewd-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since SHADOW_STACK_POINTER_OFFSET is defined in jmp_buf-ssp.h, we must
undef SHADOW_STACK_POINTER_OFFSET after including <jmp_buf-ssp.h>.
* sysdeps/unix/sysv/linux/x86_64/____longjmp_chk.S: Undef
SHADOW_STACK_POINTER_OFFSET after including <jmp_buf-ssp.h>.