Without this, on gcc 4.8.2 the built glibc crashes when memcpy
or memset are invoked, since they call themselves recursively.
See commit 85c2e6110c for the generic inhibit_loop_to_libcall.
This sets __HAVE_64B_ATOMICS if provided. It also sets
USE_ATOMIC_COMPILER_BUILTINS to true if the existing atomic ops use the
__atomic* builtins (aarch64, mips partially) or if this has been
tested (x86_64); otherwise, this is set to false so that C11 atomics will
be based on the existing atomic operations.
Continuing the removal of the obsolete INTDEF / INTUSE mechanism, this
patch eliminates its use for _dl_init. Since _dl_init was already
declared with hidden visibility, creating a second hidden alias for it
was completely pointless, so this patch replaces all uses of
_dl_init_internal with plain _dl_init instead of using hidden_proto /
hidden_def (which are only needed when you want a hidden alias for a
non-hidden symbol; it's quite possible there are cases where they are
used but don't need to be because the symbol in question is not part
of the public ABI and is only used within a single library, so using
attributes_hidden instead would suffice).
Tested for x86_64 that installed stripped shared libraries are
unchanged by the patch.
[BZ #14132]
* elf/dl-init.c (_dl_init): Don't use INTDEF.
* sysdeps/aarch64/dl-machine.h (RTLD_START): Use _dl_init instead
of _dl_init_internal.
* sysdeps/alpha/dl-machine.h (RTLD_START): Likewise.
* sysdeps/arm/dl-machine.h (RTLD_START): Likewise.
* sysdeps/hppa/dl-machine.h (RTLD_START): Likewise.
* sysdeps/i386/dl-machine.h (RTLD_START): Likewise.
* sysdeps/ia64/dl-machine.h (RTLD_START): Likewise.
* sysdeps/m68k/dl-machine.h (RTLD_START): Likewise.
* sysdeps/microblaze/dl-machine.h (RTLD_START): Likewise.
* sysdeps/mips/dl-machine.h (RTLD_START): Likewise.
* sysdeps/powerpc/powerpc32/dl-start.S (_start): Likewise.
* sysdeps/s390/s390-32/dl-machine.h (RTLD_START): Likewise.
* sysdeps/s390/s390-64/dl-machine.h (RTLD_START): Likewise.
* sysdeps/sh/dl-machine.h (RTLD_START): Likewise.
* sysdeps/sparc/sparc32/dl-machine.h (RTLD_START): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (RTLD_START): Likewise.
* sysdeps/tile/dl-start.S (_start): Likewise.
* sysdeps/x86_64/dl-machine.h (RTLD_START): Likewise.
* sysdeps/x86_64/x32/dl-machine.h (RTLD_START): Likewise.
Customize memcmp.c for tile, using similar tricks from memcpy:
- replace MERGE macro with dblalign.
- replace memcmp_bytes function with revbytes.
- use __glibc_likely.
- use post-increment addressing.
The schedule is still not perfect: the compiler is not hoisting
code above the comparison branch, which could save a bundle or two.
memcmp speeds up by 30-40% on shorter aligned tests in benchtest,
with some tests with unaligned lengths taking a small performance hit.
strnlen() is based on the existing tile strlen() with length
checking added. It speeds up by up to 5x, but on average across
the benchtest corpus by around 35%. No regressions are seen.
strstr() does 8-byte aligned loads and compares using a 2-byte
filter on the first two bytes of the needle and then testing
the remaining bytes in needle using memcmp(). It speeds up
about 5x in the best case (for "found" needles), about 2x looking
at benchtest as a whole, with some slowdowns as much as 45%.
on a few cases (including the "fail" case for 128KB search).
strcasestr() is based on strstr() but uses a SIMD tolower
routine to convert 8-bytes to lower case in 5 instructions.
It also uses a 2-byte filter and then strncasecmp() for the
remaining bytes. strncasecmp() is not optimized for SIMD, so
there is futher room for improvement. However, it is still up
to 16x faster for "found" needles, averaging 2x faster on the
whole corpus of benchtests. It does slow down by up to 35%
on a few cases, similarly to strstr().
Some types of relocations technically need to be signed rather than
unsigned: in particular ones that are used with moveli or movei,
or for jump and branch. This is almost never a problem. Jump and
branch opcodes are pretty much uniformly resolved by the static linker
(unless you omit -fpic for a shared library, which is not recommended).
The moveli and movei opcodes that need to be sign-extended generally
are for positive displacements, like the construction of the address of
main() from _start(). However, tst-pie1 ends up with main below _start
(in a different module) and the test failed due to signedness issues in
relocation handling.
This commit treats the value as signed when shifting (to preserve the
high bit) and also sign-extends the value generated from the updated
bundle when comparing with the desired bundle, which we do to make sure
no overflow occurred. As a result, the tst-pie1 test now passes.
Various architectures have files such as sysdeps/<arch>/shlib-versions
whose contents are in fact entirely Linux-specific, relating only to
the symbol / shared library versions for the port to Linux on that
architecture, when any future port to a different OS on that
architecture would use the symbol version of the glibc release it goes
in, as standard for new ports.
This patch moves such files under sysdeps/unix/sysv/linux/, merging in
the contents of sysdeps/<arch>/nptl/shlib-versions in the process.
The only bits not moved are those relating to libgcc_s versions, which
don't appear OS-specific in the same way that glibc's symbol versions
so. It deliberately does not change the regular expressions given for
matching configurations in each file; some match only Linux although
not Linux-specific, or match other OSes although Linux-specific. It
is with a view to at least the following further cleanups:
* Move architecture-specific content from the toplevel shlib-versions
and nptl/shlib-versions into sysdeps shlib-versions files, so
eliminating another difference between ex-ports and non-ex-ports
architectures.
* Likewise, for OS-specific content in shlib-versions files.
* At that point, the first field in shlib-versions files (the regular
expression matching a configuration triplet) should be redundant, so
eliminate that field and leave shlib-versions selection working
purely on a sysdeps basis (with limited use of %ifdef in
shlib-versions files when needed) rather than having its own
separate mechanism to select what configuration information is
relevant.
* Move the build of gnu/lib-names.h to a similar mechanism to that
used for gnu/stubs.h (each library build installing a version of the
header specifically for that build), so we can eliminate the
duplication of soname information in the makefiles and get it purely
from shlib-versions files again.
There may be other cleanups possible as well (in particular, I'm not
sure that all cases where the same "Earliest symbol set" information
is repeated for many different libraries actually should need to
repeat it rather than specifying it just once for DEFAULT for the
given configuration, and separately specifying any non-default choices
of soname).
Tested x86_64 that the installed shared libraries are unchanged by
this patch.
* sysdeps/aarch64/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/aarch64/shlib-versions: ... here.
* sysdeps/alpha/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/alpha/shlib-versions: ... here.
* sysdeps/arm/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/arm/shlib-versions: ... here.
* sysdeps/hppa/shlib-versions: Move all contents except for
libgcc_s entry to ...
* sysdeps/unix/sysv/linux/hppa/shlib-versions: ... here. Merge in
entry from ...
* sysdeps/hppa/nptl/shlib-versions: ... here. Remove file.
* sysdeps/ia64/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/ia64/shlib-versions: ... here. Merge in
entry from ...
* sysdeps/ia64/nptl/shlib-versions: ... here. Remove file.
* sysdeps/m68k/coldfire/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/m68k/coldfire/shlib-versions: ... here.
* sysdeps/microblaze/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/microblaze/shlib-versions: ... here.
* sysdeps/mips/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/mips/shlib-versions: ... here. Merge in
entry from ...
* sysdeps/mips/nptl/shlib-versions: ... here. Remove file.
* sysdeps/tile/shlib-versions: Move to ...
* sysdeps/unix/sysv/linux/tile/shlib-versions: ... here.
* sysdeps/unix/sysv/linux/x86_64/64/shlib-versions: Merge in entry
from ...
* sysdeps/x86_64/64/shlib-versions: ... here. Remove file.
* sysdeps/unix/sysv/linux/x86_64/x32/shlib-versions: Merge in
entry from ...
* sysdeps/x86_64/x32/shlib-versions: ... here. Remove file.
Define MEMCPY_OK_FOR_FWD_MEMMOVE in memcopy.h and let arch-specific
implementations of that file override the value if necessary. This
override is only useful for tile and moving this macro to memcopy.h
allows us to remove the tile-specific memmove.c.
This patch removes the configure test for working -z relro.
The use of -z relro in Makeconfig became unconditional with
commit 2e6ab1df44c412bb9d30b26a4d8a679150a7e375
Author: Ulrich Drepper <drepper@redhat.com>
Date: Sat Oct 28 06:44:04 2006 +0000
Remove conditional code which now is unnecessary.
(commit reference from git://repo.or.cz/glibc/history), so since then
the configure test has not controlled anything about how glibc is
built - simply about whether configure succeeds and allows a build to
be attempted. The test for whether the option did something useful
(as opposed to whether it exists - which we can certainly just assume
by now) was originally added in
<https://sourceware.org/ml/libc-hacker/2004-09/msg00069.html> to
disable the option in a case when it did nothing useful on ia64 (as a
result of something deliberate in the linker on ia64). Since 2006
that disabling has been of no effect, and given that the current test
does not set libc_relro_required for ia64, it does nothing whatever
useful for the original motivating case. Also at around the same time
in 2006 the test was made to give an error for missing or broken -z
relro support on various architectures.
So effectively all the test does now is verify that, on certain
architectures, the linker has not been changed deliberately to make
the option ineffective. I see no apparent reason why such a change
should be expected, or why the build should be stopped if it were to
be made (any more than we disallow build on ia64); I think we can
trust binutils patch review to point out the consequences of any
change to COMMONPAGESIZE setting. The only thing that might now make
sense would be disabling the -z relro use on an architecture-specific
basis if there were an architecture-specific reason to consider that
to make sense; it would be for the ia64 maintainer to decide if that
makes sense for ia64 at present, but I think that could be done
through sysdeps Makefiles - no special configure tests needed.
Tested for x86_64 that this patch makes no change to the installed
shared libraries.
Together with
<https://sourceware.org/ml/libc-alpha/2014-06/msg00788.html> (pending
review) this substantially eliminates architecture-specific cases from
architecture-independent configure.ac files. There remains an i386
case in sysdeps/mach/hurd/configure.ac that should properly move to
the i386 subdirectory. (There are also OS-specific cases outside
OS-specific directories; in principle I think should should also
move.)
* configure.ac (libc_commonpagesize): Remove variable.
(libc_relro_required): Likewise.
(libc_cv_z_relro): Remove configure test.
* configure: Regenerated.
* sysdeps/aarch64/preconfigure (libc_commonpagesize): Do not set
variable.
(libc_relro_required): Likewise.
* sysdeps/alpha/preconfigure (libc_commonpagesize): Likewise.
(libc_relro_required): Likewise.
* sysdeps/arm/preconfigure.ac (libc_commonpagesize): Likewise.
(libc_relro_required): Likewise.
* sysdeps/arm/preconfigure: Regenerated.
* sysdeps/ia64/preconfigure: Remove file.
* sysdeps/tile/preconfigure (libc_commonpagesize): Do not set
variable.
(libc_relro_required): Likewise.
This patch defines ELF_MACHINE_NO_RELA on all architectures. Tested
only on x86_64 to verify that the sources before and after are
identical except for two instructions that pass the current line
number in dl-machine.h to assert_fail.
This patch relies on the C version of the rwlocks posted earlier.
With C rwlocks it is very straight forward to do adaptive elision
using TSX. It is based on the infrastructure added earlier
for mutexes, but uses its own elision macros. The macros
are fairly general purpose and could be used for other
elision purposes too.
This version is much cleaner than the earlier assembler based
version, and in particular implements adaptation which makes
it safer.
I changed the behavior slightly to not require any changes
in the test suite and fully conform to all expected
behaviors (generally at the cost of not eliding in
various situations). In particular this means the timedlock
variants are not elided. Nested trylock aborts.
As recently discussed
<https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it
doesn't seem particularly useful for libm-test-ulps files to contain
huge amounts of data on ulps for individual tests; just the global
maximum observed ulps for each function, together with the
verification of exceptions, errno and special results such as
infinities and NaNs for each test, suffices to verify that a
function's behavior on the given test inputs is within the expected
accuracy. Removing this data reduces source tree churn caused by
updates to these files when libm tests are added, and reduces the
frequency with which testsuite additions actually need libm-test-ulps
changes at all.
Accordingly, this patch removes that data, so that individual tests
get checked against the global bounds for the given function and only
generate an error if those are exceeded. Tested x86_64 (including
verifying that if an ulps value is artificially reduced, the tests do
indeed fail as they should and "make regen-ulps" generates the
expected changes).
* math/libm-test.inc (struct ulp_data): Don't refer to ulps for
individual tests in comment.
(libm-test-ulps.h): Don't refer to test_ulps in #include comment.
(prev_max_error): New variable.
(prev_real_max_error): Likewise.
(prev_imag_max_error): Likewise.
(compare_ulp_data): Don't refer to test names in comment.
(find_test_ulps): Remove function.
(find_function_ulps): Likewise.
(find_complex_function_ulps): Likewise.
(init_max_error): Take function name as argument. Look up ulps
for that function.
(print_ulps): Remove function.
(print_max_error): Use prev_max_error instead of calling
find_function_ulps.
(print_complex_max_error): Use prev_real_max_error and
prev_imag_max_error instead of calling find_complex_function_ulps.
(check_float_internal): Take max_ulp parameter instead of calling
find_test_ulps. Don't call print_ulps.
(check_float): Update call to check_float_internal.
(check_complex): Update calls to check_float_internal.
(START): Pass argument to init_max_error.
* math/gen-libm-test.pl (%results): Don't include "kind"
information.
(parse_ulps): Don't handle ulps of individual tests.
(print_ulps_file): Likewise.
(output_ulps): Likewise.
* math/README.libm-test: Update.
* manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of
individual tests.
* sysdeps/aarch64/libm-test-ulps: Remove individual test ulps.
* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
* sysdeps/arm/libm-test-ulps: Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Likewise.
* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise.
* sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise.
* sysdeps/microblaze/libm-test-ulps: Likewise.
* sysdeps/mips/mips32/libm-test-ulps: Likewise.
* sysdeps/mips/mips64/libm-test-ulps: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
* sysdeps/s390/fpu/libm-test-ulps: Likewise.
* sysdeps/sh/libm-test-ulps: Likewise.
* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
* sysdeps/tile/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.
IEEE 754-2008 defines two ways in which tiny results can be detected,
"before rounding" (based on the infinite-precision result) and "after
rounding" (based on the result when rounded to normal precision as if
the exponent range were unbounded). All binary operations on an
architecture must use the same choice of how tininess is detected.
soft-fp has so far implemented only before-rounding tininess
detection. This patch adds support for after-rounding tininess
detection. A new macro _FP_TININESS_AFTER_ROUNDING is added that
sfp-machine.h must define (soft-fp is meant to be self-contained so
the existing tininess.h files aren't used here, though the information
going in sfp-machine.h has been taken from them). The soft-fp macros
dealing with raising underflow exceptions then handle the cases where
the choice matters specially, rounding a copy of the input to the
appropriate precision to see if a value that's tiny before rounding
isn't tiny after rounding.
Tested for mips64 using GCC trunk (which now uses soft-fp on MIPS, so
supporting exceptions and rounding modes for long double where not
previously supported - this is the immediate motivation for doing this
patch now) together with (a) a patch to sysdeps/mips/math-tests.h to
enable exceptions / rounding modes tests for long double for GCC 4.9
and later, and (b) corresponding changes applied to libgcc's soft-fp
and sfp-machine.h files. In the libgcc context this is also tested on
x86_64 (also an after-rounding architecture) with testcases for
__float128 that I intend to add to the GCC testsuite when updating
soft-fp there.
(To be clear: this patch does not fix any glibc bugs that were
user-visible in past releases, since after-rounding architectures
didn't use soft-fp in any affected case with support for
floating-point exceptions - so there is no corresponding Bugzilla bug.
Rather, it works together with the GCC changes to use soft-fp on MIPS
to allow previously absent long double functionality to work properly,
and allows soft-fp to be used in glibc on after-rounding architectures
in cases where it couldn't previously be used.)
* soft-fp/op-common.h (_FP_DECL): Mark exponent as possibly
unused.
(_FP_PACK_SEMIRAW): Determine tininess based on rounding shifted
value if _FP_TININESS_AFTER_ROUNDING and unrounded value is in
subnormal range.
(_FP_PACK_CANONICAL): Determine tininess based on rounding to
normal precision if _FP_TININESS_AFTER_ROUNDING and unrounded
value has largest subnormal exponent.
* soft-fp/soft-fp.h [FP_NO_EXCEPTIONS]
(_FP_TININESS_AFTER_ROUNDING): Undefine and redefine to 0.
* sysdeps/aarch64/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): New macro.
* sysdeps/alpha/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/arm/soft-fp/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Likewise.
* sysdeps/mips/mips64/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/mips/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/powerpc/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/sh/soft-fp/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Likewise.
* sysdeps/sparc/sparc32/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/sparc/sparc64/soft-fp/sfp-machine.h
(_FP_TININESS_AFTER_ROUNDING): Likewise.
* sysdeps/tile/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Likewise.
I've moved the TILE-Gx and TILEPro ports to the main sysdeps hierarchy,
along with the linux-generic ports infrastructure. Beyond the README
update, the move was just
git mv ports/sysdeps/tile sysdeps/tile
git mv ports/sysdeps/unix/sysv/linux/tile \
sysdeps/unix/sysv/linux/tile
git mv ports/sysdeps/unix/sysv/linux/generic \
sysdeps/unix/sysv/linux/generic
I updated the relevant ChangeLogs along the lines of the ARM move
in commit c6bfe5c4d7 and tested the 64-bit tilegx build to confirm that
there were no changes in "objdump -dr" output in the shared objects.