It has been a long practice for software using IEEE 754 floating-point
arithmetic run on MIPS processors to use an encoding of Not-a-Number
(NaN) data different to one used by software run on other processors.
And as of IEEE 754-2008 revision [1] this encoding does not follow one
recommended in the standard, as specified in section 6.2.1, where it
is stated that quiet NaNs should have the first bit (d1) of their
significand set to 1 while signalling NaNs should have that bit set to
0, but MIPS software interprets the two bits in the opposite manner.
As from revision 3.50 [2][3] the MIPS Architecture provides for
processors that support the IEEE 754-2008 preferred NaN encoding format.
As the two formats (further referred to as "legacy NaN" and "2008 NaN")
are incompatible to each other, tools have to provide support for the
two formats to help people avoid using incompatible binary modules.
The change is comprised of two functional groups of features, both of
which are required for correct support.
1. Dynamic linker support.
To enforce the NaN encoding requirement in dynamic linking a new ELF
file header flag has been defined. This flag is set for 2008-NaN
shared modules and executables and clear for legacy-NaN ones. The
dynamic linker silently ignores any incompatible modules it
encounters in dependency processing.
To avoid unnecessary processing of incompatible modules in the
presence of a shared module cache, a set of new cache flags has been
defined to mark 2008-NaN modules for the three ABIs supported.
Changes to sysdeps/unix/sysv/linux/mips/readelflib.c have been made
following an earlier code quality suggestion made here:
http://sourceware.org/ml/libc-ports/2009-03/msg00036.html
and are therefore a little bit more extensive than the minimum
required.
Finally a new name has been defined for the dynamic linker so that
2008-NaN and legacy-NaN binaries can coexist on a single system that
supports dual-mode operation and that a legacy dynamic linker that
does not support verifying the 2008-NaN ELF file header flag is not
chosen to interpret a 2008-NaN binary by accident.
2. Floating environment support.
IEEE 754-2008 features are controlled in the Floating-Point Control
and Status (FCSR) register and updates are needed to floating
environment support so that the 2008-NaN flag is set correctly and
the kernel default, inferred from the 2008-NaN ELF file header flag
at the time an executable is loaded, respected.
As the NaN encoding format is a property of GCC code generation that is
both a user-selected GCC configuration default and can be overridden
with GCC options, code that needs to know what NaN encoding standard it
has been configured for checks for the __mips_nan2008 macro that is
defined internally by GCC whenever the 2008-NaN mode has been selected.
This mode is determined at the glibc configuration time and therefore a
few consistency checks have been added to catch cases where compilation
flags have been overridden by the user.
The 2008 NaN set of features relies on kernel support as the in-kernel
floating-point emulator needs to be aware of the NaN encoding used even
on hard-float processors and configure the FPU context according to the
value of the 2008 NaN ELF file header flag of the executable being
started. As at this time work on kernel support is still in progress
and the relevant changes have not made their way yet to linux.org master
repository.
Therefore the minimum version supported has been artificially set to
10.0.0 so that 2008-NaN code is not accidentally run on a Linux kernel
that does not suppport it. It is anticipated that the version is
adjusted later on to the actual initial linux.org kernel version to
support this feature. Legacy NaN encoding support is unaffected, older
kernel versions remain supported.
[1] "IEEE Standard for Floating-Point Arithmetic", IEEE Computer
Society, IEEE Std 754-2008, 29 August 2008
[2] "MIPS Architecture For Programmers, Volume I-A: Introduction to the
MIPS32 Architecture", MIPS Technologies, Inc., Document Number:
MD00082, Revision 3.50, September 20, 2012
[3] "MIPS Architecture For Programmers, Volume I-A: Introduction to the
MIPS64 Architecture", MIPS Technologies, Inc., Document Number:
MD00083, Revision 3.50, September 20, 2012
Only enter the aligned copy loop with buffers that can be 8-byte
aligned. This improves performance slightly on Cortex-A9 and
Cortex-A15 cores for large copies with buffers that are 4-byte
aligned but not 8-byte aligned.
ports/ChangeLog.arm:
2013-09-16 Will Newton <will.newton@linaro.org>
* sysdeps/arm/armv7/multiarch/memcpy_impl.S: Tighten check
on entry to aligned copy loop to improve performance.
Another example of all the 64bit arches getting the definition via a
common file, but the 32bit ones all adding it by themselves and hppa
was missed.
I'm not entirely sure about the usage of GLIBC_2.19 symbols here.
We'd like to backport this so people can use it, but it means we'd
be releasing a glibc-2.17/glibc-2.18 with a GLIBC_2.19 symbol in it.
But maybe it won't be a big deal since you'd only get that 2.19 ref
if you actually used the symbol ?
There hasn't been a glibc release where hppa worked w/out a bunch of
patches, so in reality there's only two distros that matter -- Gentoo
and Debian.
Reported-by: Jeroen Roovers <jer@gentoo.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The mov lr, pc instruction will lose the Thumb bit from the return address
so use blx lr instead.
ports/ChangeLog.arm:
2013-08-30 Will Newton <will.newton@linaro.org>
[BZ #15909]
* sysdeps/unix/sysv/linux/arm/clone.S (__clone): Use blx
instead of mov lr, pc.
This implementation of strlen is faster than the armv6 version for
all string lengths greater than 1 on a Cortex-A15.
ports/ChangeLog.arm:
2013-08-09 Will Newton <will.newton@linaro.org>
* sysdeps/arm/armv6t2/strlen.S: New file.
Reserved bits in the Floating-Point Control and Status Register (FCSR)
should not be implicitly cleared by fedisableexcept or feenableexcept,
there is no reason to. Among these are the 8 condition codes and one of
the two bits reserved for architecture implementers (bits #22 & #21).
As to the latter, there is no reason to treat any of them as reserved
either, they should be user controllable and settable via __fpu_control
override as the user sees fit. For example in processors implemented by
MIPS Technologies, such as the 5Kf or the 24Kf, these bits are used to
change the treatment of denormalised operands and tiny results: bit #22
is Flush Override (FO) and bit #21 is Flush to Nearest (FN). They cause
non-IEEE-compliant behaviour, but some programs may have a use for such
modes of operation; the library should not obstruct such use just as it
does not for the architectural Flush to Zero (FS) bit (bit #24).
Therefore the change adjusts the reserved mask accordingly and also
documents the distinction between bits 22:21 and 20:18.
* sysdeps/powerpc/nofpu/sim-full.c: Add FIXME note about
the need for thread-specific variables preserved across signal
handlers.
* sysdeps/powerpc/nofpu/soft-supp.h: Likewise.
* sysdeps/powerpc/soft-fp/sfp-machine.h: Likewise.
A recently-added test (dlfcn/tststatic5) pointed out that tile was not
properly initializing the variable pagesize in certain cases. This
change just copies the existing code from MIPS.
The sfp-machine.h is based on the gcc version, but extended with
required new macros by comparison with other architectures and by
investigating the hardware support for FP on tile.
This function is now called from dl_open_worker with the GL(dl_load_lock)
lock held and no longer needs local protection. GL(dl_load_lock) also
correctly protects _dl_lookup_symbol_x called here that relies on the
caller to have serialized access to the data structures it uses.
This patch introduces two new convenience functions to set the default
thread attributes used for creating threads. This allows a programmer
to set the default thread attributes just once in a process and then
run pthread_create without additional attributes.
The macros in lowlevellock.h are returning positive errors, but the
users of the macros expect negative. This causes e.g. sem_wait to
sometimes return an error with errno set to -EWOULDBLOCK.
Signed-off-by: Kirk Meyer <kirk.meyer@sencore.com>
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
This turns out to be helpful when doing a from-scratch cross-compile of
gcc and glibc, since you can then do "make install-headers" in glibc
even before you have a functioning tile gcc.
Resolves: #15465
The program name may be unavailable if the user application tampers
with argc and argv[]. Some parts of the dynamic linker caters for
this while others don't, so this patch consolidates the check and
fallback into a single macro and updates all users.
The existing test avoided passing -mcmodel=large if the compiler didn't
support it. However, we need to test not just the compiler support, but
also the toolchain (as and ld) support, so make the test more complete.
In addition, we have to avoid using the hwN_plt() assembly operators if
that support is missing, so guard the uses with #ifdef NO_PLT_PCREL.
This allows us to properly build glibc with the current community
binutils, which doesn't yet have the PC-relative PLT operator support.
The -mcmodel=large support is in gcc 4.8, but the toolchain support
won't be present in the community until binutils 2.24.
[BZ #15442] This adds support for the inverse interpretation of the
quiet bit of IEEE 754 floating-point NaN data that some processors
use. This includes in particular MIPS architecture processors; the
payload used for the canonical qNaN encoding is updated accordingly
so as not to interfere with the quiet bit.
Joseph Myers noted that there were several old and really very
incorrect values in the hppa libm-test-ulps. This patch removes
all of the ulps values for ceil, floor, rint, round, trun,
llrint, and llround, all of which were previously incorreclty
added (including some negative values which are really wrong).
---
ports/
2013-05-15 Carlos O'Donell <carlos@redhat.com>
* sysdeps/hppa/fpu/libm-test-ulps: Remove old values for ceil, floor,
rint, round, trunc, llrint, and llround.
Update libm-test-ulps for hppa. There are a few entries
with 4 or 5 ulps, but these appear to be expected. A more
thorough review will be required if hppa switches long-double
to a different type.
---
ports/
2013-05-15 Carlos O'Donell <carlos@redhat.com>
* sysdeps/hppa/fpu/libm-test-ulps: Regenerate.
The following patch fixes both _FPU_GETCW and
_FPU_SETCW for hppa. The initial implementation was
flawed and not well tested. We failed to set cw,
and passed in the value of a register to fldd.
This patch fixes both of those errors and allows
the libm tests to pass without failure.
Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
---
2013-05-15 Guy Martin <gmsoft@tuxicoman.be>
Carlos O'Donell <carlos@redhat.com>
[BZ# 15000]
* ports/sysdeps/hppa/fpu/fpu_control.h (_FPU_GETCW): Set cw.
(_FPU_SETCW): Pass address to fldd.
2013-05-12 Marcus Shawcroft <marcus.shawcroft@linaro.org>
* sysdeps/unix/sysv/linux/aarch64/clone.S (__clone):
Do not call sycall_error directly with a confitional branch.
* sysdeps/unix/sysv/linux/aarch64/ioctl.S (__ioctl):
Do not call sycall_error directly with a confitional branch.
These macros often set up a variable that later macros sometimes do
not use. Add unused attribute to avoid that.
Similarly, the ia64 code tends to check the err field rather than
the val (which is opposite of most arches) leading to the same
kind of warning. Replace this with a dummy reference.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The current code declares double constants by using a char buffer and
then casting the pointer to a different type. This makes the aliasing
logic unhappy. Change it to use a union instead to avoid that.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Function pointers on ia64 are like parisc -- they're plabels. While
the parisc port enjoys a gcc builtin for extracting the address here,
ia64 has no such luck.
Casting & dereferencing in one go triggers a strict aliasing warning.
Use a union to fix that.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The ia64_rse_is_rnat_slot func expects an unsigned pointer, but we're
passing in a signed pointer. The signness doesn't matter here, so
convert it to unsigned.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The strcpy and strchr (and related) functions are four times faster
than the byte-by-byte default versions.
The strlen function is twice as fast for long strings and 50% faster
for short strings over the armv4 version.
* sysdeps/unix/sysv/linux/bits/mman-linux.h (MAP_ANONYMOUS):
Allow definition via __MAP_ANONYMOUS.
* sysdeps/unix/sysv/linux/mips/bits/mman.h: Remove all defines
provided by bits/mman-linux.h and include <bits/mman-linux.h>.
(__MAP_ANONYMOUS): Define.
Written from scratch rather than copied from GMP, due to LGPL 2.1 vs
GPL 3, but tested with the GMP testsuite.
This is 250% faster than the generic code as measured on Cortex-A15,
and the same speed as GMP on the same core, and probably everywhere.
Written from scratch rather than copied from GMP, due to LGPL 2.1 vs
GPL 3, but tested with the GMP testsuite.
This is 50% faster than the generic code as measured on Cortex-A15.
It is 25% slower than the current GMP routine on the same core.
Written from scratch rather than copied from GMP, due to LGPL 2.1 vs
GPL 3, but tested with the GMP testsuite.
This is 25% faster than the generic code as measured on Cortex-A15,
and the same speed as GMP on the same core. It's probably slower
than GMP on the A8 and A9 cores though.
There was only one user. It's "condition" argument was used
for "ia" rather than an actual condition. The apcs26 syntax
is almost certainly not needed, given current binutils requirements.
For arm this makes no difference--the result is bit-for-bit identical;
for thumb this results in smaller encodings. Perhaps it ought not and
this is in fact an assembler bug, but I also think it's clearer.
The preceeding patches have allowed for the few incompatibilities
between arm and thumb2 mode, or have marked the file as not wanting
to use thumb2 mode.
Factor out the sequence needed to call kuser_get_tls, as we can't
play subtract into pc games in thumb mode. Prepare for hard-tp,
pulling the save of LR into the macro.
There are several places in which we access negative offsets from
the thread-pointer, but thumb2 only supports positive offsets in
memory references.
Avoid duplicating the rather large macros in which these references
are embedded by abstracting out the operation.