The mov lr, pc instruction will lose the Thumb bit from the return address
so use blx lr instead.
ports/ChangeLog.arm:
2013-08-30 Will Newton <will.newton@linaro.org>
[BZ #15909]
* sysdeps/unix/sysv/linux/arm/clone.S (__clone): Use blx
instead of mov lr, pc.
This implementation of strlen is faster than the armv6 version for
all string lengths greater than 1 on a Cortex-A15.
ports/ChangeLog.arm:
2013-08-09 Will Newton <will.newton@linaro.org>
* sysdeps/arm/armv6t2/strlen.S: New file.
Reserved bits in the Floating-Point Control and Status Register (FCSR)
should not be implicitly cleared by fedisableexcept or feenableexcept,
there is no reason to. Among these are the 8 condition codes and one of
the two bits reserved for architecture implementers (bits #22 & #21).
As to the latter, there is no reason to treat any of them as reserved
either, they should be user controllable and settable via __fpu_control
override as the user sees fit. For example in processors implemented by
MIPS Technologies, such as the 5Kf or the 24Kf, these bits are used to
change the treatment of denormalised operands and tiny results: bit #22
is Flush Override (FO) and bit #21 is Flush to Nearest (FN). They cause
non-IEEE-compliant behaviour, but some programs may have a use for such
modes of operation; the library should not obstruct such use just as it
does not for the architectural Flush to Zero (FS) bit (bit #24).
Therefore the change adjusts the reserved mask accordingly and also
documents the distinction between bits 22:21 and 20:18.
* sysdeps/powerpc/nofpu/sim-full.c: Add FIXME note about
the need for thread-specific variables preserved across signal
handlers.
* sysdeps/powerpc/nofpu/soft-supp.h: Likewise.
* sysdeps/powerpc/soft-fp/sfp-machine.h: Likewise.
A recently-added test (dlfcn/tststatic5) pointed out that tile was not
properly initializing the variable pagesize in certain cases. This
change just copies the existing code from MIPS.
The sfp-machine.h is based on the gcc version, but extended with
required new macros by comparison with other architectures and by
investigating the hardware support for FP on tile.
This function is now called from dl_open_worker with the GL(dl_load_lock)
lock held and no longer needs local protection. GL(dl_load_lock) also
correctly protects _dl_lookup_symbol_x called here that relies on the
caller to have serialized access to the data structures it uses.
This patch introduces two new convenience functions to set the default
thread attributes used for creating threads. This allows a programmer
to set the default thread attributes just once in a process and then
run pthread_create without additional attributes.
The macros in lowlevellock.h are returning positive errors, but the
users of the macros expect negative. This causes e.g. sem_wait to
sometimes return an error with errno set to -EWOULDBLOCK.
Signed-off-by: Kirk Meyer <kirk.meyer@sencore.com>
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
This turns out to be helpful when doing a from-scratch cross-compile of
gcc and glibc, since you can then do "make install-headers" in glibc
even before you have a functioning tile gcc.
Resolves: #15465
The program name may be unavailable if the user application tampers
with argc and argv[]. Some parts of the dynamic linker caters for
this while others don't, so this patch consolidates the check and
fallback into a single macro and updates all users.
The existing test avoided passing -mcmodel=large if the compiler didn't
support it. However, we need to test not just the compiler support, but
also the toolchain (as and ld) support, so make the test more complete.
In addition, we have to avoid using the hwN_plt() assembly operators if
that support is missing, so guard the uses with #ifdef NO_PLT_PCREL.
This allows us to properly build glibc with the current community
binutils, which doesn't yet have the PC-relative PLT operator support.
The -mcmodel=large support is in gcc 4.8, but the toolchain support
won't be present in the community until binutils 2.24.
[BZ #15442] This adds support for the inverse interpretation of the
quiet bit of IEEE 754 floating-point NaN data that some processors
use. This includes in particular MIPS architecture processors; the
payload used for the canonical qNaN encoding is updated accordingly
so as not to interfere with the quiet bit.
Joseph Myers noted that there were several old and really very
incorrect values in the hppa libm-test-ulps. This patch removes
all of the ulps values for ceil, floor, rint, round, trun,
llrint, and llround, all of which were previously incorreclty
added (including some negative values which are really wrong).
---
ports/
2013-05-15 Carlos O'Donell <carlos@redhat.com>
* sysdeps/hppa/fpu/libm-test-ulps: Remove old values for ceil, floor,
rint, round, trunc, llrint, and llround.
Update libm-test-ulps for hppa. There are a few entries
with 4 or 5 ulps, but these appear to be expected. A more
thorough review will be required if hppa switches long-double
to a different type.
---
ports/
2013-05-15 Carlos O'Donell <carlos@redhat.com>
* sysdeps/hppa/fpu/libm-test-ulps: Regenerate.
The following patch fixes both _FPU_GETCW and
_FPU_SETCW for hppa. The initial implementation was
flawed and not well tested. We failed to set cw,
and passed in the value of a register to fldd.
This patch fixes both of those errors and allows
the libm tests to pass without failure.
Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
---
2013-05-15 Guy Martin <gmsoft@tuxicoman.be>
Carlos O'Donell <carlos@redhat.com>
[BZ# 15000]
* ports/sysdeps/hppa/fpu/fpu_control.h (_FPU_GETCW): Set cw.
(_FPU_SETCW): Pass address to fldd.
2013-05-12 Marcus Shawcroft <marcus.shawcroft@linaro.org>
* sysdeps/unix/sysv/linux/aarch64/clone.S (__clone):
Do not call sycall_error directly with a confitional branch.
* sysdeps/unix/sysv/linux/aarch64/ioctl.S (__ioctl):
Do not call sycall_error directly with a confitional branch.
These macros often set up a variable that later macros sometimes do
not use. Add unused attribute to avoid that.
Similarly, the ia64 code tends to check the err field rather than
the val (which is opposite of most arches) leading to the same
kind of warning. Replace this with a dummy reference.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The current code declares double constants by using a char buffer and
then casting the pointer to a different type. This makes the aliasing
logic unhappy. Change it to use a union instead to avoid that.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Function pointers on ia64 are like parisc -- they're plabels. While
the parisc port enjoys a gcc builtin for extracting the address here,
ia64 has no such luck.
Casting & dereferencing in one go triggers a strict aliasing warning.
Use a union to fix that.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The ia64_rse_is_rnat_slot func expects an unsigned pointer, but we're
passing in a signed pointer. The signness doesn't matter here, so
convert it to unsigned.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The strcpy and strchr (and related) functions are four times faster
than the byte-by-byte default versions.
The strlen function is twice as fast for long strings and 50% faster
for short strings over the armv4 version.
* sysdeps/unix/sysv/linux/bits/mman-linux.h (MAP_ANONYMOUS):
Allow definition via __MAP_ANONYMOUS.
* sysdeps/unix/sysv/linux/mips/bits/mman.h: Remove all defines
provided by bits/mman-linux.h and include <bits/mman-linux.h>.
(__MAP_ANONYMOUS): Define.
Written from scratch rather than copied from GMP, due to LGPL 2.1 vs
GPL 3, but tested with the GMP testsuite.
This is 250% faster than the generic code as measured on Cortex-A15,
and the same speed as GMP on the same core, and probably everywhere.
Written from scratch rather than copied from GMP, due to LGPL 2.1 vs
GPL 3, but tested with the GMP testsuite.
This is 50% faster than the generic code as measured on Cortex-A15.
It is 25% slower than the current GMP routine on the same core.
Written from scratch rather than copied from GMP, due to LGPL 2.1 vs
GPL 3, but tested with the GMP testsuite.
This is 25% faster than the generic code as measured on Cortex-A15,
and the same speed as GMP on the same core. It's probably slower
than GMP on the A8 and A9 cores though.
There was only one user. It's "condition" argument was used
for "ia" rather than an actual condition. The apcs26 syntax
is almost certainly not needed, given current binutils requirements.
For arm this makes no difference--the result is bit-for-bit identical;
for thumb this results in smaller encodings. Perhaps it ought not and
this is in fact an assembler bug, but I also think it's clearer.
The preceeding patches have allowed for the few incompatibilities
between arm and thumb2 mode, or have marked the file as not wanting
to use thumb2 mode.
Factor out the sequence needed to call kuser_get_tls, as we can't
play subtract into pc games in thumb mode. Prepare for hard-tp,
pulling the save of LR into the macro.
There are several places in which we access negative offsets from
the thread-pointer, but thumb2 only supports positive offsets in
memory references.
Avoid duplicating the rather large macros in which these references
are embedded by abstracting out the operation.
Some routines are written with complex LDM/STM insns that cannot be
used in thumb mode, or are highly conditional requiring excessive
IT insns.
When a future patch goes in to enable thumb2 by default, this marker
will be used to override that default.
New defines from gcc 4.8:
#define __ARM_ARCH_ISA_ARM 1
#define __ARM_ARCH_PROFILE 65
#define __ARM_ARCH_ISA_THUMB 2
#define __ARM_ARCH 7
all of which got in the way of the one we wanted:
#define __ARM_ARCH_7A__ 1