Commit Graph

71 Commits

Author SHA1 Message Date
Chris Metcalf
563a74d86c tile: fix copyright header blocks in just-committed files
I accidentally committed versions not following the conventions.
2014-10-06 13:47:02 -04:00
Chris Metcalf
d9cd52e64d tile: optimize memcmp
Customize memcmp.c for tile, using similar tricks from memcpy:

- replace MERGE macro with dblalign.
- replace memcmp_bytes function with revbytes.
- use __glibc_likely.
- use post-increment addressing.

The schedule is still not perfect: the compiler is not hoisting
code above the comparison branch, which could save a bundle or two.
memcmp speeds up by 30-40% on shorter aligned tests in benchtest,
with some tests with unaligned lengths taking a small performance hit.
2014-10-06 11:20:59 -04:00
Chris Metcalf
c86f7b80f4 tilegx: provide optimized strnlen, strstr, and strcasestr
strnlen() is based on the existing tile strlen() with length
checking added.  It speeds up by up to 5x, but on average across
the benchtest corpus by around 35%.  No regressions are seen.

strstr() does 8-byte aligned loads and compares using a 2-byte
filter on the first two bytes of the needle and then testing
the remaining bytes in needle using memcmp().  It speeds up
about 5x in the best case (for "found" needles), about 2x looking
at benchtest as a whole, with some slowdowns as much as 45%.
on a few cases (including the "fail" case for 128KB search).

strcasestr() is based on strstr() but uses a SIMD tolower
routine to convert 8-bytes to lower case in 5 instructions.
It also uses a 2-byte filter and then strncasecmp() for the
remaining bytes.  strncasecmp() is not optimized for SIMD, so
there is futher room for improvement.  However, it is still up
to 16x faster for "found" needles, averaging 2x faster on the
whole corpus of benchtests.  It does slow down by up to 35%
on a few cases, similarly to strstr().
2014-10-06 11:19:18 -04:00
Chris Metcalf
1c4c1a6f4d tilegx: optimize string copy_byte() internal function
We can use one "shufflebytes" instruction instead of 3 "bfins"
instructions to optimize the string functions.
2014-10-06 11:18:41 -04:00
Chris Metcalf
8622092d58 [BZ #17354] tile: Fix up corner cases with signed relocations
Some types of relocations technically need to be signed rather than
unsigned: in particular ones that are used with moveli or movei,
or for jump and branch.  This is almost never a problem.  Jump and
branch opcodes are pretty much uniformly resolved by the static linker
(unless you omit -fpic for a shared library, which is not recommended).
The moveli and movei opcodes that need to be sign-extended generally
are for positive displacements, like the construction of the address of
main() from _start().  However, tst-pie1 ends up with main below _start
(in a different module) and the test failed due to signedness issues in
relocation handling.

This commit treats the value as signed when shifting (to preserve the
high bit) and also sign-extends the value generated from the updated
bundle when comparing with the desired bundle, which we do to make sure
no overflow occurred.  As a result, the tst-pie1 test now passes.
2014-09-06 12:24:03 -04:00
Joseph Myers
29c4f53e2a Move architecture shlib-versions files to Linux-specific directories.
Various architectures have files such as sysdeps/<arch>/shlib-versions
whose contents are in fact entirely Linux-specific, relating only to
the symbol / shared library versions for the port to Linux on that
architecture, when any future port to a different OS on that
architecture would use the symbol version of the glibc release it goes
in, as standard for new ports.

This patch moves such files under sysdeps/unix/sysv/linux/, merging in
the contents of sysdeps/<arch>/nptl/shlib-versions in the process.
The only bits not moved are those relating to libgcc_s versions, which
don't appear OS-specific in the same way that glibc's symbol versions
so.  It deliberately does not change the regular expressions given for
matching configurations in each file; some match only Linux although
not Linux-specific, or match other OSes although Linux-specific.  It
is with a view to at least the following further cleanups:

* Move architecture-specific content from the toplevel shlib-versions
  and nptl/shlib-versions into sysdeps shlib-versions files, so
  eliminating another difference between ex-ports and non-ex-ports
  architectures.

* Likewise, for OS-specific content in shlib-versions files.

* At that point, the first field in shlib-versions files (the regular
  expression matching a configuration triplet) should be redundant, so
  eliminate that field and leave shlib-versions selection working
  purely on a sysdeps basis (with limited use of %ifdef in
  shlib-versions files when needed) rather than having its own
  separate mechanism to select what configuration information is
  relevant.

* Move the build of gnu/lib-names.h to a similar mechanism to that
  used for gnu/stubs.h (each library build installing a version of the
  header specifically for that build), so we can eliminate the
  duplication of soname information in the makefiles and get it purely
  from shlib-versions files again.

There may be other cleanups possible as well (in particular, I'm not
sure that all cases where the same "Earliest symbol set" information
is repeated for many different libraries actually should need to
repeat it rather than specifying it just once for DEFAULT for the
given configuration, and separately specifying any non-default choices
of soname).

Tested x86_64 that the installed shared libraries are unchanged by
this patch.

	* sysdeps/aarch64/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/aarch64/shlib-versions: ... here.
	* sysdeps/alpha/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/alpha/shlib-versions: ... here.
	* sysdeps/arm/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/arm/shlib-versions: ... here.
	* sysdeps/hppa/shlib-versions: Move all contents except for
	libgcc_s entry to ...
	* sysdeps/unix/sysv/linux/hppa/shlib-versions: ... here.  Merge in
	entry from ...
	* sysdeps/hppa/nptl/shlib-versions: ... here.  Remove file.
	* sysdeps/ia64/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/ia64/shlib-versions: ... here.  Merge in
	entry from ...
	* sysdeps/ia64/nptl/shlib-versions: ... here.  Remove file.
	* sysdeps/m68k/coldfire/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/m68k/coldfire/shlib-versions: ... here.
	* sysdeps/microblaze/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/microblaze/shlib-versions: ... here.
	* sysdeps/mips/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/mips/shlib-versions: ... here.  Merge in
	entry from ...
	* sysdeps/mips/nptl/shlib-versions: ... here.  Remove file.
	* sysdeps/tile/shlib-versions: Move to ...
	* sysdeps/unix/sysv/linux/tile/shlib-versions: ... here.
	* sysdeps/unix/sysv/linux/x86_64/64/shlib-versions: Merge in entry
	from ...
	* sysdeps/x86_64/64/shlib-versions: ... here.  Remove file.
	* sysdeps/unix/sysv/linux/x86_64/x32/shlib-versions: Merge in
	entry from ...
	* sysdeps/x86_64/x32/shlib-versions: ... here.  Remove file.
2014-07-17 14:31:12 +00:00
Siddhesh Poyarekar
64df73c2ea Fix Wundef warning for MEMCPY_OK_FOR_FWD_MEMMOVE
Define MEMCPY_OK_FOR_FWD_MEMMOVE in memcopy.h and let arch-specific
implementations of that file override the value if necessary.  This
override is only useful for tile and moving this macro to memcopy.h
allows us to remove the tile-specific memmove.c.
2014-06-28 06:05:24 +05:30
Joseph Myers
cb403c34c6 Remove relro configure test.
This patch removes the configure test for working -z relro.

The use of -z relro in Makeconfig became unconditional with

commit 2e6ab1df44c412bb9d30b26a4d8a679150a7e375
Author: Ulrich Drepper <drepper@redhat.com>
Date:   Sat Oct 28 06:44:04 2006 +0000

    Remove conditional code which now is unnecessary.

(commit reference from git://repo.or.cz/glibc/history), so since then
the configure test has not controlled anything about how glibc is
built - simply about whether configure succeeds and allows a build to
be attempted.  The test for whether the option did something useful
(as opposed to whether it exists - which we can certainly just assume
by now) was originally added in
<https://sourceware.org/ml/libc-hacker/2004-09/msg00069.html> to
disable the option in a case when it did nothing useful on ia64 (as a
result of something deliberate in the linker on ia64).  Since 2006
that disabling has been of no effect, and given that the current test
does not set libc_relro_required for ia64, it does nothing whatever
useful for the original motivating case.  Also at around the same time
in 2006 the test was made to give an error for missing or broken -z
relro support on various architectures.

So effectively all the test does now is verify that, on certain
architectures, the linker has not been changed deliberately to make
the option ineffective.  I see no apparent reason why such a change
should be expected, or why the build should be stopped if it were to
be made (any more than we disallow build on ia64); I think we can
trust binutils patch review to point out the consequences of any
change to COMMONPAGESIZE setting.  The only thing that might now make
sense would be disabling the -z relro use on an architecture-specific
basis if there were an architecture-specific reason to consider that
to make sense; it would be for the ia64 maintainer to decide if that
makes sense for ia64 at present, but I think that could be done
through sysdeps Makefiles - no special configure tests needed.

Tested for x86_64 that this patch makes no change to the installed
shared libraries.

Together with
<https://sourceware.org/ml/libc-alpha/2014-06/msg00788.html> (pending
review) this substantially eliminates architecture-specific cases from
architecture-independent configure.ac files.  There remains an i386
case in sysdeps/mach/hurd/configure.ac that should properly move to
the i386 subdirectory.  (There are also OS-specific cases outside
OS-specific directories; in principle I think should should also
move.)

	* configure.ac (libc_commonpagesize): Remove variable.
	(libc_relro_required): Likewise.
	(libc_cv_z_relro): Remove configure test.
	* configure: Regenerated.
	* sysdeps/aarch64/preconfigure (libc_commonpagesize): Do not set
	variable.
	(libc_relro_required): Likewise.
	* sysdeps/alpha/preconfigure (libc_commonpagesize): Likewise.
	(libc_relro_required): Likewise.
	* sysdeps/arm/preconfigure.ac (libc_commonpagesize): Likewise.
	(libc_relro_required): Likewise.
	* sysdeps/arm/preconfigure: Regenerated.
	* sysdeps/ia64/preconfigure: Remove file.
	* sysdeps/tile/preconfigure (libc_commonpagesize): Do not set
	variable.
	(libc_relro_required): Likewise.
2014-06-27 16:51:22 +00:00
Siddhesh Poyarekar
4cf5b6d0d7 Fix Wundef warning for ELF_MACHINE_NO_RELA
This patch defines ELF_MACHINE_NO_RELA on all architectures.  Tested
only on x86_64 to verify that the sources before and after are
identical except for two instructions that pass the current line
number in dl-machine.h to assert_fail.
2014-06-26 22:30:40 +05:30
Andi Kleen
8491ed6d70 Add adaptive elision to rwlocks
This patch relies on the C version of the rwlocks posted earlier.
With C rwlocks it is very straight forward to do adaptive elision
using TSX. It is based on the infrastructure added earlier
for mutexes, but uses its own elision macros. The macros
are fairly general purpose and could be used for other
elision purposes too.

This version is much cleaner than the earlier assembler based
version, and in particular implements adaptation which makes
it safer.

I changed the behavior slightly to not require any changes
in the test suite and fully conform to all expected
behaviors (generally at the cost of not eliding in
various situations). In particular this means the timedlock
variants are not elided.  Nested trylock aborts.
2014-06-13 13:15:28 -07:00
Roland McGrath
c9cab3d2f9 Tile: Define TLS_DEFINE_INIT_TP 2014-06-11 12:25:27 -07:00
Chris Metcalf
2d0fc4dcfc tile: move sysdeps/unix/sysv/linux/tile nptl files. 2014-06-10 14:10:17 -04:00
Andreas Schwab
774f928582 Remove second argument from TLS_INIT_TP macro 2014-05-27 14:48:46 +02:00
Roland McGrath
e0db65176f Clean up __exit_thread. 2014-05-13 09:49:20 -07:00
Chris Metcalf
d42f3448a7 tile: Fix cut-and-paste bug in commit fcccd5128. 2014-04-04 15:03:47 -04:00
Roland McGrath
fcccd51286 Factor mmap/munmap of PT_LOAD segments out of _dl_map_object_from_fd et al. 2014-04-03 10:47:14 -07:00
Siddhesh Poyarekar
fdf4534d02 Fix -Wundef warnins for __FP_FAST_FMA*
The macros are defined by the compiler, so we can only verify whether
they are defined or not.
2014-03-21 17:28:43 +05:30
Roland McGrath
498a22333b Compile with -Wundef. 2014-03-14 11:32:51 -07:00
Joseph Myers
e6b6a85705 Don't include individual test ulps in libm-test-ulps.
As recently discussed
<https://sourceware.org/ml/libc-alpha/2014-02/msg00670.html>, it
doesn't seem particularly useful for libm-test-ulps files to contain
huge amounts of data on ulps for individual tests; just the global
maximum observed ulps for each function, together with the
verification of exceptions, errno and special results such as
infinities and NaNs for each test, suffices to verify that a
function's behavior on the given test inputs is within the expected
accuracy.  Removing this data reduces source tree churn caused by
updates to these files when libm tests are added, and reduces the
frequency with which testsuite additions actually need libm-test-ulps
changes at all.

Accordingly, this patch removes that data, so that individual tests
get checked against the global bounds for the given function and only
generate an error if those are exceeded.  Tested x86_64 (including
verifying that if an ulps value is artificially reduced, the tests do
indeed fail as they should and "make regen-ulps" generates the
expected changes).

	* math/libm-test.inc (struct ulp_data): Don't refer to ulps for
	individual tests in comment.
	(libm-test-ulps.h): Don't refer to test_ulps in #include comment.
	(prev_max_error): New variable.
	(prev_real_max_error): Likewise.
	(prev_imag_max_error): Likewise.
	(compare_ulp_data): Don't refer to test names in comment.
	(find_test_ulps): Remove function.
	(find_function_ulps): Likewise.
	(find_complex_function_ulps): Likewise.
	(init_max_error): Take function name as argument.  Look up ulps
	for that function.
	(print_ulps): Remove function.
	(print_max_error): Use prev_max_error instead of calling
	find_function_ulps.
	(print_complex_max_error): Use prev_real_max_error and
	prev_imag_max_error instead of calling find_complex_function_ulps.
	(check_float_internal): Take max_ulp parameter instead of calling
	find_test_ulps.  Don't call print_ulps.
	(check_float): Update call to check_float_internal.
	(check_complex): Update calls to check_float_internal.
	(START): Pass argument to init_max_error.
	* math/gen-libm-test.pl (%results): Don't include "kind"
	information.
	(parse_ulps): Don't handle ulps of individual tests.
	(print_ulps_file): Likewise.
	(output_ulps): Likewise.
	* math/README.libm-test: Update.
	* manual/libm-err-tab.pl (parse_ulps): Don't handle ulps of
	individual tests.
	* sysdeps/aarch64/libm-test-ulps: Remove individual test ulps.
	* sysdeps/alpha/fpu/libm-test-ulps: Likewise.
	* sysdeps/arm/libm-test-ulps: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Likewise.
	* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
	* sysdeps/m68k/coldfire/fpu/libm-test-ulps: Likewise.
	* sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise.
	* sysdeps/microblaze/libm-test-ulps: Likewise.
	* sysdeps/mips/mips32/libm-test-ulps: Likewise.
	* sysdeps/mips/mips64/libm-test-ulps: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
	* sysdeps/powerpc/nofpu/libm-test-ulps: Likewise.
	* sysdeps/s390/fpu/libm-test-ulps: Likewise.
	* sysdeps/sh/libm-test-ulps: Likewise.
	* sysdeps/sparc/fpu/libm-test-ulps: Likewise.
	* sysdeps/tile/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.

	* sysdeps/hppa/fpu/libm-test-ulps: Remove individual test ulps.
2014-03-05 15:02:38 +00:00
Joseph Myers
ace614b8a5 soft-fp: support after-rounding tininess detection.
IEEE 754-2008 defines two ways in which tiny results can be detected,
"before rounding" (based on the infinite-precision result) and "after
rounding" (based on the result when rounded to normal precision as if
the exponent range were unbounded).  All binary operations on an
architecture must use the same choice of how tininess is detected.

soft-fp has so far implemented only before-rounding tininess
detection.  This patch adds support for after-rounding tininess
detection.  A new macro _FP_TININESS_AFTER_ROUNDING is added that
sfp-machine.h must define (soft-fp is meant to be self-contained so
the existing tininess.h files aren't used here, though the information
going in sfp-machine.h has been taken from them).  The soft-fp macros
dealing with raising underflow exceptions then handle the cases where
the choice matters specially, rounding a copy of the input to the
appropriate precision to see if a value that's tiny before rounding
isn't tiny after rounding.

Tested for mips64 using GCC trunk (which now uses soft-fp on MIPS, so
supporting exceptions and rounding modes for long double where not
previously supported - this is the immediate motivation for doing this
patch now) together with (a) a patch to sysdeps/mips/math-tests.h to
enable exceptions / rounding modes tests for long double for GCC 4.9
and later, and (b) corresponding changes applied to libgcc's soft-fp
and sfp-machine.h files.  In the libgcc context this is also tested on
x86_64 (also an after-rounding architecture) with testcases for
__float128 that I intend to add to the GCC testsuite when updating
soft-fp there.

(To be clear: this patch does not fix any glibc bugs that were
user-visible in past releases, since after-rounding architectures
didn't use soft-fp in any affected case with support for
floating-point exceptions - so there is no corresponding Bugzilla bug.
Rather, it works together with the GCC changes to use soft-fp on MIPS
to allow previously absent long double functionality to work properly,
and allows soft-fp to be used in glibc on after-rounding architectures
in cases where it couldn't previously be used.)

	* soft-fp/op-common.h (_FP_DECL): Mark exponent as possibly
	unused.
	(_FP_PACK_SEMIRAW): Determine tininess based on rounding shifted
	value if _FP_TININESS_AFTER_ROUNDING and unrounded value is in
	subnormal range.
	(_FP_PACK_CANONICAL): Determine tininess based on rounding to
	normal precision if _FP_TININESS_AFTER_ROUNDING and unrounded
	value has largest subnormal exponent.
	* soft-fp/soft-fp.h [FP_NO_EXCEPTIONS]
	(_FP_TININESS_AFTER_ROUNDING): Undefine and redefine to 0.
	* sysdeps/aarch64/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): New macro.
	* sysdeps/alpha/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): Likewise.
	* sysdeps/arm/soft-fp/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
	Likewise.
	* sysdeps/mips/mips64/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): Likewise.
	* sysdeps/mips/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): Likewise.
	* sysdeps/powerpc/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): Likewise.
	* sysdeps/sh/soft-fp/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
	Likewise.
	* sysdeps/sparc/sparc32/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): Likewise.
	* sysdeps/sparc/sparc64/soft-fp/sfp-machine.h
	(_FP_TININESS_AFTER_ROUNDING): Likewise.
	* sysdeps/tile/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
	Likewise.
2014-02-12 18:27:12 +00:00
Chris Metcalf
4372980f58 Move tilegx, tilepro, and linux-generic from ports to libc.
I've moved the TILE-Gx and TILEPro ports to the main sysdeps hierarchy,
along with the linux-generic ports infrastructure.  Beyond the README
update, the move was just

    git mv ports/sysdeps/tile sysdeps/tile
    git mv ports/sysdeps/unix/sysv/linux/tile \
      sysdeps/unix/sysv/linux/tile
    git mv ports/sysdeps/unix/sysv/linux/generic \
      sysdeps/unix/sysv/linux/generic

I updated the relevant ChangeLogs along the lines of the ARM move
in commit c6bfe5c4d7 and tested the 64-bit tilegx build to confirm that
there were no changes in "objdump -dr" output in the shared objects.
2014-02-10 11:04:39 -05:00