Add the missing fallback file for elide.h to fix non x86 builds.
Sorry about that. This is just a noop macro file that makes
all elision code to be optimized out.
With the recent tuning the C version of rwlocks is basically the same
performance as the x86 assembler version for uncontended locks (with a
a few cycles near the run-to-run variability). For others it should not
matter anyways.
So remove the assembler code and use the C version like other
architectures.
This patch relies on the C version of the rwlocks posted earlier.
With C rwlocks it is very straight forward to do adaptive elision
using TSX. It is based on the infrastructure added earlier
for mutexes, but uses its own elision macros. The macros
are fairly general purpose and could be used for other
elision purposes too.
This version is much cleaner than the earlier assembler based
version, and in particular implements adaptation which makes
it safer.
I changed the behavior slightly to not require any changes
in the test suite and fully conform to all expected
behaviors (generally at the cost of not eliding in
various situations). In particular this means the timedlock
variants are not elided. Nested trylock aborts.
One difference of the C versions to the assembler wr/rdlock
is that the C compiler saves some registers which are unnecessary
for the fast path in the prologue of the functions. Split the
uncontended fast path out into a separate function. Only when contention is
detected is the full featured function called. This makes
the fast path code (nearly) identical to the assembler version,
and gives uncontended performance within a few cycles.
v2: Rename some functions and add space.
The implementation of __get_nprocs uses a stactic variable to cache
the value of the current number of processors. The caching breaks when
'time (NULL) == 0':
$ cat nproc.c
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
int main(int argc, char *argv[])
{
time_t t;
struct timeval tv = {0, 0};
printf("settimeofday({0, 0}, NULL) = %d\n", settimeofday(&tv, NULL));
t = time(NULL);
printf("Time: %d, CPUs: %d\n", (unsigned int)t, get_nprocs());
return 0;
}
$ gcc -O3 nproc.c
$ ./a.out
settimeofday({0, 0}, NULL) = -1
Time: 1401311578, CPUs: 4
$ sudo ./a.out
settimeofday({0, 0}, NULL) = 0
Time: 0, CPUs: 0
The problem is with the condition used to check whether a cached
value should be returned or not:
static int cached_result;
static time_t timestamp;
time_t now = time (NULL);
time_t prev = timestamp;
atomic_read_barrier ();
if (now == prev)
return cached_result;
This patch fixes the problem by ensuring that 'cached_result' has
been set at least once before returning it.
Continuing the series of patches to clean up conformtest expectations
for "POSIX" (1995/6) based on review of the expectations against the
standard, this patch cleans up expectations for sys/mman.h, sys/stat.h
and sys/types.h. Tested x86_64; no new XFAILs needed.
* conform/data/sys/mman.h-data [POSIX] (size_t): Do not require
type.
[POSIX] (off_t): Likewise.
* conform/data/sys/stat.h-data (S_IRGRP): Require constant.
[POSIX] (S_ISBLK): Require macro.
[POSIX] (S_ISCHR): Likewise.
[POSIX] (S_ISDIR): Likewise.
[POSIX] (S_ISFIFO): Likewise.
[POSIX] (S_ISREG): Likewise.
[POSIX || XPG3 || XPG4 || UNIX98] (S_TYPEISTMO): Do not list
optional-macro.
* conform/data/sys/types.h-data [POSIX] (blkcnt_t): Do not require
type.
[POSIX] (time_t): Likewise.
[POSIX] (timer_t): Likewise.
POSIX requires that we make a copy, so we allocate a new string
and free it in posix_spawn_file_actions_destroy.
Reported by David Reid, Alex Gaynor, and Glyph Lefkowitz. This bug
may have security implications.
Optimization is achieved on 8 byte aligned strings with double word
comparison using cmpb instruction. On unaligned strings loop unrolling
is applied for Power7 gain.
As with other issues of this kind, bug 17042 is log2 (1) wrongly
returning -0 instead of +0 in round-downward mode because of
implementations effectively in terms of log1p (x - 1). This patch
fixes the issue in the same way used for log and log10.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 to confirm a fix was needed for ldbl-128 and to validate that
fix (also applied to ldbl-128ibm since that version of log2l is
essentially the same as the ldbl-128 one).
[BZ #17042]
* sysdeps/i386/fpu/e_log2.S (__ieee754_log2): Take absolete value
when x - 1 is zero.
* sysdeps/i386/fpu/e_log2f.S (__ieee754_log2f): Likewise.
* sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Likewise.
* sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Return
0.0L for an argument of 1.0L.
* sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l):
Likewise.
* sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Take absolute
value when x - 1 is zero.
* math/libm-test.inc (log2_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The hppa port has no need of a custom lowlevellock.c, it should
use the generic version which is updated and correct. This
similarly fixes bug 15119 for hppa.