With the recent tuning the C version of rwlocks is basically the same
performance as the x86 assembler version for uncontended locks (with a
a few cycles near the run-to-run variability). For others it should not
matter anyways.
So remove the assembler code and use the C version like other
architectures.
This patch relies on the C version of the rwlocks posted earlier.
With C rwlocks it is very straight forward to do adaptive elision
using TSX. It is based on the infrastructure added earlier
for mutexes, but uses its own elision macros. The macros
are fairly general purpose and could be used for other
elision purposes too.
This version is much cleaner than the earlier assembler based
version, and in particular implements adaptation which makes
it safer.
I changed the behavior slightly to not require any changes
in the test suite and fully conform to all expected
behaviors (generally at the cost of not eliding in
various situations). In particular this means the timedlock
variants are not elided. Nested trylock aborts.
The implementation of __get_nprocs uses a stactic variable to cache
the value of the current number of processors. The caching breaks when
'time (NULL) == 0':
$ cat nproc.c
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
int main(int argc, char *argv[])
{
time_t t;
struct timeval tv = {0, 0};
printf("settimeofday({0, 0}, NULL) = %d\n", settimeofday(&tv, NULL));
t = time(NULL);
printf("Time: %d, CPUs: %d\n", (unsigned int)t, get_nprocs());
return 0;
}
$ gcc -O3 nproc.c
$ ./a.out
settimeofday({0, 0}, NULL) = -1
Time: 1401311578, CPUs: 4
$ sudo ./a.out
settimeofday({0, 0}, NULL) = 0
Time: 0, CPUs: 0
The problem is with the condition used to check whether a cached
value should be returned or not:
static int cached_result;
static time_t timestamp;
time_t now = time (NULL);
time_t prev = timestamp;
atomic_read_barrier ();
if (now == prev)
return cached_result;
This patch fixes the problem by ensuring that 'cached_result' has
been set at least once before returning it.
The hppa port has no need of a custom lowlevellock.c, it should
use the generic version which is updated and correct. This
similarly fixes bug 15119 for hppa.
In several cases we've had asm routines rely on syscalls not clobbering
call-clobbered registers, and that's now deemed ABI. So take advantage
of this in the INLINE_SYSCALL path as well.
Shrinks libc.so by about 1k.
This macro was removed by
2005-11-16 Daniel Jacobowitz <dan@codesourcery.com>
but not applied to the (still separate) eabi port so necro'd
when the eabi port superceded the old abi. It was thence
copied into the new AArch64 port.