glibc/sysdeps/alpha/memchr.S
Roland McGrath 847242451c
Wed May 29 00:57:37 1996  David Mosberger-Tang  <davidm@azstarnet.com>

	* time/Makefile (tests): Add test-tz.

	* time/test-tz.c: New test.

	* time/clocktest.c: Rewrite to test more meaningfully.

	* sysdeps/unix/sysv/linux/syscalls.list: Add bdflush,
 	create_module, delete_module, get_kernel_syms, init_module,
 	klogctl.

	* sysdeps/unix/sysv/linux/sys/param.h (MAXSYMLINKS): Define as 5
	instead of SYMLOOP_MAX, which is nowhere to be found.

	* sysdeps/unix/sysv/linux/sys/msq_buf.h,
 	sysdeps/unix/sysv/linux/sys/sem_buf.h,
 	sysdeps/unix/sysv/linux/sys/shm_buf.h [__USE_MISC]: Add more
 	control ops and datastructures.

	* sysdeps/unix/sysv/linux/sys/io.h: New file declaring low-level
 	I/O related functions.

	* sysdeps/unix/sysv/linux/sys/kdaemon.h: New file declaring kernel
	daemon related functions/operations.

	* sysdeps/unix/sysv/linux/sys/klog.h: New file declaring kernel
	logging related functions/operations.

	* sysdeps/unix/sysv/linux/sys/module.h: New file declaring kernel
	module related functions/operations.

	* sysdeps/unix/sysv/linux/speed.c: Only do "mention this twice" hack
	for non-Alpha based Linux systems.

	* sysdeps/unix/sysv/linux/alpha/speed.c: Remove.

	* sysdeps/unix/sysv/linux/Makefile (headers): Add sys/module.h,
	sys/io.h, sys/klog.h, and sys/kdaemon.h.

	* sysdeps/unix/sysdep.h (END): Define empty END macro for
 	platforms that don't need some sort of end directive at the
	end of functions.

	* sysdeps/unix/make-syscalls.sh: Emit END($strong) at end of
 	syscall wrapper to allow correct generation of debugging
 	information.

	* sysdeps/unix/alpha/sysdep.h (END): Redefine to use .end
 	directive for both ELF and ECOFF.
	(ret): Delete macro.  It was a dangerous macro and unnecessary
 	since the Alpha assemblers recognize "ret" as a macro themselves.

	* sysdeps/gnu/utmpbits.h (struct utmp): Move ut_tv behind
 	ut_session to guarantee long alignment.  This is important for
 	Linux/Alpha since ut_tv.tv_sec is 32 bits and time_t is 64 bits.
  	This will all get cleaned up as programs start to use ut_tv
 	instead of ut_time.

	* sysdeps/alpha/divrem.h: Include <sysdep.h> instead of <*/regdef.h>.

	* sysdeps/alpha/bsd-_setjmp.S (setjmp): Renamed entry point to
	_setjmp.

	* sysdeps/alpha/_mcount.S, sysdeps/alpha/bb_init_func.S,
 	sysdeps/alpha/bsd-_setjmp.S, sysdeps/alpha/bsd-setjmp.S,
 	sysdeps/alpha/copysign.S, sysdeps/alpha/divrem.h,
 	sysdeps/alpha/fabs.S, sysdeps/alpha/ffs.S, sysdeps/alpha/htonl.S,
 	sysdeps/alpha/htons.S, sysdeps/alpha/memchr.S,
 	sysdeps/alpha/setjmp.S, sysdeps/alpha/strlen.S,
 	sysdeps/unix/sysv/linux/alpha/ieee_get_fp_control.S,
 	sysdeps/unix/sysv/linux/alpha/ieee_set_fp_control.S,
 	sysdeps/unix/sysv/linux/alpha/llseek.S,
 	sysdeps/unix/sysv/linux/alpha/pipe.S,
 	sysdeps/unix/sysv/linux/alpha/sigsuspend.S,
 	sysdeps/unix/sysv/linux/alpha/sysdep.S: Use END macro instead of
 	.end directive.

	* csu/initfini.c (_fini): Tell gcc that _fini is not a leaf
 	function by having it contain a dummy function call.

	* configure.in (config_machine): Don't make ELF the default for
 	Linux/Alpha just yet (use --with-elf instead).
	(.init/.fini check): Generate .text to ensure function start and
 	end are in same section.

	* sysdeps/unix/bsd/osf/alpha/brk.S,
 	sysdeps/unix/sysv/linux/alpha/brk.S (__curbrk): Store the entire
 	break value, not just the low 32 bits to accommodate large
 	memories.

Tue May 28 10:46:04 1996  Richard Henderson  <rth@tamu.edu>

	* sysdeps/unix/sysv/linux/alpha/brk.S: Rather than attempt to
	dynamically resolve _end for initializing __curbrk, support the
	brk(0) query idiom.

	* sysdeps/alpha/bb_init_func.S: Don't make `init' an external symbol.

	* sysdeps/alpha/bsd-_setjmp.S: The function is _setjmp not setjmp.

Sun May 26 22:17:38 1996  Richard Henderson  <rth@tamu.edu>

	* stdlib/lcong48_r.c, stdlib/seed48_r.c, stdlib/strtod.c,
	stdlib/strtol.c: Include <string.h> for mem* and str* fns used.

Thu May 23 02:15:56 1996  David Mosberger-Tang  <davidm@azstarnet.com>

	* sysdeps/unix/sysv/linux/Makefile (headers): Add sys/io.h,
 	sys/klog.h, and sys/kdaemon.h.

	* sysdeps/unix/sysv/linux/sys/io.h: New file.
	* sysdeps/unix/sysv/linux/sys/klog.h: Ditto.
	* sysdeps/unix/sysv/linux/sys/kdaemon.h: Ditto.

	* sysdeps/unix/alpha/sysdep.h (ret): Remove macro.  It is
 	dangerous and unnecessary since both OSF/1 as and gas define "ret"
 	as a pseudo-instruction.

Sat Jun  1 17:18:21 1996  Roland McGrath  <roland@delasyd.gnu.ai.mit.edu>

	* time/tzset.c (__tzset): Clear tz_rules name pointers after freeing
	them.  Bug found by David Mosberger-Tang.

	* sysdeps/posix/tempname.c (__stdio_gen_tempname): Use __ptr_t instead
	of PTR.

	* extra-lib.mk (extra-objs): Use patsubst instead of $(A:=B) syntax
	to work around Make bug when A contains var ref.

Fri May 31 18:27:52 1996  Roland McGrath  <roland@delasyd.gnu.ai.mit.edu>

	* string/string.h [__USE_MISC]: Declare basename; OSF/1 puts it here.

	* sysdeps/unix/sysv/linux/syscalls.list (getpgid, setpgid): Define __
	strong names and [gs]etpgid as weak aliases.

	* math/math_private.h (GET_LDOUBLE_EXP): Add missing backslash.
1996-06-02 18:50:07 +00:00


/* Copyright (C) 1996 Free Software Foundation, Inc.
Contributed by David Mosberger (davidm@cs.arizona.edu).
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with the GNU C Library; see the file COPYING.LIB. If
not, write to the Free Software Foundation, Inc., 675 Mass Ave,
Cambridge, MA 02139, USA. */
/* Finds characters in a memory area.  Optimized for the Alpha
   architecture:

      - memory accessed as aligned quadwords only
      - uses cmpbge to compare 8 bytes in parallel
      - does binary search to find 0 byte in last
        quadword (HAKMEM needed 12 instructions to
        do this instead of the 9 instructions that
        binary search needs).

   For correctness consider that:

      - only minimum number of quadwords may be accessed
      - the third argument is an unsigned long
*/
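
The byte-parallel compare described in the header comment can be modeled in portable C. The following is a minimal sketch and not part of glibc: `cmpbge_zero`, `broadcast_byte`, and `match_mask` are hypothetical helper names, and the loop in `cmpbge_zero` only imitates what the Alpha `cmpbge zero, x` instruction computes in one step (bit i of the result is set when byte i of x is zero). The sketch assumes little-endian lane numbering, as on Alpha.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "cmpbge zero, x": bit i is set when 0 >= byte i of x
   (unsigned), which holds only when that byte is zero. */
unsigned cmpbge_zero(uint64_t x)
{
    unsigned mask = 0;
    for (int i = 0; i < 8; i++)
        if (((x >> (8 * i)) & 0xff) == 0)
            mask |= 1u << i;
    return mask;
}

/* Replicate the search byte into all eight lanes, mirroring the
   sll/or sequence that builds chchchchchchchch in a1. */
uint64_t broadcast_byte(unsigned char c)
{
    uint64_t v = c;
    v |= v << 8;
    v |= v << 16;
    v |= v << 32;
    return v;
}

/* XOR turns every matching byte into zero, so the zero-byte mask of
   the XOR result is the match mask for the quadword. */
unsigned match_mask(uint64_t quad, unsigned char c)
{
    return cmpbge_zero(quad ^ broadcast_byte(c));
}
```

As a usage example, a quadword holding the bytes 'a', 'b', 'c', 'd', 'c', 0, 0, 0 in memory order is `0x0000006364636261` on a little-endian machine, and `assert(match_mask(0x0000006364636261ULL, 'c') == 0x14)` holds because lanes 2 and 4 match.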
#include <sysdep.h>
	.set noreorder
	.set noat

ENTRY(memchr)
	.prologue 0

	beq	a2, not_found
	ldq_u	t0, 0(a0)	# load first quadword (a0 may be misaligned)
	addq	a0, a2, t4
	and	a1, 0xff, a1	# a1 = 00000000000000ch
	ldq_u	t5, -1(t4)
	sll	a1, 8, t1	# t1 = 000000000000ch00
	cmpult	a2, 9, t3
	or	t1, a1, a1	# a1 = 000000000000chch
	sll	a1, 16, t1	# t1 = 00000000chch0000
	lda	t2, -1(zero)
	or	t1, a1, a1	# a1 = 00000000chchchch
	sll	a1, 32, t1	# t1 = chchchch00000000
	extql	t0, a0, t6
	or	t1, a1, a1	# a1 = chchchchchchchch

	beq	t3, first_quad

	extqh	t5, a0, t5
	mov	a0, v0
	or	t6, t5, t0	# t0 = quadword starting at a0
	#
	# Deal with the case where at most 8 bytes remain to be searched
	# in t0.  E.g.:
	#	a2 = 6
	#	t0 = ????c6c5c4c3c2c1
last_quad:
	negq	a2, t5
	srl	t2, t5, t5	# t5 = mask of a2 bits set
	xor	a1, t0, t0
	cmpbge	zero, t0, t1
	and	t1, t5, t1
	beq	t1, not_found

found_it:
	# now, determine which byte matched:
	negq	t1, t2
	and	t1, t2, t1

	and	t1, 0x0f, t0
	addq	v0, 4, t2
	cmoveq	t0, t2, v0

	and	t1, 0x33, t0
	addq	v0, 2, t2
	cmoveq	t0, t2, v0

	and	t1, 0x55, t0
	addq	v0, 1, t2
	cmoveq	t0, t2, v0

done:	ret
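
The `last_quad` and `found_it` sequences above rely on two bit tricks: a shift by the negated count to build a mask of the valid byte lanes, and a three-step binary search on the isolated lowest set bit. A rough C model follows. The names `tail_mask` and `first_match_index` are hypothetical, the code is an illustrative sketch and not the glibc source, and it assumes the 8-bit mask layout produced by `cmpbge` (bit i corresponds to byte i).

```c
#include <assert.h>
#include <stdint.h>

/* Model of "negq a2, t5; srl t2, t5, t5" with t2 = -1: the shift
   count is taken modulo 64, so shifting all-ones right by (64 - n)
   leaves the low n bits set.  Valid for 1 <= n <= 8 here. */
unsigned tail_mask(unsigned n)
{
    assert(n >= 1 && n <= 8);
    return (unsigned)(~(uint64_t)0 >> ((0 - (uint64_t)n) & 63));
}

/* Model of found_it: isolate the lowest set bit, then locate it
   with three mask tests instead of a bit-by-bit scan. */
unsigned first_match_index(unsigned mask)
{
    assert(mask != 0 && mask <= 0xff);
    unsigned low = mask & -mask;        /* negq/and: lowest set bit */
    unsigned idx = 0;
    if ((low & 0x0f) == 0) idx += 4;    /* bit lies in lanes 4..7 */
    if ((low & 0x33) == 0) idx += 2;    /* bit lies in lanes 2,3,6,7 */
    if ((low & 0x55) == 0) idx += 1;    /* bit lies in an odd lane */
    return idx;
}
```

In the assembly the three conditional additions are applied directly to `v0` with `cmoveq`, so the address of the matching byte is produced without any branches.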
	#
	# Deal with the case where a2 > 8 bytes remain to be
	# searched.  a0 may not be aligned.
	#
first_quad:
	andnot	a0, 0x7, v0
	insqh	t2, a0, t1	# t1 = 0000ffffffffffff (a0<0:2> ff bytes)
	xor	t0, a1, t0
	or	t0, t1, t0	# t0 = ====ffffffffffff
	cmpbge	zero, t0, t1
	bne	t1, found_it

	/* at least one byte left to process */

	ldq	t0, 8(v0)
	addq	v0, 8, v0
	/*
	 * Make a2 point to last quad to be accessed (the
	 * last quad may or may not be partial).
	 */
	subq	t4, 1, a2
	andnot	a2, 0x7, a2
	cmpult	v0, a2, t1
	beq	t1, final

	/* at least two quads remain to be accessed */

	subq	a2, v0, t3	# t3 <- number of bytes in the quads to be processed in loop
	and	t3, 8, t3	# odd number of quads?
	bne	t3, odd_quad_count
	/* at least three quads remain to be accessed */

	mov	t0, t3		# move prefetched value into correct register

	.align	3
unrolled_loop:
	ldq	t0, 8(v0)	# prefetch t0
	xor	a1, t3, t1
	cmpbge	zero, t1, t1
	bne	t1, found_it

	addq	v0, 8, v0
odd_quad_count:
	xor	a1, t0, t1
	ldq	t3, 8(v0)	# prefetch t3
	cmpbge	zero, t1, t1
	bne	t1, found_it

	addq	v0, 8, v0

	cmpult	v0, a2, t5
	bne	t5, unrolled_loop

	mov	t3, t0		# move prefetched value into t0
final:	subq	t4, v0, a2	# a2 <- number of bytes left to do
	bne	a2, last_quad

not_found:
	mov	zero, v0
	ret

	END(memchr)
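
For readers who want to check the overall control flow, the routine can be approximated by a portable C reference model. This is a simplified sketch under stated assumptions, not the glibc implementation: `memchr_model` and `zero_byte_mask` are hypothetical names, the model copies up to 8 bytes at a time with `memcpy` instead of issuing aligned quadword loads (so it never reads outside the buffer), and it omits the two-way loop unrolling and prefetching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit i is set when byte i of x is zero (models "cmpbge zero, x"). */
static unsigned zero_byte_mask(uint64_t x)
{
    unsigned mask = 0;
    for (int i = 0; i < 8; i++)
        if (((x >> (8 * i)) & 0xff) == 0)
            mask |= 1u << i;
    return mask;
}

/* Hypothetical reference model of the search, 8 bytes per step. */
void *memchr_model(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    uint64_t pattern = (unsigned char)c * 0x0101010101010101ULL;

    while (n > 0) {
        size_t chunk = n < 8 ? n : 8;
        unsigned char bytes[8] = {0};
        memcpy(bytes, p, chunk);        /* stay inside the buffer */

        /* Assemble the lanes explicitly in little-endian order so the
           model gives the same answer on any host. */
        uint64_t quad = 0;
        for (int i = 0; i < 8; i++)
            quad |= (uint64_t)bytes[i] << (8 * i);

        /* Ignore lanes past the end, as the srl mask does in last_quad. */
        unsigned valid = (1u << chunk) - 1;
        unsigned hits = zero_byte_mask(quad ^ pattern) & valid;
        if (hits != 0) {
            unsigned low = hits & -hits;    /* lowest matching lane */
            size_t idx = 0;
            if ((low & 0x0f) == 0) idx += 4;
            if ((low & 0x33) == 0) idx += 2;
            if ((low & 0x55) == 0) idx += 1;
            return (void *)(p + idx);
        }
        p += chunk;
        n -= chunk;
    }
    return NULL;
}
```

A simple way to exercise the model is to compare it against the C library, for example `assert(memchr_model(buf, 'w', sizeof buf) == memchr(buf, 'w', sizeof buf))` for some test buffer `buf`. The assembly gains its speed from what the model leaves out: aligned loads that may touch bytes before and after the buffer within the same quadword, and a loop that overlaps the next load with the current compare.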