Commit Graph

5 Commits

Author SHA1 Message Date
Gabriel F. T. Gomes
72c11b353e powerpc: Zero pad using memset in strncpy/stpncpy
Call __memset_power8 to pad, with zeros, the remaining bytes in the
dest string on __strncpy_power8 and __stpncpy_power8.  This improves
performance when n is larger than the input string, giving ~30% gain for
larger strings without impacting much shorter strings.
2016-04-29 10:05:33 -03:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Gabriel F. T. Gomes
b0f81637d5 PowerPC: Add comments to optimized strncpy
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Added comments to some
	assembly instructions.
2015-10-01 17:36:55 -03:00
Gabriel F. T. Gomes
850713336e PowerPC: Fix operand prefixes
The file sysdeps/powerpc/sysdeps.h defines aliases for register operands,
which add the letter 'r' as a prefix to a register name.  E.g.: register 20
can be written as 'r20', instead of '20'.  On the one hand, this increases
readability, as it makes it easier for readers to know whether the operand is a
register or an immediate.  On the other hand, this permits that immediate
operands be written as if they were registers, and vice-versa, thus reducing
the readability of the code.

This commit removes some of these unintentional misuses.

This commit also increases readability of the code by adding the prefix 'cr' to
some uses of the control register.

Both changes have no effect on the final code.  Checked with objdump.

	* sysdeps/powerpc/powerpc64/power8/strncpy.S: Remove or add register
	prefix from operands.
2015-10-01 17:36:46 -03:00
Adhemerval Zanella
f06a4faf8a powerpc: Optimized st{r,p}ncpy for POWER8/PPC64
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses.
It shows 10%-80% improvement over the optimized POWER7 one that uses
only aligned accesses, specially on unaligned inputs.

The algorithm first read and check 16 bytes (if inputs do not cross a 4K
page size).  The it realign source to 16-bytes and issue a 16 bytes read
and compare loop to speedup null byte checks for large strings.  Also,
different from POWER7 optimization, the null pad is done inline in the
implementation using possible unaligned accesses, instead of realying on
a memset call.  Special case is added for page cross reads.
2015-01-13 11:28:44 -05:00