6 Commits

Author SHA1 Message Date
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Chris Metcalf
0dacd7a3b9 tilegx: remove implicit boolean conversion in strstr.
[BZ #17746]
The __builtin_expect() truncated a uint64_t to a 32-bit long
in ILP32 mode, discarding the high 32 bits, and potentially
missing the NUL terminator that we were searching for with SIMD
operations.  Explicitly compare to zero to fix the problem.
2014-12-22 14:50:26 -05:00
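
The truncation described in the commit above can be illustrated with a minimal sketch. It is not the tilegx source: the function names are hypothetical, and the 32-bit `long` parameter of __builtin_expect() on ILP32 is emulated with an explicit cast so the effect is visible on any host. It assumes a GCC-compatible compiler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration of the bug: __builtin_expect() takes `long`
   arguments, and `long` is 32 bits wide in ILP32 mode.  The cast below
   emulates that narrowing so the effect shows up on any host. */
bool found_nul_truncated(uint64_t mask)
{
    uint32_t narrowed = (uint32_t) mask;   /* high 32 bits are discarded */
    return narrowed != 0;
}

/* Pattern of the fix: compare to zero at full 64-bit width first.  The
   comparison yields 0 or 1, which survives narrowing to any `long`. */
bool found_nul_explicit(uint64_t mask)
{
    return __builtin_expect(mask != 0, 0);
}
```

A mask whose only set bits lie in the high half, such as `UINT64_C(1) << 56`, is reported as "no match" by the first function and as a match by the second.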
Chris Metcalf
95dee05f17 tilegx: fix strstr to build and link better
The two_way_short_needle() routine included from str-two-way.h
is not used, so mark it as unused to avoid compiler warnings.

Calling strnlen() breaks linknamespace tests, so change it
to __strnlen().
2014-12-19 22:54:35 -05:00
Chris Metcalf
563a74d86c tile: fix copyright header blocks in just-committed files
I accidentally committed versions not following the conventions.
2014-10-06 13:47:02 -04:00
Chris Metcalf
c86f7b80f4 tilegx: provide optimized strnlen, strstr, and strcasestr
strnlen() is based on the existing tile strlen() with length
checking added.  It is up to 5x faster, and around 35% faster on
average across the benchtest corpus.  No regressions are seen.
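
As a rough illustration of a length-checked word-at-a-time scan, here is a portable sketch. It is an assumption about the general technique, not the tilegx code: sketch_strnlen() and has_zero_byte() are hypothetical names, the word load uses memcpy() rather than aligned loads, and unlike the real strnlen() it assumes all maxlen bytes of the buffer are readable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of w is zero (classic bit trick). */
static int has_zero_byte(uint64_t w)
{
    return ((w - UINT64_C(0x0101010101010101)) & ~w
            & UINT64_C(0x8080808080808080)) != 0;
}

/* Hypothetical sketch.  Assumption: the caller guarantees that all
   maxlen bytes starting at s are readable, which is a stronger
   requirement than the one strnlen() itself places on callers. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
    size_t i = 0;

    /* Scan 8 bytes at a time while a full word fits within maxlen. */
    while (maxlen - i >= 8) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);   /* alignment-safe load */
        if (has_zero_byte(w))
            break;                     /* NUL is somewhere in this word */
        i += 8;
    }

    /* Finish byte by byte: locates the NUL or stops at maxlen. */
    while (i < maxlen && s[i] != '\0')
        i++;
    return i;
}
```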

strstr() does 8-byte aligned loads and compares using a 2-byte
filter on the first two bytes of the needle, then tests the
remaining bytes of the needle using memcmp().  It is about 5x
faster in the best case (for "found" needles) and about 2x faster
across benchtest as a whole, with slowdowns of as much as 45%
in a few cases (including the "fail" case for a 128KB search).
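
The filter-then-verify idea can be sketched in scalar C. This is a hypothetical illustration, not the tilegx implementation: sketch_strstr() is an invented name, it checks one position at a time instead of eight per aligned load, and it swaps in strncmp() for memcmp() so the tail comparison never reads past the haystack's terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar sketch of a 2-byte prefilter followed by a tail
   comparison.  The real routine applies the filter to 8 bytes per load. */
char *sketch_strstr(const char *haystack, const char *needle)
{
    size_t nlen = strlen(needle);

    if (nlen == 0)
        return (char *) haystack;
    if (nlen == 1)
        return strchr(haystack, needle[0]);

    const char n0 = needle[0], n1 = needle[1];

    /* A 2-byte match needs two non-NUL haystack bytes at p. */
    for (const char *p = haystack; p[0] != '\0' && p[1] != '\0'; p++) {
        /* Cheap filter: reject most positions on the first two bytes. */
        if (p[0] == n0 && p[1] == n1
            /* Assumption: strncmp() stands in for memcmp() here so the
               comparison stops at the haystack's NUL. */
            && strncmp(p + 2, needle + 2, nlen - 2) == 0)
            return (char *) p;
    }
    return NULL;
}
```

The filter is cheap when the needle's first two bytes are rare in the haystack. When they are common, most positions fall through to the tail comparison.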

strcasestr() is based on strstr() but uses a SIMD tolower
routine to convert 8 bytes to lower case in 5 instructions.
It also uses a 2-byte filter and then strncasecmp() for the
remaining bytes.  strncasecmp() is not optimized for SIMD, so
there is further room for improvement.  However, it is still up
to 16x faster for "found" needles, averaging 2x faster on the
whole corpus of benchtests.  It does slow down by up to 35%
in a few cases, similarly to strstr().
2014-10-06 11:19:18 -04:00
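
Lower-casing 8 bytes at once, as the strcasestr() description above mentions, can be approximated in portable C with a SIMD-within-a-register sketch. It does not use the tilegx SIMD instructions and makes no claim about instruction count. swar_tolower() and lower8() are hypothetical names, and only ASCII 'A'..'Z' are converted.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: lower-case the ASCII letters in 8 packed bytes
   using plain 64-bit arithmetic.  Bytes outside 'A'..'Z', including
   bytes with the top bit set, are left unchanged. */
uint64_t swar_tolower(uint64_t w)
{
    const uint64_t ones = UINT64_C(0x0101010101010101);
    const uint64_t high = UINT64_C(0x8080808080808080);

    uint64_t low7 = w & ~high;                      /* clear bit 7 of each byte */
    uint64_t ge_A = low7 + ones * (0x80 - 'A');     /* bit 7 set where byte >= 'A' */
    uint64_t gt_Z = low7 + ones * (0x80 - 'Z' - 1); /* bit 7 set where byte > 'Z' */
    uint64_t is_upper = ge_A & ~gt_Z & ~w & high;   /* 0x80 marks each upper-case byte */

    return w | (is_upper >> 2);                     /* 0x80 >> 2 == 0x20, the case bit */
}

/* Convenience wrapper: lower-case exactly 8 bytes in place. */
void lower8(char *buf)
{
    uint64_t w;
    memcpy(&w, buf, sizeof w);
    w = swar_tolower(w);
    memcpy(buf, &w, sizeof w);
}
```

Because every step operates on each byte lane independently, the result does not depend on host byte order.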