In the context of a function definition, the size hints imply that the
size of an object pointed to by one parameter is another parameter.
This doesn't make sense for the fortified versions of the functions
since that's the bit it's trying to validate.
This is harmless with __builtin_object_size since it has fairly simple
semantics when it comes to objects passed as function parameters.
With __builtin_dynamic_object_size we could (as my patchset for gcc[1]
already does) use the access attribute to determine the object size in
the general case but it misleads the fortified functions.
Basically the problem occurs when access attributes are present on
regular functions that have inline fortified definitions to generate
_chk variants; the attributes get inherited by these definitions,
causing problems when analyzing them. For example with poll(fds, nfds,
timeout), nfds is hinted using the __attr_access as being the size of
fds.
Now, when analyzing the inline function definition in bits/poll2.h, the
compiler sees that nfds is the size of fds and tries to use that
information in the function body. In _FORTIFY_SOURCE=3 case, where the
object size could be a non-constant expression, this information results
in the conclusion that nfds is the size of fds, which defeats the
purpose of the implementation because we're trying to check here if nfds
does indeed represent the size of fds. Hence for this case, it is best
to not have the access attribute.
With the attributes gone, the expression evaluation should get delayed
until the function is actually inlined into its destinations.
Disable the access attribute for fortified function inline functions
when building at _FORTIFY_SOURCE=3 to make this work better. The
access attributes remain for the _chk variants since they can be used
by the compiler to warn when the caller is passing invalid arguments.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581125.html
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
The builtin has been available in gcc since 4.7.0 and in clang since
2.6. This fixes stpncpy fortification with clang since it does a
better job of plugging in __stpncpy_chk in the right place than the
header hackery.
This has been tested by building and running all tests with gcc 10.2.1
and also with clang tip as of a few days ago (just the tests in debug/
since running all tests don't work with clang at the moment) to make
sure that both compilers pass the stpncpy tests.
Non-gcc compilers (clang and possibly other compilers that do not
masquerade as gcc 5.0 or later) are unable to use
__warn_memset_zero_len since the symbol is no longer available on
glibc built with gcc 5.0 or later. While it was likely an oversight
that caused this omission, the fact that it wasn't noticed until
recently (when clang closed the gap on _FORTIFY_SUPPORT) that the
symbol was missing.
Given that both gcc and clang are capable of doing this check in the
compiler, drop all remaining signs of __warn_memset_zero_len from
glibc so that no more objects are built with this symbol in future.
Adds the access attribute newly introduced in GCC 10 to the subset of
function declarations that are already covered by _FORTIFY_SOURCE and
that don't have corresponding GCC built-in equivalents.
Reviewed-by: DJ Delorie <dj@redhat.com>
With only two exceptions (sys/types.h and sys/param.h, both of which
historically might have defined BYTE_ORDER) the public headers that
include <endian.h> only want to be able to test __BYTE_ORDER against
__*_ENDIAN.
This patch creates a new bits/endian.h that can be included by any
header that wants to be able to test __BYTE_ORDER and/or
__FLOAT_WORD_ORDER against the __*_ENDIAN constants, or needs
__LONG_LONG_PAIR. It only defines macros in the implementation
namespace.
The existing bits/endian.h (which could not be included independently
of endian.h, and only defines __BYTE_ORDER and maybe __FLOAT_WORD_ORDER)
is renamed to bits/endianness.h. I also took the opportunity to
canonicalize the form of this header, which we are stuck with having
one copy of per architecture. Since they are so short, this means git
doesn’t understand that they were renamed from existing headers, sigh.
endian.h itself is a nonstandard header and its only remaining use
from a standard header is guarded by __USE_MISC, so I dropped the
__USE_MISC conditionals from around all of the public-namespace things
it defines. (This means, an application that requests strict library
conformance but includes endian.h will still see the definition of
BYTE_ORDER.)
A few changes to specific bits/endian(ness).h variants deserve
mention:
- sysdeps/unix/sysv/linux/ia64/bits/endian.h is moved to
sysdeps/ia64/bits/endianness.h. If I remember correctly, ia64 did
have selectable endianness, but we have assembly code in
sysdeps/ia64 that assumes it’s little-endian, so there is no reason
to treat the ia64 endianness.h as linux-specific.
- The C-SKY port does not fully support big-endian mode, the compile
will error out if __CSKYBE__ is defined.
- The PowerPC port had extra logic in its bits/endian.h to detect a
broken compiler, which strikes me as unnecessary, so I removed it.
- The only files that defined __FLOAT_WORD_ORDER always defined it to
the same value as __BYTE_ORDER, so I removed those definitions.
The SH bits/endian(ness).h had comments inconsistent with the
actual setting of __FLOAT_WORD_ORDER, which I also removed.
- I *removed* copyright boilerplate from the few bits/endian(ness).h
headers that had it; these files record a single fact in a fashion
dictated by an external spec, so I do not think they are copyrightable.
As long as I was changing every copy of ieee754.h in the tree, I
noticed that only the MIPS variant includes float.h, because it uses
LDBL_MANT_DIG to decide among three different versions of
ieee854_long_double. This patch makes it not include float.h when
GCC’s intrinsic __LDBL_MANT_DIG__ is available.
* string/endian.h: Unconditionally define LITTLE_ENDIAN,
BIG_ENDIAN, PDP_ENDIAN, and BYTE_ORDER. Condition byteswapping
macros only on !__ASSEMBLER__. Move the definitions of
__BIG_ENDIAN, __LITTLE_ENDIAN, __PDP_ENDIAN, __FLOAT_WORD_ORDER,
and __LONG_LONG_PAIR to...
* string/bits/endian.h: ...this new file, which includes
the renamed header bits/endianness.h for the definition of
__BYTE_ORDER and possibly __FLOAT_WORD_ORDER.
* string/Makefile: Install bits/endianness.h.
* include/bits/endian.h: New wrapper.
* bits/endian.h: Rename to bits/endianness.h.
Add multiple-include guard. Rewrite the comment explaining what
the machine-specific variants of this file should do.
* sysdeps/unix/sysv/linux/ia64/bits/endian.h:
Move to sysdeps/ia64.
* sysdeps/aarch64/bits/endian.h
* sysdeps/alpha/bits/endian.h
* sysdeps/arm/bits/endian.h
* sysdeps/csky/bits/endian.h
* sysdeps/hppa/bits/endian.h
* sysdeps/ia64/bits/endian.h
* sysdeps/m68k/bits/endian.h
* sysdeps/microblaze/bits/endian.h
* sysdeps/mips/bits/endian.h
* sysdeps/nios2/bits/endian.h
* sysdeps/powerpc/bits/endian.h
* sysdeps/riscv/bits/endian.h
* sysdeps/s390/bits/endian.h
* sysdeps/sh/bits/endian.h
* sysdeps/sparc/bits/endian.h
* sysdeps/x86/bits/endian.h:
Rename to endianness.h; canonicalize form of file; remove
redundant definitions of __FLOAT_WORD_ORDER.
* sysdeps/powerpc/bits/endianness.h: Remove logic to check for
broken compilers.
* ctype/ctype.h
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h
* sysdeps/csky/nptl/bits/pthreadtypes-arch.h
* sysdeps/ia64/ieee754.h
* sysdeps/ieee754/ieee754.h
* sysdeps/ieee754/ldbl-128/ieee754.h
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h
* sysdeps/mips/ieee754/ieee754.h
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h
* sysdeps/nptl/pthread.h
* sysdeps/riscv/nptl/bits/pthreadtypes-arch.h
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h
* sysdeps/sparc/sparc32/ieee754.h
* sysdeps/unix/sysv/linux/generic/bits/stat.h
* sysdeps/unix/sysv/linux/generic/bits/statfs.h
* sysdeps/unix/sysv/linux/sys/acct.h
* wctype/bits/wctype-wchar.h:
Include bits/endian.h, not endian.h.
* sysdeps/unix/sysv/linux/hppa/pthread.h: Don’t include endian.h.
* sysdeps/mips/ieee754/ieee754.h: Use __LDBL_MANT_DIG__
in ifdefs, instead of LDBL_MANT_DIG. Only include float.h
when __LDBL_MANT_DIG__ is not predefined, in which case
define __LDBL_MANT_DIG__ to equal LDBL_MANT_DIG.
These machine-dependent inline string functions have never been on by
default, and even if they were a good idea at the time they were
introduced, they haven't really been touched in ten to fifteen years
and probably aren't a good idea on current-gen processors. Current
thinking is that this class of optimization is best left to the
compiler.
* bits/string.h, string/bits/string.h
* sysdeps/aarch64/bits/string.h
* sysdeps/m68k/m680x0/m68020/bits/string.h
* sysdeps/s390/bits/string.h, sysdeps/sparc/bits/string.h
* sysdeps/x86/bits/string.h: Delete file.
* string/string.h: Don't include bits/string.h.
* string/bits/string3.h: Rename to bits/string_fortified.h.
No need to undef various symbols that the removed headers
might have defined as macros.
* string/Makefile (headers): Remove bits/string.h, change
bits/string3.h to bits/string_fortified.h.
* string/string-inlines.c: Update commentary. Remove definitions
of various macros that nothing looks at anymore. Don't directly
include bits/string.h. Set _STRING_INLINE_unaligned here, based on
compiler-predefined macros.
* string/strncat.c: If STRNCAT is not defined, or STRNCAT_PRIMARY
_is_ defined, provide internal hidden alias __strncat.
* include/string.h: Declare internal hidden alias __strncat.
Only forward __stpcpy to __builtin_stpcpy if __NO_STRING_INLINES is
not defined.
* include/bits/string3.h: Rename to bits/string_fortified.h,
update to match above.
* sysdeps/i386/string-inlines.c: Define compat symbols for
everything formerly defined by sysdeps/x86/bits/string.h.
Make existing definitions into compat symbols as well.
Remove some no-longer-necessary messing around with macros.
* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
* sysdeps/s390/multiarch/mempcpy.c
No need to define _HAVE_STRING_ARCH_mempcpy.
Do define __NO_STRING_INLINES and NO_MEMPCPY_STPCPY_REDIRECT.
* sysdeps/i386/i686/multiarch/strncat-c.c
* sysdeps/s390/multiarch/strncat-c.c
* sysdeps/x86_64/multiarch/strncat-c.c
Define STRNCAT_PRIMARY. Don't change definition of libc_hidden_def.
There is no longer a need for string2.h, so remove it and all mention of it.
Move the redirect for __stpcpy to include/string.h since it is still required
until all internal uses have been renamed.
This fixes several linknamespace/localplt failures when building with -Os.
[BZ #15105]
[BZ #19463]
* include/string.h: Add internal redirect for __stpcpy.
* string/Makefile: Remove bits/string2.h.
* string/string.h: Update comment.
* string/string-inlines.c: Remove bits/string2.h include and comment.
* string/bits/string2.h: Remove file.
calls with constant strings shows a small (~10%) performance gain, strdup is
typically used in error reporting code, so not performance critical.
Remove the now unused __need_malloc_and_calloc related defines from stdlib.h.
Rename existing uses of str(n)dup to __str(n)dup so it no longer needs to be
redirected to a builtin. Also building GLIBC with -Os now no longer shows
localplt or linkname space failures (partial fix for BZ #15105 and BZ #19463).
[BZ #15105]
[BZ #19463]
* elf/dl-cache.c (_dl_load_cache_lookup): Use __strdup.
* inet/rcmd.c (rcmd_af): Likewise.
* inet/rexec.c (rexec_af): Likewise.
* intl/dcigettext.c (_LIBC): Likewise.
* intl/finddomain.c (_nl_find_domain): Use strdup expansion.
* locale/loadarchive.c (_nl_load_locale_from_archive): Use __strdup.
* locale/setlocale.c (setlocale): Likewise.
* posix/spawn_faction_addopen.c
(posix_spawn_file_actions_addopen): Likewise.
* stdlib/putenv.c (putenv): Use __strndup.
* sunrpc/svc_simple.c (__registerrpc): Use __strdup.
* sysdeps/posix/getaddrinfo.c (gaih_inet): Use __strdup/__strndup.
* include/stdlib.h (__need_malloc_and_calloc): Remove uses.
(__Need_M_And_C) Remove define/undef.
* stdlib/stdlib.h (__need_malloc_and_calloc): Remove uses.
(__malloc_and_calloc_defined): Remove define.
* string/bits/string2.h (__strdup): Remove define.
(strdup): Likewise.
(__strndup): Likewise.
(strndup): Likewise.
optimization seems unlikely to ever be useful, but if it occurs in
real code it should be added to GCC. Expanding strcmp of small strings
does appear useful (benchmarking shows it is 2-3x faster), so this would
be useful to implement in GCC (PR 78809).
* string/bits/string2.h (strcmp): Remove define.
(__strcmp_cg): Likewise.
(strncmp): Likewise.
This is transformed into rawmemchr by the bits/string2.h header.
However this is generally slower than strlen on most targets, even when
an optimized rawmemchr implementation exists. Since GCC7 optimizes
strchr (s, '\0') to strlen (s) + s, the GLIBC headers should not
transform this to rawmemchr. As GCC recognizes strchr as a builtin,
defining strchr as the builtin is not useful.
* string/bits/string2.h (strchr): Remove define.
Commit 38765ab68f moved the bzero, bcopy, and explicit_bzero
fortified macros to a common header (strings_fortified.h). However
the side effect is a fortified explicit_bzero is defined when including
only strings.h.
This patch moves back the fortified explicit_bzero definition to
strings3.h header.
Checked on x86_64-linux-gnu.
* string/bits/strings_fortified.h (explicit_bzero): Move back to ..
* string/bits/string3.h: ... here.
As described in BZ#20558, bzero and bcopy declaration can only benefit
from fortified macros when decl came from string.h and when __USE_MISC
is defined (default behaviour).
This is due no standard includes those functions in string.h, so they
are only declared if __USE_MISC is defined (as pointed out in comment 4).
However fortification should be orthogona to other features test macros,
i.e, any function should be fortified if that function is declared.
To fix this behavior, the patch moved the bzero, bcopy, and
__explicit_bzero_chk to a common header (string/bits/strings_fortified.h)
and explicit fortified inclusion macros similar to string.h is added
on strings.h. This allows to get fortified declarions by only including
strings.h.
Checked on x86_64-linux-gnu and along on a bootstrap installation to check
if the fortified are correctly triggered with example from bug report.
[BZ #20558]
* string/bits/string3.h [__USE_MISC] (bcopy): Move to
strings_fortified.h.
[__USE_MISC] (bzero): Likewise.
[__USE_MISC] (explicit_bzero): Likewise.
* string/strings.h: Include strings_fortified.h.
* string/Makefile (headers): Add strings_fortified.h.
* string/bits/strings_fortified.h: New file.
* include/bits/strings_fortified.h: Likewise.
Currently strsep calls strpbrk is is now a veneer to strcspn. Calling
strcspn directly is faster. Since it handles a delimiter string of size
1 as a special case, this is not needed in strsep itself. Although this
means there is a slightly higher overhead if the delimiter size is 1,
all other cases are slightly faster. The overall performance gain is 5-10%
on AArch64.
The string/bits/string2.h header contains optimizations for constant
delimiters of size 1-3. Benchmarking these showed similar performance for
size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3
can give up to 2x speedup for small input strings. However if these cases
are common it seems much better to add this optimization to strcspn.
So move these header optimizations to string-inlines.c.
Improve the strsep benchmark so that it actually benchmarks something.
The current version contains a delimiter character at every position in the
input string, so there is very little work to do, and the extremely inefficent
simple_strsep implementation appears fastest in every case. The new version
has either no match in the input for the fail case and a match halfway in the
input for the success case. The input is then restored so that each iteration
does exactly the same amount of work. Reduce the number of testcases since
simple_strsep takes a lot of time now.
* benchtests/bench-strsep.c (oldstrsep): Add old implementation.
(do_one_test) Restore original string so iteration works.
* string/string-inlines.c (do_test): Create better input strings.
(test_main) Reduce number of testruns.
* string/string-inlines.c (__old_strsep_1c): New function.
(__old_strsep_2c): Likewise.
(__old_strsep_3c): Likewise.
* string/strsep.c (__strsep): Remove case of small delim string.
Call strcspn directly rather than strpbrk.
* string/bits/string2.h (__strsep): Remove define.
(__strsep_1c): Remove.
(__strsep_2c): Remove.
(__strsep_3c): Remove.
(strsep): Remove.
* sysdeps/unix/sysv/linux/internal_statvfs.c
(__statvfs_getflags): Rename to __strsep.
explicit_bzero(s, n) is the same as memset(s, 0, n), except that the
compiler is not allowed to delete a call to explicit_bzero even if the
memory pointed to by 's' is dead after the call. Right now, this effect
is achieved externally by having explicit_bzero be a function whose
semantics are unknown to the compiler, and internally, with a no-op
asm statement that clobbers memory. This does mean that small
explicit_bzero operations cannot be expanded inline as small memset
operations can, but on the other hand, small memset operations do get
deleted by the compiler. Hopefully full compiler support for
explicit_bzero will happen relatively soon.
There are two new tests: test-explicit_bzero.c verifies the
visible semantics in the same way as the existing test-bzero.c,
and tst-xbzero-opt.c verifies the not-being-optimized-out property.
The latter is conceptually based on a test written by Matthew Dempsky
for the OpenBSD regression suite.
The crypt() implementation has an immediate use for this new feature.
We avoid having to add a GLIBC_PRIVATE alias for explicit_bzero
by running all of libcrypt's calls through the fortified variant,
__explicit_bzero_chk, which is in the impl namespace anyway. Currently
I'm not aware of anything in libc proper that needs this, but the
glue is all in place if it does become necessary. The legacy DES
implementation wasn't bothering to clear its buffers, so I added that,
mostly for consistency's sake.
* string/explicit_bzero.c: New routine.
* string/test-explicit_bzero.c, string/tst-xbzero-opt.c: New tests.
* string/Makefile (routines, strop-tests, tests): Add them.
* string/test-memset.c: Add ifdeffage for testing explicit_bzero.
* string/string.h [__USE_MISC]: Declare explicit_bzero.
* debug/explicit_bzero_chk.c: New routine.
* debug/Makefile (routines): Add it.
* debug/tst-chk1.c: Test fortification of explicit_bzero.
* string/bits/string3.h: Fortify explicit_bzero.
* manual/string.texi: Document explicit_bzero.
* NEWS: Mention addition of explicit_bzero.
* crypt/crypt-entry.c (__crypt_r): Clear key-dependent intermediate
data before returning, using explicit_bzero.
* crypt/md5-crypt.c (__md5_crypt_r): Likewise.
* crypt/sha256-crypt.c (__sha256_crypt_r): Likewise.
* crypt/sha512-crypt.c (__sha512_crypt_r): Likewise.
* include/string.h: Redirect internal uses of explicit_bzero
to __explicit_bzero_chk[_internal].
* string/Versions [GLIBC_2.25]: Add explicit_bzero.
* debug/Versions [GLIBC_2.25]: Add __explicit_bzero_chk.
* sysdeps/arm/nacl/libc.abilist
* sysdeps/unix/sysv/linux/aarch64/libc.abilist
* sysdeps/unix/sysv/linux/alpha/libc.abilist
* sysdeps/unix/sysv/linux/arm/libc.abilist
* sysdeps/unix/sysv/linux/hppa/libc.abilist
* sysdeps/unix/sysv/linux/i386/libc.abilist
* sysdeps/unix/sysv/linux/ia64/libc.abilist
* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
* sysdeps/unix/sysv/linux/microblaze/libc.abilist
* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
* sysdeps/unix/sysv/linux/nios2/libc.abilist
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
* sysdeps/unix/sysv/linux/sh/libc.abilist
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist:
Add entries for explicit_bzero and __explicit_bzero_chk.
calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr. Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string. Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.
Remove the special header optimization for strtok_r of a 1-character
constant string - both strspn and strcspn contain optimizations for this
case. Benchmarking this showed similar performance in the worst case,
but up to 5.5x better performance in the "found" case for large inputs.
* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
* string/strtok.c (strtok): Change to tailcall __strtok_r.
* string/strtok_r.c (__strtok_r): Optimize for performance.
* string/string-inlines.c (__old_strtok_r_1c): New function.
* string/bits/string2.h (__strtok_r): Move to string-inlines.c.
The comment above the bzero() macro in this file appears to have been
copied verbatim from the comment above the memset() prototype in
string.h proper. bzero() has no 'c' argument and can only set memory
contents to 0. (The comment above the prototype of bzero() in
string.h proper does not make the same mistake.)
* string/bits/string2.h: Fix typo in comment.
The new check-installed-headers rule check now complains with C++
comment from string3.h with:
../string/bits/string3.h:129:1: error: C++ style comments are not allowed in ISO C90
// XXX We have no corresponding builtin yet.
Let use old C style comment to make compiler happy in old modes.
Tested on x86_64.
* string/bits/string3.h: Remove C++ style comments.
With now a faster strcspn implementation, it is faster to just use
it with some return tests than reimplementing strpbrk itself.
As for strcspn optimization, it is generally at least 10 times faster
than the existing implementation on bench-strspn on a few AArch64
implementations.
Also the string/bits/string2.h inlines make no longer sense, as current
implementation will already implement most of the optimizations.
Tested on x86_64, i386, and aarch64.
* string/strpbrk.c (strpbrk): Rewrite function.
* string/bits/string2.h (strpbrk): Use __builtin_strpbrk.
(__strpbrk_c2): Likewise.
(__strpbrk_c3): Likewise.
* string/string-inlines.c
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strpbrk_c2):
Likewise.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strpbrk_c3):
Likewise.
As for strcspn, this patch improves strspn performance using a much
faster algorithm. It first constructs a 256-entry table based on
the accept string and then uses it as a lookup table for the
input string. As for strcspn optimization, it is generally at least
10 times faster than the existing implementation on bench-strspn
on a few AArch64 implementations.
Also the string/bits/string2.h inlines make no longer sense, as current
implementation will already implement most of the optimizations.
Tested on x86_64, i686, and aarch64.
* string/strspn.c (strcspn): Rewrite function.
* string/bits/string2.h (strspn): Use __builtin_strcspn.
(__strspn_c1): Remove inline function.
(__strspn_c2): Likewise.
(__strspn_c3): Likewise.
* string/string-inlines.c
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strspn_c1): Add
compatibility symbol.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strspn_c2):
Likewise.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strspn_c3):
Likewise.
Improve strcspn performance using a much faster algorithm. It is kept simple
so it works well on most targets. It is generally at least 10 times faster
than the existing implementation on bench-strcspn on a few AArch64
implementations, and for some tests 100 times as fast (repeatedly calling
strchr on a small string is extremely slow...).
In fact the string/bits/string2.h inlines make no longer sense, as GCC
already uses strlen if reject is an empty string, strchrnul is 5 times as
fast as __strcspn_c1, while __strcspn_c2 and __strcspn_c3 are slower than
the strcspn main loop for large strings (though reject length 2-4 could be
special cased in the future to gain even more performance).
Tested on x86_64, i686, and aarch64.
* string/Version (libc): Add GLIBC_2.24.
* string/strcspn.c (strcspn): Rewrite function.
* string/bits/string2.h (strcspn): Use __builtin_strcspn.
(__strcspn_c1): Remove inline function.
(__strcspn_c2): Likewise.
(__strcspn_c3): Likewise.
* string/string-inline.c
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c1): Add
compatibility symbol.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c2):
Likewise.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c3):
Likewise.
* sysdeps/i386/string-inlines.c: Include generic string-inlines.c.
As discussed in
https://sourceware.org/ml/libc-alpha/2015-10/msg00403.html
the setting of _STRING_ARCH_unaligned currently controls the external
GLIBC ABI as well as selecting the use of unaligned accesses withing
GLIBC.
Since _STRING_ARCH_unaligned was recently changed for AArch64, this
would potentially break the ABI in GLIBC 2.23, so split the uses and add
_STRING_INLINE_unaligned to select the string ABI. This setting must be
fixed for each target, while _STRING_ARCH_unaligned may be changed from
release to release. _STRING_ARCH_unaligned is used unconditionally in
glibc. But <bits/string.h>, which defines _STRING_ARCH_unaligned, isn't
included with -Os. Since _STRING_ARCH_unaligned is internal to glibc and
may change between glibc releases, it should be made private to glibc.
_STRING_ARCH_unaligned should defined in the new string_private.h heade
file which is included unconditionally from internal <string.h> for glibc
build.
[BZ #19462]
* bits/string.h (_STRING_ARCH_unaligned): Renamed to ...
(_STRING_INLINE_unaligned): This.
* include/string.h: Include <string_private.h>.
* string/bits/string2.h: Replace _STRING_ARCH_unaligned with
_STRING_INLINE_unaligned.
* sysdeps/aarch64/bits/string.h (_STRING_ARCH_unaligned): Removed.
(_STRING_INLINE_unaligned): New.
* sysdeps/aarch64/string_private.h: New file.
* sysdeps/generic/string_private.h: Likewise.
* sysdeps/m68k/m680x0/m68020/string_private.h: Likewise.
* sysdeps/s390/string_private.h: Likewise.
* sysdeps/x86/string_private.h: Likewise.
* sysdeps/m68k/m680x0/m68020/bits/string.h
(_STRING_ARCH_unaligned): Renamed to ...
(_STRING_INLINE_unaligned): This.
* sysdeps/s390/bits/string.h (_STRING_ARCH_unaligned): Renamed
to ...
(_STRING_INLINE_unaligned): This.
* sysdeps/sparc/bits/string.h (_STRING_ARCH_unaligned): Renamed
to ...
(_STRING_INLINE_unaligned): This.
* sysdeps/x86/bits/string.h (_STRING_ARCH_unaligned): Renamed
to ...
(_STRING_INLINE_unaligned): This.
I think the last clause of the conditional,
|| __n <= __bos (__dest)
may be backward. The code should call the runtime-checking function
if __n is not constant, or if __n is known to be LARGER than the size
of the destination.
gcc now warns when the arguments to memset may have been accidentally
transposed (i.e. length set to zero instead of the byte), so we don't
need that bit of the code in glibc headers anymore.
Tested on x86_64. Coe generated by gcc 4.8 is identical with or
without the patch. I also tested gcc master, which does not result in
any new failures. It does fail quite a few FORTIFY_SOURCE tests, but
those failures are not due to this patch.
This patch cleans up cases of __USE_MISC that are trivially redundant
after the recent substitution of __USE_MISC for __USE_BSD and
__USE_SVID: either in constructs such as "defined __USE_MISC ||
defined __USE_MISC", or else (in the bits/mman.h case) a conditional
on __USE_MISC nested inside another __USE_MISC conditional. (The
cleanups remaining after this patch are still quite large, but it
seems a reasonable piece to separate out.)
Tested x86_64.
* bits/mman.h [__USE_MISC]: Remove redundant conditionals.
* ctype/ctype.h [__USE_MISC]: Likewise.
* dirent/dirent.h [__USE_MISC]: Likewise.
* grp/grp.h [__USE_MISC]: Likewise.
* io/fcntl.h [__USE_MISC]: Likewise.
* io/sys/stat.h [__USE_MISC]: Likewise.
* libio/stdio.h [__USE_MISC]: Likewise.
* posix/unistd.h [__USE_MISC]: Likewise.
* pwd/pwd.h [__USE_MISC]: Likewise.
* stdlib.h [__USE_MISC]: Likewise.
* string/bits/string2.h [__USE_MISC]: Likewise.
* string/string.h [__USE_MISC]: Likewise.
* time/time.h [__USE_MISC]: Likewise.
[BZ #14083]
Fix warning when using strspn with -Wconversion:
$ gcc -Wconversion -O t.c
t.c: In function ‘main’:
t.c:8:7: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]