Without the asm redirects, strchr et al. are not const-correct.
libc++ has a wrapper header that works with and without
__CORRECT_ISO_CPP_STRING_H_PROTO (using a Clang extension). But when
Clang is used with libstdc++ or just C headers, the overloaded functions
with the correct types are not declared.
This change does not impact current GCC (with libstdc++ or libc++).
If the specified needle crosses a page-boundary, the s390-z15 ifunc variant of
strstr truncates the needle which results in invalid results.
This is fixed by loading the needle beyond the page boundary to v18 instead of v16.
The bug is sometimes observable in test-strstr.c in check1 and check2 as the
haystack and needle is stored on stack. Thus the needle can be on a page boundary.
check2 is now extended to test haystack / needles located on stack, at end of page
and on two pages.
This bug was introduced with commit 6f47401bd5
("S390: Add arch13 strstr ifunc variant.") and is already released in glibc 2.30.
As for gettimeofday, time will be implemented based on clock_gettime
on all platforms and internal code should use clock_gettime
directly. In addition to removing a layer of indirection, this will
allow us to remove the PLT-bypass gunk for gettimeofday.
The changed code always assumes __clock_gettime (CLOCK_REALTIME)
or __clock_gettime (CLOCK_REALTIME_COARSE) (for Linux case) cannot
fail, using the same rationale for gettimeofday change. And internal
helper was added (time_now).
Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu,
powerpc64-linux-gnu, and powerpc-linux-gnu.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Commit 69fd157a3 "time: Add padding for the timespec if required"
caused a breakage in the glibc tests as the endian.h include file was
kept in the networking headers while the __USE_MISC #ifdefs had been
removed. This resulted in namespace violations in the networking
headers.
This patche restores the __USE_MISC conditionals in endian.h to fix the
test failures.
* string/endian.h: Restore the __USE_MISC conditionals.
string/tester.c contains code that correctly triggers various GCC
warnings about dubious uses of string functions (uses that are being
deliberately tested there), and duly disables those warnings around
the relevant code.
A change in GCC mainline resulted in this code failing to compile with
a -Warray-bounds error, despite the location with the error having
-Warray-bounds already disabled. This has been reported as
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91890>. This patch
avoids that problem and possible future issues with these diagnostics
by moving all the warning disabling in this file to top level, as
suggested by Florian in
<https://sourceware.org/ml/libc-alpha/2019-10/msg00033.html>, rather
than only doing it locally around specific function calls.
Tested with build-many-glibcs.py for aarch64-linux-gnu with GCC
mainline (with only the conform/ failures noted in
<https://sourceware.org/ml/libc-alpha/2019-10/msg00043.html>).
* string/tester.c: Ignore -Warray-bounds and
-Wmemset-transposed-args at top level.
[__GNUC_PREREQ (7, 0)]: Ignore -Wrestrict and -Wstringop-overflow=
at top level.
[__GNUC_PREREQ (8, 0)]: Ignore -Wstringop-truncation at top level.
(test_stpncpy): Do not ignore warnings here.
(test_strncat): Likewise.
(test_strncpy): Likewise.
(test_memset): Likewise.
With only two exceptions (sys/types.h and sys/param.h, both of which
historically might have defined BYTE_ORDER) the public headers that
include <endian.h> only want to be able to test __BYTE_ORDER against
__*_ENDIAN.
This patch creates a new bits/endian.h that can be included by any
header that wants to be able to test __BYTE_ORDER and/or
__FLOAT_WORD_ORDER against the __*_ENDIAN constants, or needs
__LONG_LONG_PAIR. It only defines macros in the implementation
namespace.
The existing bits/endian.h (which could not be included independently
of endian.h, and only defines __BYTE_ORDER and maybe __FLOAT_WORD_ORDER)
is renamed to bits/endianness.h. I also took the opportunity to
canonicalize the form of this header, which we are stuck with having
one copy of per architecture. Since they are so short, this means git
doesn’t understand that they were renamed from existing headers, sigh.
endian.h itself is a nonstandard header and its only remaining use
from a standard header is guarded by __USE_MISC, so I dropped the
__USE_MISC conditionals from around all of the public-namespace things
it defines. (This means, an application that requests strict library
conformance but includes endian.h will still see the definition of
BYTE_ORDER.)
A few changes to specific bits/endian(ness).h variants deserve
mention:
- sysdeps/unix/sysv/linux/ia64/bits/endian.h is moved to
sysdeps/ia64/bits/endianness.h. If I remember correctly, ia64 did
have selectable endianness, but we have assembly code in
sysdeps/ia64 that assumes it’s little-endian, so there is no reason
to treat the ia64 endianness.h as linux-specific.
- The C-SKY port does not fully support big-endian mode, the compile
will error out if __CSKYBE__ is defined.
- The PowerPC port had extra logic in its bits/endian.h to detect a
broken compiler, which strikes me as unnecessary, so I removed it.
- The only files that defined __FLOAT_WORD_ORDER always defined it to
the same value as __BYTE_ORDER, so I removed those definitions.
The SH bits/endian(ness).h had comments inconsistent with the
actual setting of __FLOAT_WORD_ORDER, which I also removed.
- I *removed* copyright boilerplate from the few bits/endian(ness).h
headers that had it; these files record a single fact in a fashion
dictated by an external spec, so I do not think they are copyrightable.
As long as I was changing every copy of ieee754.h in the tree, I
noticed that only the MIPS variant includes float.h, because it uses
LDBL_MANT_DIG to decide among three different versions of
ieee854_long_double. This patch makes it not include float.h when
GCC’s intrinsic __LDBL_MANT_DIG__ is available.
* string/endian.h: Unconditionally define LITTLE_ENDIAN,
BIG_ENDIAN, PDP_ENDIAN, and BYTE_ORDER. Condition byteswapping
macros only on !__ASSEMBLER__. Move the definitions of
__BIG_ENDIAN, __LITTLE_ENDIAN, __PDP_ENDIAN, __FLOAT_WORD_ORDER,
and __LONG_LONG_PAIR to...
* string/bits/endian.h: ...this new file, which includes
the renamed header bits/endianness.h for the definition of
__BYTE_ORDER and possibly __FLOAT_WORD_ORDER.
* string/Makefile: Install bits/endianness.h.
* include/bits/endian.h: New wrapper.
* bits/endian.h: Rename to bits/endianness.h.
Add multiple-include guard. Rewrite the comment explaining what
the machine-specific variants of this file should do.
* sysdeps/unix/sysv/linux/ia64/bits/endian.h:
Move to sysdeps/ia64.
* sysdeps/aarch64/bits/endian.h
* sysdeps/alpha/bits/endian.h
* sysdeps/arm/bits/endian.h
* sysdeps/csky/bits/endian.h
* sysdeps/hppa/bits/endian.h
* sysdeps/ia64/bits/endian.h
* sysdeps/m68k/bits/endian.h
* sysdeps/microblaze/bits/endian.h
* sysdeps/mips/bits/endian.h
* sysdeps/nios2/bits/endian.h
* sysdeps/powerpc/bits/endian.h
* sysdeps/riscv/bits/endian.h
* sysdeps/s390/bits/endian.h
* sysdeps/sh/bits/endian.h
* sysdeps/sparc/bits/endian.h
* sysdeps/x86/bits/endian.h:
Rename to endianness.h; canonicalize form of file; remove
redundant definitions of __FLOAT_WORD_ORDER.
* sysdeps/powerpc/bits/endianness.h: Remove logic to check for
broken compilers.
* ctype/ctype.h
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h
* sysdeps/csky/nptl/bits/pthreadtypes-arch.h
* sysdeps/ia64/ieee754.h
* sysdeps/ieee754/ieee754.h
* sysdeps/ieee754/ldbl-128/ieee754.h
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h
* sysdeps/mips/ieee754/ieee754.h
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h
* sysdeps/nptl/pthread.h
* sysdeps/riscv/nptl/bits/pthreadtypes-arch.h
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h
* sysdeps/sparc/sparc32/ieee754.h
* sysdeps/unix/sysv/linux/generic/bits/stat.h
* sysdeps/unix/sysv/linux/generic/bits/statfs.h
* sysdeps/unix/sysv/linux/sys/acct.h
* wctype/bits/wctype-wchar.h:
Include bits/endian.h, not endian.h.
* sysdeps/unix/sysv/linux/hppa/pthread.h: Don’t include endian.h.
* sysdeps/mips/ieee754/ieee754.h: Use __LDBL_MANT_DIG__
in ifdefs, instead of LDBL_MANT_DIG. Only include float.h
when __LDBL_MANT_DIG__ is not predefined, in which case
define __LDBL_MANT_DIG__ to equal LDBL_MANT_DIG.
Use the generic C memset/memcpy/memmove in benchtests since comparing
against a slow byte-oriented implementation makes no sense.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2019-08-29 Wilco Dijkstra <wdijkstr@arm.com>
* benchtests/bench-memcpy.c (simple_memcpy): Remove.
(generic_memcpy): Include generic C memcpy.
* benchtests/bench-memmove.c (simple_memmove): Remove.
(generic_memmove): Include generic C memmove.
* benchtests/bench-memset.c (simple_memset): Remove.
(generic_memset): Include generic C memset.
* benchtests/bench-memset-large.c (simple_memset): Remove.
(generic_memset): Include generic C memset.
* benchtests/bench-memset-walk.c (simple_memset): Remove.
(generic_memset): Include generic C memset.
* string/memcpy.c (MEMCPY): Add defines to enable redirection.
* string/memset.c (MEMSET): Likewise.
* sysdeps/x86_64/memcopy.h: Remove empty file.
It doesn't make sense to remove all the internal uses of time.
It's still a standard ISO C function, and its callers don't need
sub-second resolution and would be unnecessarily complicated if
they had to declare a struct timespec instead of just a time_t.
However, a handful of places were using the vestigial "result"
argument instead of the return value, which is slightly less
efficient and also looks strange. Correct this.
* misc/syslog.c (__vsyslog_internal)
* time/getdate.c (__getdate_r)
* time/tst_wcsftime.c (main):
Use return value of time, not its argument.
* string/strfry.c (strfry)
* sysdeps/mach/sleep.c (__sleep):
Remove unnecessary casts of NULL in calls to time.
C2X adds the memccpy, strdup and strndup functions. This patch duly
adds __GLIBC_USE (ISOC2X) to the conditions under which <string.h>
declares them.
Tested for x86_64.
* string/string.h (memccpy): Also declare if [__GLIBC_USE (ISOC2X)].
(strdup): Likewise.
(strndup): Likewise.
This patch fixes the following gcc 9 warnings for "make xcheck" / "make bench":
-string/tst-strcasestr.c:
../include/bits/../../misc/bits/error.h:42:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
-argp/argp-test.c:
argp-test.c:130:20: error: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Werror=format-overflow=]
argp-test.c:130:19: note: directive argument in the range [-2147483648, 122]
argp-test.c:130:5: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 10
-nss/tst-field.c:
tst-field.c:52:7: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
-benchtests/bench-strstr.c:
../include/bits/../../misc/bits/error.h:42:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
-benchtests/bench-malloc-simple.c:
bench-malloc-simple.c:93:16: error: iteration 3 invokes undefined behavior [-Werror=aggressive-loop-optimizations]
ChangeLog:
[BZ #24556]
* string/test-strcasestr.c (check_result): Add NULL check.
* nss/tst-field.c (check_rewrite): Likewise.
* benchtests/bench-strstr.c (do_one_test): Likewise.
* string/test-strstr.c (check_result): Likewise.
* argp/argp-test.c (popt): Increase size of buf to 12.
* benchtests/bench-malloc-simple.c (bench):
Do not initialize tests array out of bounds.
This patch significantly improves performance of memmem using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.
By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
Small needles up to size 2 use a dedicated linear search. Very long needles
use the Two-Way algorithm (to avoid increasing stack size or slowing down
the common case, inlining is disabled).
The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8.
Tested against GLIBC testsuite and randomized tests.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* string/memmem.c (__memmem): Rewrite to improve performance.
This patch significantly improves performance of strstr using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.
By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
Small needles up to size 3 use a dedicated linear search. Very long needles
use the Two-Way algorithm.
The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.
Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* string/str-two-way.h (two_way_short_needle): Add inline to avoid
warning.
(two_way_long_needle): Block inlining.
* string/strstr.c (strstr2): Add new function.
(strstr3): Likewise.
(STRSTR): Completely rewrite strstr to improve performance.
Commit 1294b1892e ("Add support for sqrt asm redirects") added the
-fno-math-errno flag to build most of the glibc in order to enable GCC
to inline math functions. Due to GCC bug #88576, saving and restoring
errno around calls to malloc are optimized-out. In turn this causes
strerror to set errno to ENOMEM if it get passed an invalid error number
and if malloc sets errno to ENOMEM (which might happen even if it
succeeds). This is not allowed by POSIX.
This patch changes the build flags, building only libm with
-fno-math-errno and all the remaining code with -fno-math-errno. This
should be safe as libm doesn't contain any code saving and restoring
errno around malloc. This patch can probably be reverted once the GCC
bug is fixed and available in stable releases.
Tested on x86-64, no regression in the testsuite.
Changelog:
[BZ #24024]
* Makeconfig: Build libm with -fno-math-errno but build the remaining
code with -fmath-errno.
* string/Makefile [$(build-shared)] (tests): Add test-strerror-errno.
[$(build-shared)] (LDLIBS-test-strerror-errno): New variable.
* string/test-strerror-errno.c: New file.
The generic strstr in GLIBC 2.28 fails to match huge needles. The optimized
AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
repeatedly checking for the end of the string. However if the needle length
is larger than this, two_way_long_needle may confuse this as meaning the end
of the string and return NULL. This is fixed by adding the needle length to
the amount to read ahead.
[BZ #23637]
* string/test-strstr.c (pr23637): New function.
(test_main): Add tests with longer needles.
* string/strcasestr.c (AVAILABLE): Fix readahead distance.
* string/strstr.c (AVAILABLE): Likewise.
As done in commit 284f42bc77, memcmp
can be used after memchr to avoid the initialization overhead of the
two-way algorithm for the first match. This has shown improvement
>40% for first match.
Looking at the benchtests, both strstr and strcasestr spend a lot of time
in a slow initialization loop handling one character per iteration.
This can be simplified and use the much faster strlen/strnlen/strchr/memcmp.
Read ahead a few cachelines to reduce the number of strnlen calls, which
improves performance by ~3-4%. This patch improves the time taken for the
full strstr benchtest by >40%.
* string/strcasestr.c (STRCASESTR): Simplify and speedup first match.
* string/strstr.c (AVAILABLE): Likewise.
On s390x, the test string/tst-xbzero-opt is failing if build with gcc head:
FAIL: no clear/prepare: expected 32 got 0
FAIL: no clear/test: expected some got 0
FAIL: ordinary clear/prepare: expected 32 got 0
INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
PASS: explicit clear/prepare: expected 32 got 32
PASS: explicit clear/test: expected 0 got 0
In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop
in prepare_test_buffer. Thus count_test_patterns does not find any of the
test_pattern.
This patch calls use_test_buffer in order to force the compiler to really copy
the pattern to buf.
ChangeLog:
* string/tst-xbzero-opt.c (use_test_buffer): New function.
(prepare_test_buffer): Call use_test_buffer as compiler barrier.
Add <bits/indirect-return.h> and include it in <ucontext.h>.
__INDIRECT_RETURN defined in <bits/indirect-return.h> indicates if
swapcontext requires special compiler treatment. The default
__INDIRECT_RETURN is empty.
On x86, when shadow stack is enabled, __INDIRECT_RETURN is defined
with indirect_return attribute, which has been added to GCC 9, to
indicate that swapcontext returns via indirect branch. Otherwise
__INDIRECT_RETURN is defined with returns_twice attribute.
When shadow stack is enabled, remove always_inline attribute from
prepare_test_buffer in string/tst-xbzero-opt.c to avoid:
tst-xbzero-opt.c: In function ‘prepare_test_buffer’:
tst-xbzero-opt.c:105:1: error: function ‘prepare_test_buffer’ can never be inlined because it uses setjmp
prepare_test_buffer (unsigned char *buf)
when indirect_return attribute isn't available.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* bits/indirect-return.h: New file.
* misc/sys/cdefs.h (__glibc_has_attribute): New.
* sysdeps/x86/bits/indirect-return.h: Likewise.
* stdlib/Makefile (headers): Add bits/indirect-return.h.
* stdlib/ucontext.h: Include <bits/indirect-return.h>.
(swapcontext): Add __INDIRECT_RETURN.
* string/tst-xbzero-opt.c (ALWAYS_INLINE): New.
(prepare_test_buffer): Use it.
Improve strstr performance. Strstr tends to be slow because it uses
many calls to memchr and a slow byte loop to scan for the next match.
Performance is significantly improved by using strnlen on larger blocks
and using strchr to search for the next matching character. strcasestr
can also use strnlen to scan ahead, and memmem can use memchr to check
for the next match.
On the GLIBC bench tests the performance gains on Cortex-A72 are:
strstr: +25%
strcasestr: +4.3%
memmem: +18%
On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.
Reviewd-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The __libc_freeres framework does not extend to non-libc.so objects.
This causes problems in general for valgrind and mtrace detecting
unfreed objects in both libdl.so and libpthread.so. This change is
a pre-requisite to properly moving the malloc hooks out of malloc
since such a move now requires precise accounting of all allocated
data before destructors are run.
This commit adds a proper hook in libc.so.6 for both libdl.so and
for libpthread.so, this ensures that shm-directory.c which uses
freeit () to free memory is called properly. We also remove the
nptl_freeres hook and fall back to using weak-ref-and-check idiom
for a loaded libpthread.so, thus making this process similar for
all DSOs.
Lastly we follow best practice and use explicit free calls for
both libdl.so and libpthread.so instead of the generic hook process
which has undefined order.
Tested on x86_64 with no regressions.
Signed-off-by: DJ Delorie <dj@redhat.com>
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Building the testsuite with GCC mainline fails with
-Wstringop-overflow= errors in string/tst-cmp.c. These are for calls
to strncmp and strncasecmp with SIZE_MAX size argument. The tests are
deliberately using this size that would be dubious in normal code, so
this patch disables the warning for the calls in question.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
* string/tst-cmp.c: Include <libc-diag.h>.
(strncmp_max): Disable -Wstringop-overflow= around call to
strncmp.
(strncasecmp_max): Disable -Wstringop-overflow= around call to
strncasecmp.
Building the testsuite with GCC mainline fails with:
bug-strspn1.c: In function 'main':
bug-strspn1.c:14:3: error: right-hand operand of comma expression has no effect [-Werror=unused-value]
strspn (b++, "");
^~~~~~~~~~~~~~~~
and a similar error for bug-strpbrk1.c. I'm not sure what GCC change
introduced this, and the wording of the message is a bit off (in the
source it's not a comma expression, that must reflect GCC's IR). But
the warning is correct (strspn is a pure function, the call is
useless, and if there wasn't an argument with a side effect much older
GCC would have warned); the point of the test is to verify that the
side effect in an argument still occurs for this useless call that can
otherwise be optimized to an (unused) constant (testing for a bug
there once was in an old strspn macro). This patch duly arranges for
the warning to be disabled for this code.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
* string/bug-strpbrk1.c: Include <libc-diag.h>.
(main): Disable -Wunused-value around call to strpbrk.
* string/bug-strspn1.c: Include <libc-diag.h>.
(main): Disable -Wunused-value around call to strspn.
With current GCC mainline, one strncat test involving a size close to
SIZE_MAX results in a -Wrestrict warning that that buffer size would
imply that the two buffers must overlap. This patch fixes the build
by adding disabling of -Wrestrict (for GCC versions supporting that
option) to the already-present disabling of -Wstringop-overflow= and
-Warray-bounds for this test.
Tested with build-many-glibcs.py that this restores the testsuite
build with GCC mainline for aarch64-linux-gnu.
* string/tester.c (test_strncat) [__GNUC_PREREQ (7, 0)]: Also
ignore -Wrestrict for one test.
While there are now clean -Os build and test results on x86_64 (given
my patch <https://sourceware.org/ml/libc-alpha/2018-02/msg00602.html>,
pending review), testing with -Os with build-many-glibcs.py shows the
build is still failing with -Os everywhere except for x86_64, x86 and
s390x.
There are a variety of different build failures, but the most common
seem to be in strcoll / wcscoll, similar to existing such cases where
DIAG_* are used to disable -Wmaybe-uninitialized. There are various
different failures even within those functions. This patch fixes one
particular case that seems quite common, where the warning appears at
the declarations of seq1 and seq2.
Tested with build-many-glibcs.py that this fixes the -Os build for
aarch64-linux-gnu with GCC 7.
* string/strcoll_l.c: Include <libc-diag.h>.
(STRCOLL): Ignore -Wmaybe-uninitialized for -Os around
declarations of seq1 and seq2.
Among other localplt test failures when building with -Os, there are
libc.so PLT references for argz_next and __argz_next. This is a
simple case of functions that are inlined for -O2 but not for -Os;
this patch adds libc_hidden_proto / libc_hidden_def for them to avoid
localplt failures even when not inlined.
Tested for x86_64 (both that it removes these particular localplt
failures for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).
[BZ #15105]
* include/argz.h (argz_next): Use libc_hidden_proto.
(__argz_next): Likewise.
* string-argz-next.c (__argz_next): Use libc_hidden_def.
(argz_next): Use libc_hidden_weak.
We have a general principle of preferring optimizations for library
facilities to use compiler built-in functions rather than being
located in library headers, where the compiler can reasonably optimize
code without needing to know glibc implementation details.
This patch applies this principle to bits/byteswap.h, eliminating all
the architecture-specific variants and bits/byteswap-16.h. The
__bswap_16, __bswap_32 and __bswap_64 interfaces all become inline
functions, never macros, using the GCC built-in functions where
available and otherwise a single architecture-independent definition
using shifts and masking (which compilers may well be able to detect
and optimize; GCC has detection of various byte-swapping idioms).
The __bswap_constant_32 macro needs to stay around because of uses in
static initializers within glibc and its tests, and so for consistency
all __bswap_constant_* are kept rather than just being inlined into
the old-GCC-or-non-GCC parts of the __bswap_* inline function
definitions.
Various open bugs are addressed by this cleanup, with caveats about
exactly what is covered by those bugs and when the bugs applied at
all.
Bug 14508 reports -Wformat warnings building glibc because __bswap_*
sometimes returned the wrong types. Obviously we already don't have
such warnings any more or the build would be failing, given -Werror,
and I suspect that bug was originally for wrong types for x86_64, as
fixed by commit d394eb742a (glibc 2.17).
The only case I saw removed by this patch where the types would still
have been wrong was the non-__GNUC__ case of __bswap_64 in the s390
header (using unsigned long long int, but uint64_t would be unsigned
long int for 64-bit). In any case, the single header consistently
uses __uintN_t types after this patch, thereby eliminating all such
bugs. The existing string/test-endian-types.c test already suffices
to verify that the types are correct with the compiler used to build
glibc and its tests.
Bug 15512 reports an error from __bswap_constant_16 with -Werror
-Wsign-conversion. I am unable to reproduce this with any GCC version
supporting -Wsign-conversion - all seem to be able to avoid warning
for ((x) >> 8) & 0xffu, where x is uint16_t, which while it formally
does involve an implicit conversion from int to unsigned int, is also
a case where it should be easy for the compiler to see that the value
converted is never negative. But in this patch __bswap_constant_16 is
changed to use signed 0xff so that no such implicit conversion occurs
at all, and a test with -Werror -Wsign-conversion is added.
Bug 17082 objects to the use of ({}) statement expressions in these
macros preventing use at file scope (in C, that's in sizeof etc.; in
C++, more generally in static initializers). The particular case of
these interfaces is fixed by this patch as it changes them to inline
functions, eliminating all uses of ({}) in bits/byteswap.h, and a
corresponding testcase is added. The bug tries to raise a more
general policy question about use of ({}) in macros in installed
headers, referring to "many other libc functions" (unspecified which
functions are being considered).
Since such policy questions belong on libc-alpha, and since there
*are* macros in installed headers which can't really avoid using ({})
(where they are type-generic, so can't use an inline function, but
need a temporary variable, and a few where the interface involves
returning memory from alloca so can't use an inline function either),
I propose to consider that bug fixed with this change. That is
without prejudice to any other new bugs anyone wishes to file *for
precisely defined sets of macros* requesting moving away from ({})
*where it is clearly possible for those interfaces*. Where ({}) can
be avoided, typically by use of an inline function, I think that's a
good idea - that inline functions are typically to be preferred to
({}) for header interfaces where such optimizations are useful but the
interface is suited to being defined using an inline function.
Bug 20530 requests use of __builtin_bswap16 when available (GCC 4.8
and later), which this patch implements.
Tested for x86_64, and with build-many-glibcs.py. Also did an x86_64
test with the __GNUC_PREREQ conditionals changed to "#if 0" to verify
the old-GCC/non-GCC case in the headers. (There are already existing
tests for correctness of results of these interfaces.)
[BZ #14508]
[BZ #15512]
[BZ #17082]
[BZ #20530]
* bits/byteswap.h: Update file comment. Do not include
<bits/byteswap-16.h>.
(__bswap_constant_16): Cast result to __uint16_t. Use signed 0xff
constant.
(__bswap_16): Define as inline function.
(__bswap_constant_32): Reformat definition.
(__bswap_32): Always define as inline function, not macro, using
__uint32_t. Use __builtin_bswap32 if [__GNUC_PREREQ (4, 3)],
otherwise __bswap_constant_32.
(__bswap_constant_64): Reformat definition. Do not use
__extension__ here.
(__bswap_64): Always define as inline function, not macro. Use
__extension__ on function definition. Use __builtin_bswap64 if
[__GNUC_PREREQ (4, 3)], otherwise __bswap_constant_64.
* string/test-endian-file-scope.c: New file.
* string/test-endian-sign-conversion.c: Likewise.
* string/Makefile (headers): Remove bits/byteswap-16.h.
(tests): Add test-endian-file-scope and
test-endian-sign-conversion.
(CFLAGS-test-endian-sign-conversion.c): New variable.
* bits/byteswap-16.h: Remove file.
* sysdeps/ia64/bits/byteswap-16.h: Likewise.
* sysdeps/ia64/bits/byteswap.h: Likewise.
* sysdeps/m68k/bits/byteswap.h: Likewise.
* sysdeps/s390/bits/byteswap-16.h: Likewise.
* sysdeps/s390/bits/byteswap.h: Likewise.
* sysdeps/tile/bits/byteswap.h: Likewise.
* sysdeps/x86/bits/byteswap-16.h: Likewise.
* sysdeps/x86/bits/byteswap.h: Likewise.
Bug 19667 reports unchecked malloc calls in the test
string/testcopy.c. This patch makes that test use xmalloc and the
support/test-driver.c test framework.
Tested for x86_64.
[BZ #19667]
* string/testcopy.c: Include <support/support.h>. Do not include
<malloc.h>. Use <support/test-driver.c>.
(main): Rename to do_test. Make static. Use xmalloc instead of
malloc.
Some strncat tests fail to build with GCC 8 because of -Warray-bounds
warnings. These tests are deliberately test over-large size arguments
passed to strncat, and already disable -Wstringop-overflow warnings,
but now the warnings for these tests come under -Warray-bounds so that
option needs disabling for them as well, which this patch does (with
an update on the comments; the DIAG_IGNORE_NEEDS_COMMENT call for
-Warray-bounds doesn't need to be conditional itself, because that
option is supported by all versions of GCC that can build glibc).
Tested compilation with build-many-glibcs.py for aarch64-linux-gnu.
* string/tester.c (test_strncat): Also disable -Warray-bounds
warnings for two tests.
GCC 8 warns about more cases of string functions truncating their
output or not copying a trailing NUL byte.
This patch fixes testsuite build failures caused by such warnings in
string/tester.c. In general, the warnings are disabled around the
relevant calls using DIAG_* macros, since the relevant cases are being
deliberately tested. In one case, the warning is with
-Wstringop-overflow= instead of -Wstringop-truncation; in that case,
the conditional is __GNUC_PREREQ (7, 0) (being the version where
-Wstringop-overflow= was introduced), to allow the conditional to be
removed sooner, since it's harmless to disable the warning for a
GCC version where it doesn't actually occur. In the case of warnings
for strncpy calls in test_memcmp, the calls in question are changed to
use memcpy, as they don't copy a trailing NUL and the point of that
code is to test memcmp rather than strncpy.
Tested (compilation) with GCC 8 for x86_64-linux-gnu with
build-many-glibcs.py (in conjunction with Martin's patch to allow
glibc to build).
* string/tester.c (test_stpncpy): Disable -Wstringop-truncation
for stpncpy calls for GCC 8.
(test_strncat): Disable -Wstringop-truncation warning for strncat
calls for GCC 8. Disable -Wstringop-overflow= warning for one
strncat call for GCC 7.
(test_strncpy): Disable -Wstringop-truncation warning for strncpy
calls for GCC 8.
(test_memcmp): Use memcpy instead of strncpy for calls not copying
trailing NUL.
GCC 8 warns about strncat calls with truncated output.
string/bug-strncat1.c tests such a call; this patch disables the
warning for it.
Tested (compilation) with GCC 8 for x86_64-linux-gnu with
build-many-glibcs.py (in conjunction with Martin's patch to allow
glibc to build).
* string/bug-strncat1.c: Include <libc-diag.h>.
(main): Disable -Wstringop-truncation for strncat call for GCC 8.
Hide internal __strsep function to allow direct access within libc.so and
libc.a without using GOT nor PLT.
[BZ #18822]
* include/string.h (__strsep): Add libc_hidden_proto.
* string/strsep.c (__strsep): Add libc_hidden_def.
Fix GCC 7 errors when string/stratcliff.c is compiled with -O3:
stratcliff.c: In function ‘do_test’:
cc1: error: assuming signed overflow does not occur when assuming that (X - c) <= X is always true [-Werror=strict-overflow]
[BZ #21982]
* string/stratcliff.c (do_test): Declare size, nchars, inner,
middle and outer with size_t instead of int. Repleace %d and
%Zd with %zu in printf. Update "MAX (0, nchars - 128)" and
"MAX (outer, nchars - 64)" to support unsigned outer and
nchars. Also exit loop when outer == 0.
Move internal argz function prototypes to include/argz.h and mark them
with attribute_hidden to allow direct access within libc.so and libc.a
without using GOT nor PLT. This also brings string/argz.h closer to the
gnulib version.
[BZ #18822]
* include/argz.h (__argz_create_sep): New function prototype.
(__argz_append): Likewise.
(__argz_add): Likewise.
(__argz_add_sep): Likewise.
(__argz_delete): Likewise.
(__argz_insert): Likewise.
(__argz_replace): Likewise.
* string/argz.h (__argz_create_sep): Removed.
(__argz_append): Likewise.
(__argz_add): Likewise.
(__argz_add_sep): Likewise.
(__argz_delete): Likewise.
(__argz_insert): Likewise.
(__argz_replace): Likewise.
This patch increases the timeouts for some tests that I've seen timing
out on slow systems in my 2.26 release testing. (In the case of
tst-tsearch.c, increasing the timeout means removing a setting of 10
that was put there before the default timeout was increased to 20
seconds, so putting the default into effect.)
* iconvdata/tst-loading.c (TIMEOUT): Define to 30.
* misc/tst-tsearch.c (TIMEOUT): Remove.
* nptl/tst-create-detached.c (TIMEOUT): Define to 100.
* nptl/tst-robust-fork.c (TIMEOUT): Likewise.
* nptl/tst-rwlock19.c (TIMEOUT): Likewise.
* string/tst-cmp.c (TIMEOUT): Define to 600.
This code:
L(between_2_3):
/* Load as big endian with overlapping loads and bswap to avoid
branches. */
movzwl -2(%rdi, %rdx), %eax
movzwl -2(%rsi, %rdx), %ecx
shll $16, %eax
shll $16, %ecx
movzwl (%rdi), %edi
movzwl (%rsi), %esi
orl %edi, %eax
orl %esi, %ecx
bswap %eax
bswap %ecx
subl %ecx, %eax
ret
needs a saturating subtract because the full register is used.
With this commit, only the lower 24 bits of the register are used,
so a regular subtraction suffices.
The test case change adds coverage for these kinds of bugs.