calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr. Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string. Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.
Remove the special header optimization for strtok_r of a 1-character
constant string - both strspn and strcspn contain optimizations for this
case. Benchmarking this showed similar performance in the worst case,
but up to 5.5x better performance in the "found" case for large inputs.
* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
* string/strtok.c (strtok): Change to tailcall __strtok_r.
* string/strtok_r.c (__strtok_r): Optimize for performance.
* string/string-inlines.c (__old_strtok_r_1c): New function.
* string/bits/string2.h (__strtok_r): Move to string-inlines.c.
Clean up string functions that do not have a version in gnulib on
the assumption that glibc is the canonical upstream copy of this
code. basename has a copy in gnulib but it is largely written to
handle Windows paths so merging it is not really viable. The changes
mostly consist of switching to ANSI function prototypes and removing
unused includes.
As many of these functions do not get built in a typical build due
to architecture optimized versions being used instead I built these
by hand to verify there were no build warnings and the code was
identical.
2014-04-07 Will Newton <will.newton@linaro.org>
* string/basename.c [HAVE_CONFIG_H]: Remove #ifdef and
and contents. [!_LIBC] Remove #ifndef and contents.
(basename): Use ANSI prototype. [_LIBC] Remove #idef.
* string/memccpy.c (__memccpy): Use ANSI prototype.
* string/memfrob.c (memfrob): Likewise.
* string/strcoll.c (STRCOLL): Likewise.
* string/strlen.c (strlen): Likewise.
* string/strtok.c (STRTOK): Likewise.
* string/strcat.c: Remove unused #include of memcopy.h.
(strcat): Use ANSI prototype.
* string/strchr.c: Remove unused #include of memcopy.h.
(strchr): Use ANSI prototype.
* string/strcmp.c: Remove unused #include of memcopy.h.
(strcmp): Use ANSI prototype.
* string/strcpy.c: Remove unused #include of memcopy.h.
(strcpy): Use ANSI prototype.
* limits.h: Change MB_LEN_MAX to 6. A 31-bit ISO 10646
character in UTF-8 encoding has that many bytes.
* locale/langinfo.h: New element _NL_CTYPE_MB_CUR_MAX.
* locale/categories.def: Add description of field _NL_CTYPE_MB_CUR_MAX.
* locale/Makefile (routines): Add mb_cur_max.
* locale/mb_cur_max.c: New file. This function gets called
when the macro MB_CUR_MAX is used.
* locale/C-ctype.c: Initialize new mb_cur_max field.
* locale/localeinfo.h: Change magic value because of incompatible
change.
* locale/programs/ld-ctype.c: Determine value of mb_cur_max
according to current character set and write it out with the rest.
* stdlib/stdlib.h (MB_CUR_MAX): Not constant anymore. Get value
according to currently used locale for catefory LC_CTYPE by
calling the function __ctype_get_mb_cur_max.
Tue May 28 03:27:46 1996 Ulrich Drepper <drepper@cygnus.com>
* FAQ: Fix some typos.
Tell that for Linux the kernel header files are necessary.
* PROJECTS: New file. List of open jobs for glibc.
* Makefile (distribute): Add PROJECTS.
* crypt/GNUmakefile (headers): New variable. Mention crypt.h.
* crypt/crypt.h: Header for crypt functions.
* elf/elf.h: Add some new constants from recent Cygnus ELF
header files.
* login/getutid_r.c: Test for correct type.
Don't depend on ut_type and ut_id unless _HAVE_UT_TYPE and
_HAVE_UT_ID resp. are defined.
Make really compliant with specification.
* login/getutline_r.c, login/pututline_r.c: Don't depend on
ut_type and ut_id unless _HAVE_UT_TYPE and _HAVE_UT_ID resp. are
defined.
Make really compliant with specification.
* login/setutent_r.c: Don't depend on ut_type and ut_id unless
_HAVE_UT_TYPE and _HAVE_UT_ID resp. are defined.
* login/login.c, login/logout.c, login/logwtmp.c: Complete
rewrite. Now based on getut*/setut* functions.
* stdlib/strtol.c: Undo changes of Wed May 22 01:48:54 1996.
This prevented using this file in other GNU packages.
* sysdeps/gnu/utmpbits.h: Define _HAVE_UT_TYPE, _HAVE_UT_ID,
and _HAVE_UT_TV because struct utmp has these members.
* sysdeps/libm-i387/e_exp.S: Correct exp(+-Inf) case.
* utmp.h: New file. Wrapper around login/utmp.h.
* elf/dl-error.c (struct catch): New type.
(catch): New static variable, struct catch *.
(catch_env, signalled_errstring, signalled_objname): Variables removed.
(_dl_signal_error): If CATCH is non-null, set its errstring and
objname members and jump to CATCH->env. If it is null, call
_dl_sysdep_fatal with a standard message.
* elf/rtld.c (dl_main): Explode `doit' function into dl_main's body.
No longer use _dl_catch_error.
* stdlib/ldiv.c: Deansideclized.
Sun May 26 19:39:53 1996 Ulrich Drepper <drepper@cygnus.com>
* intl/loadmsgcat.c (_nl_load_domain): Test correct variable
after malloc.
* string/Makefile (tester-ENV): New variable to suppress message
translation in test.
* string/tester.c: Add tests for strtok_r and strsep.
* sysdeps/i386/i486/strcat.S: Correct some more 8bit operation
<-> 32 bit operand conflicts.
* sysdeps/i386/strsep.S: Wrapper around <sysdeps/i386/strtok.S>
to produce strsep function.
* sysdeps/i386/strtok.S: Optimized implementation of strtok
function.
* sysdeps/i386/strtok_r.S: Wrapper around <sysdeps/i386/strtok.S>
to produce strtok_r function.
* sysdeps/generic/strtok.c: Moved here from string/strtok.c.
Corrected example in comment.
* string/Makefile (routines): Add strtok_r.
* sysdeps/generic/strtok_r.c: New file. Implement reentrant version
of strtok_r.
* string/string.h: Add prototype for strtok_r.
* wcsmbs/wcstok.c: Handle illegal SAVE_PTR argument the same
as in strtok_r.
Sun May 26 13:28:23 1996 Roland McGrath <roland@delasyd.gnu.ai.mit.edu>
* time/tzset.c (__tzset): Ignore leading : in $TZ; always try tzfile
first and fall back to 1003.1 syntax only if it fails.
* time/Makefile (install-others): Also install posix/ZONE and
right/ZONE for each ZONE in $(zonenames).
(z.% rule): Generate rules for right/ZONE and posix/ZONE targets too,
the difference begin leapseconds vs /dev/null as 3rd dep. For
original ZONE targets use $(leapseconds), to be set in Makeconfig.
(target-zone-flavor): New variable.
(tzcompile): Use it to get the right -d for posix/ and right/ flavors.
* Makeconfig (leapseconds): New variable.
* mach/Machrules (%.udeps rule): Depend on Machrules.
Emit deps for .uh and .__h files.
(%.uh, %.__h rules): Don't depend on %.defs; use #include <$*.defs>
instead.
Sun May 26 01:06:47 1996 Ulrich Drepper <drepper@cygnus.com>
* stdlib/Makefile (routines): Add llabs, lldiv.
* stdlib/llabs.c: New file. Implementation of return
absolute value of long long argument.
* stdlib/lldiv.c: New file. Implementation of division with remainder
of long long argument.
* stdlib/stdlib.h [__USE_GNU] (lldiv_t): New type for lldiv
function.
Define prototypes for lldiv and llabs functions.
* locale/C-collate.c: Initialize _NL_COLLATE_NRULES element.
* stdlib/strtod.c: Replace wchar_t with wint_t. The later is
really the type for a single wide character.
* string/strxfrm.c (print_val): Define separate version for
use as wcsxfrm. Here we don't need UTF8 encoding.
* wcsmbs/wchar.h: gcc-2.7.2-960517 finally introduces wint_t
in <stddef.h>. Use this value and only for older gcc version
define in place.
(uwchar_t): Remove definition.
* wcsmbs/wcscmp.c, wcsmbs/wcscoll.c, wcsmbs/wcsncmp.c,
wcsmbs/wcsxfrm.c, wcsmbs/wmemcmp.c: : Don't use uwchar_t as unsigned
type. wint_t is intended for this.
Sat May 25 14:10:19 1996 Roland McGrath <roland@delasyd.gnu.ai.mit.edu>
* sysdeps/unix/bsd/direntry.h: Use [1] instead of [0] for d_name to
quiet -ansi -pedantic.
* sysdeps/unix/common/direntry.h: Likewise.
* login/Makefile (headers): Add lastlog.h.
* login/lastlog.h: New file.
* login/Makefile (CFLAGS): Don't append -D_THREAD_SAFE.
* login/utmp.h [_REENTRANT || _THREAD_SAFE]: Replace this conditional
with #ifdef __USE_REENTRANT.
* features.h (__GNU_LIBRARY__): Set to 6.
[_GNU_SOURCE] (_POSIX_SOURCE, _POSIX_C_SOURCE, _BSD_SOURCE,
_SVID_SOURCE): Make sure they are all defined.
* sysdeps/unix/sysv/linux/gnu/types.h: Instead of including
<linux/posix_types.h>, define _LINUX_TYPES_DONT_EXPORT and then
include <linux/types.h>.
* resource/sys/resource.h: Remove trailing commas from enums.
* sysdeps/generic/netinet/in.h: Remove trailing commas from enums.
* sysdeps/unix/sysv/linux/netinet/in.h: Likewise.