[BZ #23744]
This refactoring was prompted by a problem when the regex code is
used as part of Gnulib and when the builder’s compiler does not grok
__builtin_expect. Problem reported for Gawk by Nelson H.F. Beebe in:
https://lists.gnu.org/r/bug-gnulib/2018-09/msg00137.html
Although this refactoring does not fix the problem directly,
we might as well have Gawk use the now-preferred glibc style for when
__builtin_expect is unavailable.
* posix/regex_internal.h (BE): Remove.
All uses replaced by __glibc_unlikely or __glibc_likely.
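
A minimal sketch of the fallback style this entry refers to, assuming a
GCC-compatible compiler is detected through __GNUC__.  SKETCH_LIKELY,
SKETCH_UNLIKELY and sketch_length are hypothetical names used only for
illustration; they are not the glibc definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for __glibc_likely / __glibc_unlikely.
   When __builtin_expect is unavailable the hint degrades to the
   plain condition, so callers need no #ifdef of their own.  */
#if defined __GNUC__ && __GNUC__ >= 3
# define SKETCH_LIKELY(cond)   __builtin_expect (!!(cond), 1)
# define SKETCH_UNLIKELY(cond) __builtin_expect (!!(cond), 0)
#else
# define SKETCH_LIKELY(cond)   (cond)
# define SKETCH_UNLIKELY(cond) (cond)
#endif

/* Return the length of S, or -1 when S is NULL (the cold path).  */
static long
sketch_length (const char *s)
{
  if (SKETCH_UNLIKELY (s == NULL))
    return -1;
  long n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}
```

The hint only affects code layout; the result of the condition is the
same with or without __builtin_expect.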
This patch syncs the regex implementation with gnulib (commit 0ee5212).
Only two changes in GLIBC regex testing are required:
1. posix/bug-regex28.c: as previously discussed [1] the change of
expected results on the pattern should be safe.
2. posix/PCRE.tests: the ERE (a)|\1 is malformed (in the sense that
the \1 doesn't mean anything) and, although current GLIBC accepts
it, its behavior is undefined. This patch removes the specific test.
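
A small probe for checking how a given library treats such a pattern
might look like the sketch below.  It uses only the POSIX regcomp,
regerror and regfree calls; sketch_try_ere is a hypothetical helper,
and the result for (a)|\1 is deliberately not assumed here because it
depends on the library version.

```c
#include <regex.h>
#include <stddef.h>

/* Compile PATTERN as a POSIX ERE.  Return regcomp's result (0 on
   success).  On failure, store regerror's message in MSG, which may
   be NULL when MSGSIZE is 0; on success, MSG becomes the empty
   string.  */
static int
sketch_try_ere (const char *pattern, char *msg, size_t msgsize)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED);
  if (rc != 0)
    regerror (rc, &re, msg, msgsize);
  else
    {
      if (msgsize > 0)
        msg[0] = '\0';
      regfree (&re);
    }
  return rc;
}
```

Calling sketch_try_ere ("(a)|\\1", buf, sizeof buf) then reports
whether the installed regcomp accepts or rejects the removed test
pattern.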
This sync contains some patches from thread 'Regex: Make libc regex
more usable outside GLIBC.' [2] which have been pushed upstream in
gnulib. This patch also fixes some regex issues (BZ #23233,
BZ #21163, BZ #18986, BZ #13762). I did not add testcases for
#23233 and #13762 because I couldn't think of a simple way to
trigger the expected failure paths.
Checked on x86_64-linux-gnu and i686-linux-gnu.
[BZ #23233]
[BZ #21163]
[BZ #18986]
[BZ #13762]
* posix/Makefile (tests): Add bug-regex37 and bug-regex38.
* posix/PCRE.tests: Remove invalid test.
* posix/bug-regex28.c: Fix expected values for used syntax.
* posix/bug-regex37.c: New file.
* posix/bug-regex38.c: Likewise.
* posix/regcomp.c: Sync with gnulib.
* posix/regex.c: Likewise.
* posix/regex.h: Likewise.
* posix/regex_internal.c: Likewise.
* posix/regex_internal.h: Likewise.
* posix/regexec.c: Likewise.
[1] https://sourceware.org/ml/libc-alpha/2017-12/msg00807.html
[2] https://sourceware.org/ml/libc-alpha/2017-12/msg00237.html
next_last_offset.
(struct re_dfa_t): Remove unused member states_alloc.
* posix/regcomp.c (init_dfa): Don't initialize unused members.
2005-08-25 Paul Eggert <eggert@cs.ucla.edu>
* posix/regexec.c (set_regs): Don't alloca with an unbounded size.
alloca modernization/simplification for regex.
* posix/regex.c: Remove portability cruft for alloca. This no longer
needs to be at the start of the file, and can be moved into
regex_internal.h and simplified.
* posix/regex_internal.h: Include <alloca.h>.
(__libc_use_alloca) [!defined _LIBC]: New macro.
* posix/regexec.c (build_trtable): Remove "#ifdef _LIBC",
since the code now works outside glibc.
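
The bounded-alloca idea in the entry above can be sketched as follows.
This is an illustration under stated assumptions, not the regex code:
SKETCH_ALLOCA_MAX and sketch_sum_indices are hypothetical, and the
4096-byte limit is an arbitrary choice for the example.

```c
#include <alloca.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cutoff: larger requests go to the heap.  */
#define SKETCH_ALLOCA_MAX 4096

/* Fill a scratch array with 0..N-1 and return the sum, or -1 if the
   scratch array cannot be obtained.  Small arrays live on the stack,
   large ones on the heap, so stack use stays bounded.  */
static long
sketch_sum_indices (size_t n)
{
  if (n == 0)
    return 0;
  if (n > SIZE_MAX / sizeof (long))
    return -1;
  size_t bytes = n * sizeof (long);
  int on_stack = bytes <= SKETCH_ALLOCA_MAX;
  long *buf;
  if (on_stack)
    buf = alloca (bytes);
  else
    {
      buf = malloc (bytes);
      if (buf == NULL)
        return -1;
    }
  for (size_t i = 0; i < n; i++)
    buf[i] = (long) i;
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += buf[i];
  if (!on_stack)
    free (buf);
  return sum;
}
```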
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* include/regex.h: Remove use of _RE_ARGS.
2005-08-25 Paul Eggert <eggert@cs.ucla.edu>
* posix/regexec.c (find_recover_state): Change "err" to "*err".
2005-08-24 Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (regerror): Pointer args are 'restrict',
as per POSIX.
* posix/regex.h (regerror): Likewise.
* manual/pattern.texi (POSIX Regexp Compilation): Likewise.
Similarly for regcomp and regexec. Also, first 2 args of regexec
and 2nd arg of regerror are const.
* posix/regex.c: Do not include <sys/types.h>, as POSIX no longer
requires this. (The code never needed it.)
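
With the const-qualified prototypes described above, a compiled
pattern can be passed around through a pointer to const.  A minimal
sketch, where sketch_matches is a hypothetical helper and only POSIX
calls are used:

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT matches the ERE PATTERN, 0 if it does not, and -1
   if PATTERN does not compile or matching fails for another reason.  */
static int
sketch_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  /* regexec takes a pointer to const regex_t, so a read-only view of
     the compiled pattern is enough for matching.  */
  const regex_t *cre = &re;
  int rc = regexec (cre, text, 0, NULL, 0);
  regfree (&re);
  if (rc == 0)
    return 1;
  return rc == REG_NOMATCH ? 0 : -1;
}
```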
2005-08-20 Paul Eggert <eggert@cs.ucla.edu>
* posix/regexec.c (sift_states_bkref): re_node_set_insert returns
int, not reg_errcode_t.
* posix/regex_internal.c (calc_state_hash): Put 'inline' before type,
since some broken compilers warn about it otherwise.
* posix/regcomp.c (create_initial_state): Remove duplicate decl.
2005-08-20 Paul Eggert <eggert@cs.ucla.edu>
* posix/regex.h (_RE_ARGS): Remove. No longer needed, since we assume
C89 or better. All uses removed.
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* posix/regex.c: Prevent using C++ compilers.
2005-08-19 Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (duplicate_node): Return new index, not an error
code, and let the caller return REG_ESPACE if out of space. This
removes an uninitialized-variable warning with GCC 4.0.1, and also
avoids taking the address of a local variable. All callers
changed.
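
The calling convention described in this entry (return the new index,
let the caller map failure to REG_ESPACE) can be sketched as below.
The node store and all sketch_* names are hypothetical and far simpler
than the real DFA structures.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical growable node store.  */
struct sketch_nodes
{
  int *vals;
  size_t used;
  size_t alloc;
};

/* Append VAL; return its index, or -1 if out of memory.  */
static long
sketch_add_node (struct sketch_nodes *ns, int val)
{
  if (ns->used == ns->alloc)
    {
      size_t new_alloc = ns->alloc ? 2 * ns->alloc : 4;
      int *p = realloc (ns->vals, new_alloc * sizeof *p);
      if (p == NULL)
        return -1;
      ns->vals = p;
      ns->alloc = new_alloc;
    }
  ns->vals[ns->used] = val;
  return (long) ns->used++;
}

/* Duplicate node ORG; return the new index, not an error code.  */
static long
sketch_duplicate (struct sketch_nodes *ns, size_t org)
{
  return sketch_add_node (ns, ns->vals[org]);
}

/* Caller side: an index of -1 is turned into REG_ESPACE here.  */
static int
sketch_demo (void)
{
  struct sketch_nodes ns = { NULL, 0, 0 };
  int rc = 0;
  if (sketch_add_node (&ns, 42) < 0 || sketch_duplicate (&ns, 0) < 0)
    rc = REG_ESPACE;
  free (ns.vals);
  return rc;
}
```

Returning the index directly means no out-parameter is needed, so the
caller never has to pass the address of a local variable.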
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* include/time.h (__strptime_internal): Rename parameter to avoid
bogus compiler warning.
2005-08-19 Jim Meyering <jim@meyering.net>
* posix/regexec.c (proceed_next_node): Redo local variables to
avoid GCC shadowing warnings.
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* posix/regex_internal.c (re_acquire_state): Minor code rearrangement.
(re_acquire_state_context): Likewise.
2005-08-19 Paul Eggert <eggert@cs.ucla.edu>
* posix/regex_internal.c (re_string_realloc_buffers):
(re_node_set_insert, re_node_set_insert_last, re_dfa_add_node):
Rename local variables to avoid GCC shadowing warnings.
2005-07-08 Eric Blake <ebb9@byu.net>
Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (init_dfa): Store __btowc value in wint_t, not
wchar_t. Remove now-unnecessary cast.
(build_range_exp): Likewise.
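
Why the result belongs in a wint_t: btowc returns WEOF on failure, and
WEOF is a wint_t value that need not be representable in wchar_t.  A
minimal sketch (sketch_has_wide_form is a hypothetical helper):

```c
#include <stdio.h>
#include <wchar.h>

/* Return 1 if the single byte C converts to a wide character in the
   current locale, 0 otherwise.  The btowc result is kept in a wint_t
   so that the comparison with WEOF is well defined.  */
static int
sketch_has_wide_form (int c)
{
  wint_t wc = btowc (c);
  return wc != WEOF;
}
```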
2004-01-13 Ulrich Drepper <drepper@redhat.com>
* posix/regex.c: Support crappy compilers and platforms which have
problems with alloca.
* posix/regex_internal.h: Likewise.
Patch by Paolo Bonzini.
* posix/regex_internal.h: Add forward declaration of re_dfa_t.
Replace last two parameters of re_string_allocate and
re_string_construct with pointer to DFA.
(re_dfa_t): Add map_notascii field.
* posix/regcomp.c (re_compile_internal): Add call of
re_string_construct.
(init_dfa): Initialize map_notascii.
* posix/regex_internal.c: Adjust definitions of re_string_allocate
and re_string_construct.
Pass DFA to re_string_construct. Adjust definition. Initialize
map_notascii field.
(build_wcs_upper_buffer): If map_notascii is zero use simplified
method to map ASCII values to upper case.
* posix/regex.c: Include localeinfo.h.
* posix/regexec.c: Adjust call of re_string_allocate.
* locale/langinfo.h: Add _NL_CTYPE_MAP_TO_NONASCII.
* locale/localeinfo.h (LIMAGIC): Change value.
* locale/categories.def: Add entry for _NL_CTYPE_MAP_TO_NONASCII.
* locale/C-ctype.h: Likewise.
* locale/programs/ld-ctype.c: Compute whether any mapping maps from
ASCII to non-ASCII value. Write out that value.
* inet/rcmd.c (rresvport_af): Avoid using invalid values. Wrap
around in search if port IPPORT_RESERVED/2 has been tested.
2002-02-20 Paolo Bonzini <bonzini@gnu.org>
* posix/regcomp.c: Remove inclusions.
* posix/regexec.c: Likewise.
* posix/regex_internal.c: Likewise.
* posix/regex_internal.h: Add inclusions here.
* posix/regex.c: Only include sys/types.h before regex.h. Include
regex_internal.h here. Include regex_internal.c before regcomp.c
and regexec.c (might expose more opportunities to the C compiler).
* posix/regcomp.c (parse_expression): Fix construct rejected by SGI CC.
* posix/regex_internal.h [!_LIBC] (__mempcpy): Fix typo.
[!_LIBC] (__wcrtomb): New definition.
[!_LIBC]: Conditionalize enabling of I18N on HAVE_WCSCOLL and
HAVE_LOCALE_H as well.
2003-02-20 Ulrich Drepper <drepper@redhat.com>
2002-05-21 Isamu Hasegawa <isamu@yamato.ibm.com>
* posix/regex.c: Define `inline' as a macro into nothing for the
compilers which lack the keyword.
* posix/regex.h (RE_SYNTAX_GNU_AWK): Remove RE_CONTEXT_INVALID_OPS
for the compatibility of gawk.
* posix/regcomp.c: Add fake implementation of isblank() for the
environments which lack the function.
Don't use free_charset() in case of non-i18n envs.
(build_range_exp): Don't use i18n related code in case of non-i18n
envs.
(build_collating_symbol): Likewise.
(build_equiv_class): Likewise.
(build_charclass): Likewise.
(re_compile_fastmap_iter): Likewise.
(parse_bracket_exp): Likewise.
(build_word_op): Likewise.
(regfree): Don't use free_charset() in case of non-i18n envs.
* posix/regex_internal.h: Remove COMPLEX_BRACKET from
re_token_type_t in case of non-i18n envs.
Don't define re_charset_t in case of non-i18n envs.
Change the type of wcs of re_string_t from wchar_t to wint_t,
since we store also WEOF.
* posix/regex_internal.c (re_string_realloc_buffers): Change
the type of wcs of re_string_t from wchar_t to wint_t.
(re_string_reconstruct): Likewise.
(create_ci_newstate): Don't use i18n related code in case of
non-i18n envs.
(create_cd_newstate): Likewise.
2002-05-24 Ulrich Drepper <drepper@redhat.com>
* iconv/loop.c: Fix typo.
2002-05-23 Jakub Jelinek <jakub@redhat.com>
* inet/ether_line.c (ether_line): Fix a typo causing only
lower 4 bits of each ethernet address byte being assigned.
Don't modify what line points to.
* inet/tst-ether_aton.c (main): Add ether_line tests.
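
The entry does not show the typo itself.  As an illustration of this
class of bug only, the sketch below combines two hex digits into one
byte; dropping the shifted high digit would leave only the lower 4
bits.  sketch_hex_digit and sketch_parse_hex_byte are hypothetical
helpers, not the ether_line code.

```c
#include <ctype.h>

/* Return the value of hex digit CH, or -1 if CH is not a hex digit.  */
static int
sketch_hex_digit (char ch)
{
  if (ch >= '0' && ch <= '9')
    return ch - '0';
  ch = (char) tolower ((unsigned char) ch);
  if (ch >= 'a' && ch <= 'f')
    return ch - 'a' + 10;
  return -1;
}

/* Parse one or two hex digits at S into a byte value, or return -1
   if S does not start with a hex digit.  */
static int
sketch_parse_hex_byte (const char *s)
{
  int hi = sketch_hex_digit (s[0]);
  if (hi < 0)
    return -1;
  int lo = sketch_hex_digit (s[1]);
  if (lo < 0)
    return hi;          /* Single digit: the value is the digit itself.  */
  /* Both halves must be kept: high digit in bits 4-7, low in 0-3.  */
  return (hi << 4) | lo;
}
```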
2002-05-23 Marcus Brinkmann <marcus@gnu.org>
* manual/filesys.texi: Don't make readlink example leak memory
when readlink fails.
* iconvdata/ibm1129.h: Remove duplicate mappings.
* iconvdata/ibm937.c: Handle overflow errors. Handle new tables.
* iconvdata/ibm937.h: Reorganize table to save a lot of space.
Patch by Masahide Washizawa <WASHI@jp.ibm.com>.
* timezone/zic.c: Fix handling of turnaround times.
Patch by Arthur David Olson <olsona@dc37a.nci.nih.gov>.
2001-12-02 Moshe Olshansky <OLSHANSK@il.ibm.com>
* sysdeps/ieee754/dbl-64/e_remainder.c (__ieee754_remainder): Fix
overflow problem.
2001-12-05 Ulrich Drepper <drepper@redhat.com>
* posix/regex.c: For use outside glibc define bounded pointer
macros here. Patch by Jim Meyering <jim@meyering.net>.
2001-11-26 Ulrich Drepper <drepper@redhat.com>
* stdio-common/vfscanf.c: If incomplete nan or inf(inity) strings
are found call conv_error and not input_error [PR libc/2669].
* math/bits/mathcalls.h: Mark ceil and floor as const.
Reported by David Mosberger.
2001-11-21 Jim Meyering <meyering@lucent.com>
* posix/regex.c (iswctype, mbrtowc, wcslen, wcscoll, wcrtomb) [_LIBC]:
Define to be __-prefixed.
Remove unnecessary duplication in `#ifdef _LIBC' blocks.
2001-10-26 Ulrich Drepper <drepper@redhat.com>
* string/strxfrm.c [USE_IN_EXTENDED_LOCALE_MODEL]: Correctly get
nrules value.
2001-10-24 H.J. Lu <hjl@gnu.org>
* sysdeps/generic/bits/dlfcn.h (DL_CALL_FCT): Cast to void *.
Use __BEGIN_DECLS/__END_DECLS around prototypes.
* sysdeps/mips/bits/dlfcn.h (DL_CALL_FCT): Likewise.
2001-10-21 Jim Meyering <meyering@lucent.com>
* malloc/obstack.c (_): Honor the setting of ENABLE_NLS. Otherwise,
this code would end up calling gettext even in packages built
with --disable-nls.
* posix/getopt.c (_): Likewise.
* posix/regex.c (_): Likewise.
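
A minimal sketch of a translation macro that honors ENABLE_NLS, as
described in the entry above.  The exact definitions in obstack.c,
getopt.c and regex.c are not reproduced here; sketch_no_match_message
is a hypothetical function for the example.

```c
/* Only reference gettext when the package was configured with NLS
   support; otherwise the macro expands to the untranslated string.  */
#if defined ENABLE_NLS && ENABLE_NLS
# include <libintl.h>
# define _(msgid) gettext (msgid)
#else
# define _(msgid) (msgid)
#endif

/* Return a message, translated only if NLS is enabled.  */
static const char *
sketch_no_match_message (void)
{
  return _("No match");
}
```

Built without ENABLE_NLS, this code contains no call to gettext at
all, which is the behavior expected with --disable-nls.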
2001-10-26 Ulrich Drepper <drepper@redhat.com>
* resolv/gethnamaddr.c (gethostbyaddr): Use ip6.addr for reverse
lookup not ip6.int.
* resolv/nss_dns/dns-host.c (_nss_dns_gethostbyaddr_r): Likewise.
Reported by Martin.v.Loewis@t-online.de [PR libc/2598].
2001-10-19 Jakub Jelinek <jakub@redhat.com>
* misc/sys/cdefs.h (__attribute_used__): Define.
* elf/rtld.c (_dl_start): Add __attribute_used__.
* elf/dl-runtime.c (fixup, profile_fixup): Likewise.
2001-08-20 Martin Schwidefsky <schwidefsky@de.ibm.com>
* sysdeps/unix/sysv/linux/s390/s390-32/sys/ucontext.h: Revert the
change of the gregset_t type.
* sysdeps/unix/sysv/linux/s390/s390-64/sys/ucontext.h: Likewise.
2001-08-20 kaz Kojima <kkojima@rr.iij4u.or.jp>
* sysdeps/unix/sysv/linux/sh/sysdep.S: Align errno.
* posix/regex.c (truncate_wchar): Use wcrtomb not wctomb.
* posix/fnmatch_loop.c: Fix computation of alignment.
2001-08-09 Isamu Hasegawa <isamu@yamato.ibm.com>
* posix/regex.c (wcs_regex_compile): Use appropriate string
to compare with collating element.
Fix the padding for the alignment.
2001-08-09 Isamu Hasegawa <isamu@yamato.ibm.com>
* locale/programs/ld-collate.c (collate_output): Exclude
characters from elem_table.
Reduce if clause to write collating elements correctly.
* posix/Makefile (tests): Add bug-regex5.
* posix/bug-regex5.c: New file.
2001-08-09 Ulrich Drepper <drepper@redhat.com>
2001-07-18 Ulrich Drepper <drepper@redhat.com>
* libio/filedoalloc.c (_IO_file_doallocate): A few more minor
cleanups and improvements.
2001-07-18 Andreas Schwab <schwab@suse.de>
* posix/regex.c (WORDCHAR_P) [WCHAR]: Also return true for the
underscore character.
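
The rule in this entry, sketched as a standalone predicate for wide
characters.  sketch_wordchar_p is a hypothetical function, not the
WORDCHAR_P macro itself, and it assumes that a word character is an
alphanumeric or an underscore.

```c
#include <wchar.h>
#include <wctype.h>

/* Return nonzero if WC is a word constituent: an alphanumeric
   character in the current locale, or the underscore.  */
static int
sketch_wordchar_p (wchar_t wc)
{
  return iswalnum ((wint_t) wc) != 0 || wc == L'_';
}
```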
2001-07-18 Jakub Jelinek <jakub@redhat.com>
* malloc/malloc.c (new_heap): Don't call munmap for zero length.
2001-07-18 Ulrich Drepper <drepper@redhat.com>
* libio/filedoalloc.c (_IO_file_doallocate): Use DEV_TTY_P if
available to determine whether descriptor is for tty before
calling isatty.
* sysdeps/unix/sysv/linux/device-nrs.h: Define DEV_TTY_P.
* sysdeps/generic/device-nrs.h: Likewise.
2001-07-06 Paul Eggert <eggert@twinsun.com>
* manual/argp.texi: Remove ignored LGPL copyright notice; it's
not appropriate for documentation anyway.
* manual/libc-texinfo.sh: "Library General Public License" ->
"Lesser General Public License".
2001-07-06 Andreas Jaeger <aj@suse.de>
* All files under GPL/LGPL version 2: Place under LGPL version
2.1.
* posix/Makefile: Add rules to build and run tst-regex.
2001-06-20 Isamu Hasegawa <isamu@yamato.ibm.com>
* posix/regex.c (FREE_WCS_BUFFERS): New macro to free buffers.
(re_search_2): Invoke convert_mbs_to_wcs and FREE_WCS_BUFFERS.
(wcs_re_match_2_internal): Check whether the wcs buffers need
setting up or not, and skip the setup routine if not needed.
2001-06-26 Isamu Hasegawa <isamu@yamato.ibm.com>
* posix/regex.c (count_mbs_length): Use binary search for
optimization.
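
The entry does not show count_mbs_length itself.  A minimal sketch of
the technique, assuming a sorted table that holds the byte offset at
which each character starts; sketch_offsets and sketch_count_chars are
hypothetical names.

```c
#include <stddef.h>

/* Hypothetical table: character I starts at byte sketch_offsets[I].
   The values must be in ascending order.  */
static const int sketch_offsets[] = { 0, 1, 3, 6 };

/* Return how many of the N characters start before byte BYTE_LEN,
   using binary search instead of a linear scan.  */
static size_t
sketch_count_chars (const int *offsets, size_t n, int byte_len)
{
  size_t lo = 0;
  size_t hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (offsets[mid] < byte_len)
        lo = mid + 1;   /* Character MID starts inside the prefix.  */
      else
        hi = mid;       /* Character MID starts at or after the end.  */
    }
  return lo;
}
```

Each step halves the remaining range, so the lookup takes a
logarithmic number of comparisons in the table size.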
2001-06-27 Ulrich Drepper <drepper@redhat.com>
2001-06-22 Jakub Jelinek <jakub@redhat.com>
* posix/regex.c (regex_compile, re_match_2_internal): Fix comment
typos.
2001-06-01 Wolfram Gloger <wg@malloc.de>
* malloc/malloc.c (malloc_atfork, free_atfork): Use a unique value
ATFORK_ARENA_PTR, not 0, for the thread-specific arena pointer
when malloc_atfork is in use.
2001-05-14 Andreas Jaeger <aj@suse.de>
* sysdeps/i386/fpu/libm-test-ulps: Adjust for new tests.
* math/libm-test.inc (tanh_test): Add testcases for last tanh bug.
2001-05-14 Stephen L Moshier <moshier@mediaone.net>
* sysdeps/ieee754/ldbl-96/s_tanhl.c (__tanhl): Fix sign test.
2001-05-11 Jakub Jelinek <jakub@redhat.com>
* posix/regex.c (re_match_2_internal): Swap mbs_offset and csize
as well if swapping strings.
Make sure stop is not past end of second string.
* posix/bug-regex4.c: New test.
* posix/Makefile (tests): Add bug-regex4.
2001-05-10 Andreas Jaeger <aj@suse.de>
* manual/install.texi (Linux): Clarify that Linux 2.2 is minimal
requirement.
2001-03-29 Ulrich Drepper <drepper@redhat.com>
* math/bits/mathcalls.h: Remove infnan declaration.
2001-03-29 H.J. Lu <hjl@gnu.org>
* include/endian.h: Define BIG_ENDI, LITTLE_ENDI, HIGH_HALF,
and LOW_HALF only if _LIBC is defined and _ISOMAC is not defined.
* stdlib/isomac.c (fmt): Define _LIBC and _ISOMAC.
2001-03-29 Isamu Hasegawa <isamu@yamato.ibm.com>
* posix/regex.c: Fix typo and add a sentinel.
2001-03-29 Ulrich Drepper <drepper@redhat.com>
* sysdeps/unix/sysv/linux/shm_open.c: Open new file always with
O_NOFOLLOW. Suggested by Christoph Roland.
* posix/Makefile: Add rules to build and run bug-regex2.
2001-02-10 Jakub Jelinek <jakub@redhat.com>
* posix/regex.c (convert_mbs_to_wcs): Change is_binary to char *.
(regex_compile): Likewise.
(FREE_VARIABLES): Don't free is_binary1 and is_binary2.
(re_match_2_internal): Use just is_binary instead of two variables.
Use REGEX_TALLOC to allocate it and FREE_VAR to free on failure.
2001-02-09 Ulrich Drepper <drepper@redhat.com>
2001-02-07 Ulrich Drepper <drepper@redhat.com>
* sysdeps/gnu/netinet/tcp.h: Correct values of TCP_ macros.
Patch by Pekka.Pietikainen@cern.ch.
* posix/regex.c: Correct several problems with 64-bit architectures
introduced in the MBS changes.
Patch by Isamu Hasegawa <isamu@yamato.ibm.com>.
2001-02-07 Jakub Jelinek <jakub@redhat.com>
* math/tgmath.h: Only add l suffixes if __NO_LONG_DOUBLE_MATH is
not defined.
* sysdeps/alpha/fpu/bits/mathinline.h: Honour __NO_MATH_INLINES.
2001-02-06 Ulrich Drepper <drepper@redhat.com>
* sysdeps/unix/sysv/linux/ia64/pt-initfini.c: First attempt to fix the
broken code. Patch by Jes Sorensen.