80 Commits

Author SHA1 Message Date
Joseph Myers
a5f0adb39b Fix regex wcrtomb namespace (bug 18496).
The regex code brings in references to wcrtomb, which isn't in all the
standards that contain regex.  This patch makes it call __wcrtomb
instead (in fact some places already called __wcrtomb, so this patch
makes it internally consistent about which name is used).

Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.

	[BZ #18496]
	* posix/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb
	instead of wcrtomb.
2015-06-05 21:31:39 +00:00
Joseph Myers
9dd6b7799a Fix regex wctype namespace (bug 18495).
regcomp brings in references to various wctype functions that aren't
in all the standards including regcomp.  This patch fixes this in the
usual way by using the __* versions of these functions (which already
exist, but some didn't have libc_hidden_proto / libc_hidden_def
before).

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).  (Other wide character
function references from the regex code mean that this patch by itself
doesn't fix any XFAILed linknamespace test failures; further patches
will be needed for that.)

	[BZ #18495]
	* wctype/wcfuncs.c (__iswalnum): Use libc_hidden_def.
	(__iswlower): Likewise.
	* include/wctype.h (__iswalnum): Declare.  Use libc_hidden_proto.
	(__iswlower): Likewise.
	* posix/regcomp.c (re_compile_fastmap_iter): Call __towlower
	instead of towlower.
	* posix/regex_internal.c (build_wcs_upper_buffer): Call __iswlower
	instead of iswlower.  Call __towupper instead of towupper.
	* posix/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum
	instead of iswalnum.
2015-06-05 20:04:47 +00:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Siddhesh Poyarekar
78dd658a02 Check if DEBUG is defined in regex_internal.c
The DEBUG macro is checked for its value in one place and for whether it
is defined in another.  Make this consistent across the two cases and use
the same style that we did in mktime.c, which is to check that the macro
is both defined and set.
2014-08-01 14:24:41 +05:30
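The difference between the two styles of check can be sketched as follows. This is a minimal illustration, not code from regex_internal.c: the two function names are hypothetical, and DEBUG is forced to 0 here only so that the two tests visibly disagree.

```c
/* Force the interesting case: DEBUG is defined, but switched off.  */
#undef DEBUG
#define DEBUG 0

/* Hypothetical helper: tests only whether DEBUG is defined, so it also
   reports "enabled" for -DDEBUG=0.  */
static int
enabled_by_ifdef (void)
{
#ifdef DEBUG
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: tests that DEBUG is defined and nonzero, the
   style the commit message describes.  */
static int
enabled_by_value (void)
{
#if defined DEBUG && DEBUG
  return 1;
#else
  return 0;
#endif
}
```

With DEBUG defined as 0, enabled_by_ifdef returns 1 while enabled_by_value returns 0, which is the kind of inconsistency the commit removes.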
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Stanislav Brabec
71b5d1c5d5 [BZ #13637]
* posix/regex_internal.c (re_string_skip_chars): Fix miscomputation
	of remain_len that may cause incomplete multi-byte character and
	false match.
	* posix/bug-regex33.c: New file.
	* posix/Makefile (tests): Add bug-regex33.
2012-02-28 16:16:45 +01:00
Paul Eggert
59ba27a63a Replace FSF snail mail address with URLs. 2012-02-09 23:18:22 +00:00
Jakub Jelinek
2ba92745c3 Fix up regcomp/regexec
The problem is that parse_bracket_symbol is miscompiled, and it turns
out it is because of an incorrect attribute on re_string_fetch_byte_case.
Unlike re_string_peek_byte_case, this one is really not pure: it modifies memory
(increments pstr->cur_idx).  With the pure attribute GCC assumed it doesn't, and
cached the presumed value of regexp->cur_idx in a variable across the following loop:
 for (;; ++i)
   {
     if (i >= BRACKET_NAME_BUF_SIZE)
       return REG_EBRACK;
     if (token->type == OP_OPEN_CHAR_CLASS)
       ch = re_string_fetch_byte_case (regexp);
     else
       ch = re_string_fetch_byte (regexp);
     if (re_string_eoi(regexp))
       return REG_EBRACK;
     if (ch == delim && re_string_peek_byte (regexp, 0) == ']')
       break;
     elem->opr.name[i] = ch;
   }
2011-12-30 17:13:56 -05:00
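The distinction the commit above draws between a peek function and a fetch function can be sketched in miniature. Everything here is an assumption made for illustration: struct cursor and the cursor_* functions are hypothetical stand-ins, not the real re_string_t API, and __attribute__ ((pure)) is a GCC/Clang extension.

```c
#include <stddef.h>

/* Hypothetical miniature of the regex input cursor.  */
struct cursor
{
  const unsigned char *buf;
  size_t len;
  size_t cur_idx;
};

/* Only reads memory, so declaring it pure is legitimate: the compiler
   may merge repeated calls made with unchanged arguments.  */
static __attribute__ ((pure)) unsigned char
cursor_peek_byte (const struct cursor *c)
{
  return c->buf[c->cur_idx];
}

/* Advances cur_idx, i.e. writes to memory.  It must not be declared
   pure; otherwise the compiler may assume cur_idx is unchanged across
   the call and keep a stale copy in a register.  */
static unsigned char
cursor_fetch_byte (struct cursor *c)
{
  return c->buf[c->cur_idx++];
}

/* Copies bytes up to DELIM into NAME and NUL-terminates it.  Returns
   the number of bytes copied, or -1 if the input or NAME runs out.  */
static long
cursor_read_until (struct cursor *c, unsigned char delim,
                   char *name, size_t name_size)
{
  for (size_t i = 0; ; ++i)
    {
      if (i + 1 >= name_size || c->cur_idx >= c->len)
        return -1;
      unsigned char ch = cursor_fetch_byte (c);
      if (ch == delim)
        {
          name[i] = '\0';
          return (long) i;
        }
      name[i] = (char) ch;
    }
}
```

The loop depends on cur_idx advancing on every fetch, so an incorrect pure annotation on the fetch function would let the compiler break it in the way the commit message describes.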
Ulrich Drepper
5e2b63c658 Fix warnings in regex 2011-11-12 01:23:45 -05:00
Ulrich Drepper
8887a920a4 Fix unnecessary overallocation due to incomplete character
When incomplete characters are found at the end of a string the
code ran amok and allocated lots of memory.  Stricter limits
are now in place.
2011-05-28 17:14:30 -04:00
Ulrich Drepper
5ddf954cf1 Simplify test in re_string_skip_chars. 2010-01-22 10:22:53 -08:00
Ulrich Drepper
4f08104cbf regex_internal.c: don't assume WEOF fits in wchar_t 2010-01-22 10:17:45 -08:00
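A minimal sketch of why WEOF is kept in a wint_t rather than a wchar_t, using the standard btowc function. The helper name widen_byte is hypothetical and does not appear in regex_internal.c.

```c
#include <wchar.h>

/* Hypothetical helper: widens one byte in the current locale.
   Returns 1 and stores the wide character on success, 0 if the byte
   is not a valid single-byte character.  */
static int
widen_byte (unsigned char byte, wchar_t *out)
{
  /* btowc returns wint_t so that WEOF stays distinguishable from every
     valid wide character.  Storing the result in a wchar_t before
     testing it could lose that distinction on some platforms.  */
  wint_t wi = btowc (byte);
  if (wi == WEOF)
    return 0;
  *out = (wchar_t) wi;
  return 1;
}
```

The conversion to wchar_t happens only after the WEOF test has passed.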
Ulrich Drepper
0dae5d4ec1 regex_internal.c: remove useless variable and the code to set it. 2010-01-22 09:57:30 -08:00
Ulrich Drepper
2236464488 Extend overflow detection in re_dfa_add_node. 2010-01-22 09:48:35 -08:00
Ulrich Drepper
54dd0ab31f regex: avoid internal re_realloc overflow 2010-01-22 09:33:01 -08:00
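The kind of overflow check these commits add around reallocation can be sketched as below. This is an illustration under assumptions, not the re_realloc macro itself: grow_array is a hypothetical helper, and it rejects zero sizes to avoid the implementation-defined behavior of realloc with size 0.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *ITEMS to hold NEW_COUNT elements of
   ELEM_SIZE bytes each.  Returns 0 on success; returns -1 and leaves
   *ITEMS untouched if the byte count would overflow size_t or the
   allocation fails.  */
static int
grow_array (void **items, size_t new_count, size_t elem_size)
{
  if (new_count == 0 || elem_size == 0)
    return -1;
  /* Test before multiplying: new_count * elem_size must fit in size_t.  */
  if (new_count > SIZE_MAX / elem_size)
    return -1;
  void *p = realloc (*items, new_count * elem_size);
  if (p == NULL)
    return -1;
  *items = p;
  return 0;
}
```

Dividing SIZE_MAX by the element size avoids performing the multiplication that might wrap around.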
Ulrich Drepper
2da42bc065 Fix a few more cases of ignored return values in regex. 2010-01-15 12:03:16 -08:00
Ulrich Drepper
b3918c7d7f * posix/regcomp.c (re_compile_fastmap_iter): Use __mbrtowc.
* posix/regex_internal.c (build_wcs_buffer, build_wcs_upper_buffer,
	re_string_skip_chars, re_string_reconstruct): Likewise.
	* posix/regex_internal.h [!_LIBC] (__mbrtowc): New #define.
2009-01-08 00:23:09 +00:00
Ulrich Drepper
0caca71ac9 * string/Makefile (distribute): Add str-two-way.h.
2008-03-29  Eric Blake	<ebb9@byu.net>

	Rewrite string searches to O(n) rather than O(n^2).
	* string/str-two-way.h: New file.  For linear fixed-allocation
	string searching.
	* string/memmem.c: New implementation.
	* string/strstr.c: New implementation.
	* string/strcasestr.c: New implementation.

	* sysdeps/posix/getaddrinfo.c (getaddrinfo): Call _res_hconf_init
2008-05-15 04:42:20 +00:00
Ulrich Drepper
ba40cc1540 [BZ #3155]
2006-09-07  Jakub Jelinek  <jakub@redhat.com>
	[BZ #3155]
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S (__lrint): Don't access
	stack below r1.

	* posix/regex_internal.c (re_string_reconstruct): Handle
	offset < pstr->valid_raw_len && pstr->offsets_needed case.
	Ensure no bytes read before raw_mbs array.  Pass a saved copy of
	pstr->valid_len - 1 rather than pstr->valid_raw_len - 1 to
	re_string_context_at.
	* posix/Makefile: Add rules to build and run bug-regex26 test.
	* posix/bug-regex26.c: New test.

	* dlfcn/Makefile (LDLIBS-bug-atexit3-lib.so): Add ld.so.
2006-09-07 13:50:31 +00:00
Ulrich Drepper
33e63e7993 * posix/regex_internal.c (re_string_skip_chars): If no character has
been converted at all, set *last_wc to WEOF.  If mbrtowc failed, set wc
	to the byte which couldn't be converted.
	(re_string_reconstruct): Don't clear valid_raw_len before calling
	re_string_skip_chars.  If wc is WEOF after re_string_skip_chars, set
	tip_context using re_string_context_at.
	* posix/Makefile: Add rules to build and run bug-regex25 test.
	* posix/bug-regex25.c: New test.
2006-06-04 04:59:36 +00:00
Andreas Jaeger
4f7e7f8e00 [BZ #1950, BZ #2153]
Update.
	[BZ #1950]
	* posix/regex_internal.c (re_string_reconstruct): Adjust for
	build_wcs_upper_buffer change.
	(build_wcs_upper_buffer): Change return type.

	[BZ #2153]
	* math/s_cacosh.c (__cacosh): Do not return a negative
	value. Patch by Wes Loewer <wjltemp-temp01@yahoo.com>.
	* math/s_cacoshl.c (__cacoshl): Likewise.
	* math/s_cacoshf.c (__cacoshf): Likewise.
	* math/libm-test.inc (cacosh_test): Adjust for change.

	* sysdeps/alpha/fpu/libm-test-ulps: Adapt for cacosh test change.
	* sysdeps/hppa/fpu/libm-test-ulps: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Likewise.
	* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
	* sysdeps/m68k/fpu/libm-test-ulps: Likewise.
	* sysdeps/mips/fpu/libm-test-ulps: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
	* sysdeps/s390/fpu/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
	* sysdeps/sh/sh4/fpu/libm-test-ulps: Likewise.
	* sysdeps/sparc/sparc32/fpu/libm-test-ulps: Likewise.
	* sysdeps/sparc/sparc64/fpu/libm-test-ulps: Likewise.
2006-01-15 17:59:52 +00:00
Ulrich Drepper
1ba81cea27 * posix/regexec.c: Finish prototyping of static functions.
* posix/regex_internal.c: Likewise.
2005-10-15 15:23:33 +00:00
Ulrich Drepper
db26cb7576 [BZ #1248]
2005-08-26  Paul Eggert  <eggert@cs.ucla.edu>
	[BZ #1248]
	* posix/regex_internal.h (bitset_not, bitset_merge, bitset_not_merge,
	bitset_mask, re_string_allocate, re_string_construct,
	re_string_reconstruct, re_string_destruct, re_string_elem_size_at,
	re_string_char_size_at, re_string_wchar_at, re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_alloc, re_node_set_init_1,
	re_node_set_init_2, re_node_set_init_copy, re_node_set_add_intersect,
	re_node_set_init_union, re_node_set_merge, re_node_set_insert,
	re_node_set_insert_last, re_node_set_compare, re_node_set_contains,
	re_node_set_remove_at, re_dfa_add_node, re_acquire_state,
	re_acquire_state_context): Remove unnecessary forward decls.
	(re_string_char_size_at, re_string_wchar_at, re_string_elem_size_at):
	Put __attribute at function definition, now that the function decl
	has been removed.
	* posix/regex_internal.c (re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains):
	Likewise.
2005-10-13 21:10:24 +00:00
Ulrich Drepper
e2f5526407 [BZ #1231]
2005-08-23  Paul Eggert  <eggert@cs.ucla.edu>
	[BZ #1231]
	* posix/regex_internal.c (re_string_skip_chars, register_state,
	calc_state_hash): Remove forward decls.
	* posix/regexec.c (acquire_init_state_context, check_halt_node_context,
	proceed_next_node, pop_fail_stack, sub_epsilon_src_nodes,
	clean_state_log_if_needed): Likewise.

	* posix/regex.c: No need to use K&R definitions for static functions.
	* posix/regex_internal.c: Likewise.
2005-10-13 20:08:58 +00:00
Ulrich Drepper
997470b3e1 [BZ #281]
* posix/regex.h: Define RE_TRANSLATE_TYPE as unsigned char *.
	* posix/regcomp.c: Remove unnecessary uses of
	unsigned RE_TRANSLATE_TYPE.
	* posix/regex_internal.h: Likewise.
	* posix/regex_internal.c: Likewise.
	* posix/regexec.c: Likewise.
	Based on a patch by Stepan Kasal <kasal@ucw.cz>.
2005-09-23 06:11:29 +00:00
Ulrich Drepper
01ed6ceb7c * posix/regex_internal.c (re_string_reconstruct): Avoid calling
mbrtowc for very simple UTF-8 case.

2005-09-01  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex_internal.c (build_wcs_upper_buffer): Fix portability
	bugs in int versus size_t comparisons.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex_internal.c (re_acquire_state): Make DFA pointer arg
	a pointer-to-const.
	(re_acquire_state_context): Likewise.
	* posix/regex_internal.h: Adjust prototypes.

2005-08-31  Jim Meyering  <jim@meyering.net>

	* posix/regcomp.c (search_duplicated_node): Make first pointer arg
	a pointer-to-const.
	* posix/regex_internal.c (create_ci_newstate, create_cd_newstate,
	register_state): Likewise.
	* posix/regexec.c (search_cur_bkref_entry, check_dst_limits):
	(check_dst_limits_calc_pos_1, check_dst_limits_calc_pos):
	(group_nodes_into_DFAstates): Likewise.

	* posix/regexec.c (re_search_internal): Simplify update of
	rm_so and rm_eo by replacing "if (A == B) A += C - B;"
	with the equivalent of "if (A == B) A = C;".

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regcomp.c (re_compile_internal): Change third parameter type
	to size_t.
	(init_dfa): Likewise.  Make sure that arithmetic on pat_len doesn't
	overflow.
	* posix/regex_internal.h (struct re_dfa_t): Change type of nodes_alloc
	and nodes_len to size_t.
	* posix/regex_internal.c (re_dfa_add_node): Use size_t as type for
	new_nodes_alloc.  Check for overflow.

2005-08-31  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (re_compile_fastmap_iter, init_dfa, init_word_char):
	(optimize_subexps, lower_subexp):
	Don't assume 1<<31 has defined behavior on hosts with 32-bit int,
	since the signed shift might overflow.  Use 1u<<31 instead.
	* posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain):
	Likewise.
	* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise.
	(check_subexp_matching_top): Likewise.
	* posix/regcomp.c (optimize_subexps, lower_subexp):
	Use CHAR_BIT rather than 8, for clarity.
	* posix/regexec.c (check_dst_limits_calc_pos_1):
	(check_subexp_matching_top): Likewise.
	* posix/regcomp.c (init_dfa): Make table_size unsigned, so that we
	don't have to worry about portability issues when shifting it left.
	Remove no-longer-needed test for table_size > 0.
	* posix/regcomp.c (parse_sub_exp): Do not shift more bits than there
	are in a word, as the resulting behavior is undefined.
	* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise;
	in one case, a <= should have been an <, and in another case the
	whole test was missing.
	* posix/regex_internal.h (BYTE_BITS): Remove.  All uses changed to
	the standard name CHAR_BIT.
2005-09-07 01:15:33 +00:00
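The shift-related items in the commit above (1u<<31 instead of 1<<31, CHAR_BIT instead of 8) can be sketched with a small bit set. The names below are hypothetical and are not the bitset_set, bitset_clear and bitset_contain macros from regex_internal.h.

```c
#include <limits.h>

/* Bits per word, computed from CHAR_BIT rather than a hard-coded 8.  */
#define WORD_BITS (sizeof (unsigned int) * CHAR_BIT)
#define SET_BITS 256
#define SET_WORDS ((SET_BITS + WORD_BITS - 1) / WORD_BITS)

/* Sets bit I.  The literal is 1u, not 1: with a 32-bit int, 1 << 31
   overflows a signed int and is undefined, while 1u << 31 is
   well defined.  */
static void
bitset_set_bit (unsigned int *set, unsigned int i)
{
  set[i / WORD_BITS] |= 1u << (i % WORD_BITS);
}

/* Clears bit I.  */
static void
bitset_clear_bit (unsigned int *set, unsigned int i)
{
  set[i / WORD_BITS] &= ~(1u << (i % WORD_BITS));
}

/* Returns 1 if bit I is set, 0 otherwise.  */
static int
bitset_has_bit (const unsigned int *set, unsigned int i)
{
  return (int) ((set[i / WORD_BITS] >> (i % WORD_BITS)) & 1u);
}
```

Because the shift count is always reduced modulo WORD_BITS, it never reaches the width of the word, which is the other undefined case the ChangeLog mentions.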
Ulrich Drepper
2d87db5b53 * posix/regex_internal.h (re_sub_match_top_t): Remove unused member
next_last_offset.
	(struct re_dfa_t): Remove unused member states_alloc.
	* posix/regcomp.c (init_dfa): Don't initialize unused members.

2005-08-25  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (set_regs): Don't alloca with an unbounded size.

	alloca modernization/simplification for regex.
	* posix/regex.c: Remove portability cruft for alloca.  This no longer
	needs to be at the start of the file, and can be moved into
	regex_internal.h and simplified.
	* posix/regex_internal.h: Include <alloca.h>.
	(__libc_use_alloca) [!defined _LIBC]: New macro.
	* posix/regexec.c (build_trtable): Remove "#ifdef _LIBC",
	since the code now works outside glibc.
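
	A rough sketch of bounding alloca use, in the spirit of the entries
	above.  It is not the glibc code: count_distinct, cmp_int and
	SCRATCH_STACK_LIMIT are hypothetical, and the 4096-byte threshold is
	an arbitrary assumption standing in for __libc_use_alloca.

```c
#include <alloca.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold above which scratch space goes on the heap.  */
#define SCRATCH_STACK_LIMIT 4096

static int
cmp_int (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Hypothetical helper: counts the distinct values in V[0..N).
   Returns -1 on overflow or allocation failure.  */
static long
count_distinct (const int *v, size_t n)
{
  if (n == 0)
    return 0;
  if (n > SIZE_MAX / sizeof (int))
    return -1;
  size_t bytes = n * sizeof (int);

  /* Use the stack only for small, bounded sizes; otherwise malloc.  */
  int on_heap = bytes > SCRATCH_STACK_LIMIT;
  int *tmp;
  if (on_heap)
    tmp = malloc (bytes);
  else
    tmp = alloca (bytes);
  if (tmp == NULL)
    return -1;

  memcpy (tmp, v, bytes);
  qsort (tmp, n, sizeof (int), cmp_int);
  long distinct = 1;
  for (size_t i = 1; i < n; ++i)
    if (tmp[i] != tmp[i - 1])
      ++distinct;

  if (on_heap)
    free (tmp);
  return distinct;
}
```

	Only the heap branch needs a matching free; the alloca branch is
	released automatically when the function returns.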

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* include/regex.h: Remove use of _RE_ARGS.

2005-08-25  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (find_recover_state): Change "err" to "*err".

2005-08-24  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (regerror): Pointer args are 'restrict',
	as per POSIX.
	* posix/regex.h (regerror): Likewise.
	* manual/pattern.texi (POSIX Regexp Compilation): Likewise.
	Similarly for regcomp and regexec.  Also, first 2 args of regexec
	and 2nd arg of regerror are const.

	* posix/regex.c: Do not include <sys/types.h>, as POSIX no longer
	requires this.  (The code never needed it.)

2005-08-20  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (sift_states_bkref): re_node_set_insert returns
	int, not reg_errcode_t.

	* posix/regex_internal.c (calc_state_hash): Put 'inline' before type,
	since some broken compilers warn about it otherwise.

	* posix/regcomp.c (create_initial_state): Remove duplicate decl.

2005-08-20  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex.h (_RE_ARGS): Remove.  No longer needed, since we assume
	C89 or better.  All uses removed.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex.c: Prevent using C++ compilers.

2005-08-19  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (duplicate_node): Return new index, not an error
	code, and let the caller return REG_ESPACE if out of space.  This
	removes an uninitialized-variable warning with GCC 4.0.1, and also
	avoids taking the address of a local variable.  All callers
	changed.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* include/time.h (__strptime_internal): Rename parameter to avoid
	bogus compiler warning.

2005-08-19  Jim Meyering  <jim@meyering.net>

	* posix/regexec.c (proceed_next_node): Redo local variables to
	avoid GCC shadowing warnings.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex_internal.c (re_acquire_state): Minor code rearrangement.
	(re_acquire_state_context): Likewise.

2005-08-19  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex_internal.c (re_string_realloc_buffers):
	(re_node_set_insert, re_node_set_insert_last, re_dfa_add_node):
	Rename local variables to avoid GCC shadowing warnings.

2005-07-08  Eric Blake  <ebb9@byu.net>
            Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (init_dfa): Store __btowc value in wint_t, not
	wchar_t.  Remove now-unnecessary cast.
	(build_range_exp): Likewise.
2005-09-06 21:15:13 +00:00
Ulrich Drepper
ec73fd87da * posix/regex_internal.c (build_wcs_buffer): Use MB_LEN_MAX not
MB_CUR_MAX.
	(build_wcs_upper_buffer): Likewise.
2005-07-05 22:01:42 +00:00
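One plausible reason for preferring MB_LEN_MAX is that it is a compile-time constant, whereas MB_CUR_MAX depends on the current locale. A minimal sketch, assuming a hypothetical helper wide_to_bytes that is not part of regex_internal.c:

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: stores the multibyte form of WC in OUT, which
   has room for OUT_SIZE bytes.  Returns the number of bytes written,
   or 0 if WC cannot be represented or does not fit.  */
static size_t
wide_to_bytes (wchar_t wc, char *out, size_t out_size)
{
  /* MB_LEN_MAX is a constant expression, so this is an ordinary
     fixed-size array.  MB_CUR_MAX is evaluated at run time and could
     only size a variable-length array.  */
  char buf[MB_LEN_MAX];
  mbstate_t state;
  memset (&state, 0, sizeof state);

  size_t len = wcrtomb (buf, wc, &state);
  if (len == (size_t) -1 || len > out_size)
    return 0;
  memcpy (out, buf, len);
  return len;
}
```

A buffer of MB_LEN_MAX bytes is large enough for one character in any supported locale, so no run-time sizing is needed.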
Ulrich Drepper
88764ae26a [BZ #779]
2005-03-10  Jakub Jelinek  <jakub@redhat.com>
	* math/test-misc.c (main): Add some more tests.

2005-03-17  Jakub Jelinek  <jakub@redhat.com>

	* posix/regcomp.c (re_compile_fastmap_iter): Fix check for failed
	__wcrtomb.  Check return values of other __wcrtomb calls.
	* posix/regex_internal.c (build_wcs_buffer, re_string_skip_chars):
	Change mbclen type to size_t.
	(build_wcs_upper_buffer): Change mbclen and mbcdlen type to size_t.
	Handle mb chars whose upper case doesn't have multibyte representation
	in locale's charset.
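
	The reason for holding conversion lengths in size_t can be sketched
	with the standard mbrtowc function, which reports errors as
	(size_t) -1 and (size_t) -2.  The helper count_chars is hypothetical
	and is not one of the functions named above.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the characters in the multibyte buffer
   S[0..LEN) under the current locale.  Returns -1 if the buffer
   contains an invalid or truncated sequence.  */
static long
count_chars (const char *s, size_t len)
{
  mbstate_t state;
  memset (&state, 0, sizeof state);

  long count = 0;
  size_t pos = 0;
  while (pos < len)
    {
      wchar_t wc;
      /* size_t, not int: the error values are (size_t) -1 (invalid
         sequence) and (size_t) -2 (incomplete sequence).  */
      size_t mbclen = mbrtowc (&wc, s + pos, len - pos, &state);
      if (mbclen == (size_t) -1 || mbclen == (size_t) -2)
        return -1;
      if (mbclen == 0)
        mbclen = 1;   /* mbrtowc returns 0 for an embedded NUL byte.  */
      pos += mbclen;
      ++count;
    }
  return count;
}
```

	Comparing against the cast constants keeps the error test correct
	regardless of the width of int.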

2005-03-15  Jakub Jelinek  <jakub@redhat.com>

	* malloc/malloc.c (_int_icalloc, _int_icomalloc, iALLOc,
	public_iCALLOc, public_iCALLOc, public_iCOMALLOc): Protect with
	#ifndef _LIBC.

	[BZ #779]
	* malloc/malloc.c (public_mTRIm): Initialize malloc if not yet
	initialized.

2005-03-10  Jakub Jelinek  <jakub@redhat.com>

	* misc/sys/cdefs.h (__always_inline): Define.
	* posix/bits/unistd.h (read, pread, pread64, readlink, getcwd, getwd):
	Use __always_inline instead of __inline.
	* socket/bits/socket2.h (recv, recvfrom): Likewise.
	* libio/bits/stdio2.h (gets, fgets, fgets_unlocked): Likewise.
	* string/bits/string3.h (__memcpy_ichk, __memmove_ichk, __mempcpy_ichk,
	__memset_ichk, __strcpy_ichk, __stpcpy_ichk, __strncpy_ichk,
	__strcat_ichk, __strncat_ichk): Use __always_inline instead of
	__inline__ __attribute__ ((__always_inline__)).

2005-03-09  Jakub Jelinek  <jakub@redhat.com>

	* debug/tst-chk1.c: Include sys/socket.h and sys/un.h.
	(do_test): Add new tests for recv, recvfrom, getcwd, getwd and
	readlink.  Add some more tests for read, pread, pread64, fgets and
	fgets_unlocked.

	* posix/bits/unistd.h (read, pread, pread64, readlink,
	getcwd, getwd): Change macros into extern inline functions.
	(__read_alias, __pread_alias, __pread64_alias, __readlink_alias,
	__getcwd_alias, __getwd_alias): New prototypes.
	* socket/bits/socket2.h (recv, recvfrom): Change macros into
	extern inline functions.
	(__recv_alias, __recvfrom_alias): New prototypes.
	* libio/bits/stdio2.h (gets, fgets, fgets_unlocked): Change macros
	into extern inline functions.
	(__gets_alias, __fgets_alias, __fgets_unlocked_alias): New prototypes.

	* debug/pread_chk.c (__pread_chk): Fix order of arguments passed
	to __pread.
	* debug/pread64_chk.c (__pread64_chk): Fix order of arguments passed
	to __pread64.
2005-03-19 00:28:51 +00:00
Ulrich Drepper
1c99f950d1 * posix/regexec.c (check_node_accept_bytes): Correct cast to avoid
warning.
	* posix/regex_internal.c (re_string_reconstruct): Add cast to
	avoid warning.
	(build_wcs_upper_buffer): Change type of buf to plain char.
	* locale/weightwc.h (findidx): Add casts to avoid warnings.
	* time/mktime.c (ranged_convert): Initialize tm to make the
	compiler happy.
	* wcsmbs/mbsrtowcs_l.c (__mbsrtowcs_l): Add casts to avoid warnings.
	* wcsmbs/wcsnrtombs.c (__wcsnrtombs): Add casts to avoid warnings.
	* wcsmbs/mbsnrtowcs.c: Add casts to avoid warnings.
	* wcsmbs/wcsrtombs.c (__wcsrtombs): Add casts to avoid warnings.
	* wcsmbs/wcrtomb.c (__wcrtomb): Add casts to avoid warnings.
	* wcsmbs/mbrtowc.c (__mbrtowc): Use unsigned char for outbuf.
	* posix/regex_internal.c [_LIBC] (build_wcs_buffer): Avoid using
	dynamically sized array.
	(build_wcs_upper_buffer): Likewise.
2005-03-06 07:27:56 +00:00
Ulrich Drepper
963d8d782f [BZ #558]
Update.
2005-01-27  Paolo Bonzini  <bonzini@gnu.org>

	[BZ #558]
	* posix/regcomp.c (calc_inveclosure): Return reg_errcode_t.
	Initialize the node sets in dfa->inveclosures.
	(analyze): Initialize inveclosures only if it is needed.
	Check errors from calc_inveclosure.
	* posix/regex_internal.c (re_dfa_add_node): Do not initialize
	the inveclosure node set.
	* posix/regexec.c (re_search_internal): If nmatch includes unused
	subexpressions, reset them to { rm_so: -1, rm_eo: -1 } here.

	* posix/regcomp.c (parse_bracket_exp) [!RE_ENABLE_I18N]:
	Do build a SIMPLE_BRACKET token.

	* posix/regexec.c (transit_state_mb): Do not examine nodes
	where ACCEPT_MB is not set.
2005-01-27 19:08:10 +00:00
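The behavior for unused subexpressions mentioned in the commit above can be observed through the public POSIX interface. This is a usage sketch only; match_groups is a hypothetical wrapper around regcomp, regexec and regfree.

```c
#include <regex.h>

/* Hypothetical wrapper: compiles PATTERN as an extended regular
   expression, matches it against TEXT and fills PMATCH[0..NMATCH).
   Returns 0 on a match, a nonzero regex error code otherwise.  */
static int
match_groups (const char *pattern, const char *text,
              regmatch_t *pmatch, size_t nmatch)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;
  rc = regexec (&re, text, nmatch, pmatch, 0);
  regfree (&re);
  return rc;
}
```

Matching "(a)|(b)" against "b" with nmatch set to 4 leaves rm_so and rm_eo at -1 both for the group that did not participate and for the slot beyond the last subexpression.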
Ulrich Drepper
02f3550c8b [BZ #605, BZ #611]
Update.
2004-12-13  Paolo Bonzini  <bonzini@gnu.org>

	Separate parsing and creation of the NFA.  Avoided recursion on
	the (very unbalanced) parse tree.
	[BZ #611]
	* posix/regcomp.c (struct subexp_optimize, analyze_tree, calc_epsdest,
	re_dfa_add_tree_node, mark_opt_subexp_iter): Removed.
	(optimize_subexps, duplicate_tree, calc_first, calc_next,
	mark_opt_subexp): Rewritten.
	(preorder, postorder, lower_subexps, lower_subexp, link_nfa_nodes,
	create_token_tree, free_tree, free_token): New.
	(analyze): Accept a regex_t *.  Invoke the passes via the preorder and
	postorder generic visitors.  Do not initialize the fields in the
	re_dfa_t that represent the transitions.
	(free_dfa_content): Use free_token.
	(re_compile_internal): Analyze before UTF-8 optimizations.  Do not
	include optimization of subexpressions.
	(create_initial_state): Fetch the DFA node index from the first node's
	bin_tree_t *.
	(optimize_utf8): Abort on unexpected nodes, including OP_DUP_QUESTION.
	Return on COMPLEX_BRACKET.
	(duplicate_node_closure): Fix comment.
	(duplicate_node): Do not initialize the fields in the
	re_dfa_t that represent the transitions.
	(calc_eclosure, calc_inveclosure): Do not handle OP_DELETED_SUBEXP.
	(create_tree): Remove final argument.  All callers adjusted.  Rewritten
	to use create_token_tree.
	(parse_reg_exp, parse_branch, parse_expression, parse_bracket_exp,
	build_charclass_op): Use create_tree or create_token_tree instead
	of re_dfa_add_tree_node.
	(parse_dup_op): Likewise.  Also free the tree using free_tree for
	"<re>{0}", and lower OP_DUP_QUESTION to OP_ALT: "a?" is equivalent
	to "a|".  Adjust invocation of mark_opt_subexp.
	(parse_sub_exp): Create a single SUBEXP node.
	* posix/regex_internal.c (re_dfa_add_node): Remove last parameter,
	always perform as if it was 1.  Do not initialize OPT_SUBEXP and
	DUPLICATED, and initialize the DFA fields representing the transitions.
	* posix/regex_internal.h (re_dfa_add_node): Adjust prototype.
	(re_token_type_t): Move OP_DUP_PLUS and OP_DUP_QUESTION to the tokens
	section.  Add a tree-only code SUBEXP.  Remove OP_DELETED_SUBEXP.
	(bin_tree_t): Include a full re_token_t for TOKEN.  Turn FIRST and
	NEXT into pointers to trees.  Remove ECLOSURE.

2004-12-28  Paolo Bonzini  <bonzini@gnu.org>

	[BZ #605]
	* posix/regcomp.c (parse_bracket_exp): Do not modify DFA nodes
	that were already created.
	* posix/regex_internal.c (re_dfa_add_node): Set accept_mb field
	in the token if needed.
	(create_ci_newstate, create_cd_newstate): Set accept_mb field
	from the tokens' field.
	* posix/regex_internal.h (re_token_t): Add accept_mb field.
	(ACCEPT_MB_NODE): Removed.
	* posix/regexec.c (proceed_next_node, transit_states_mb,
	build_sifted_states, check_arrival_add_next_nodes): Use
	accept_mb instead of ACCEPT_MB_NODE.
2005-01-26 22:42:49 +00:00
Ulrich Drepper
5cf53cc208 Update.
2005-01-06  Ulrich Drepper  <drepper@redhat.com>

	* misc/sys/cdefs.h: Define __wur.
	* libio/stdio.h: Use __wur for a number of interfaces.
	* posix/unistd.h: Likewise.

	* posix/regex_internal.c (free_state): Free word_trtable.
2005-01-06 21:01:25 +00:00
Ulrich Drepper
a334319f65 (CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4. 2004-12-22 20:10:10 +00:00
Jakub Jelinek
0ecb606cb6 2.5-18.1 2007-07-12 18:26:36 +00:00
Ulrich Drepper
5cf1ec5256 Update.
2004-12-07  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (proceed_next_node): Simplify treatment of epsilon
	nodes.  Pass the pushed node to push_fail_stack.
	(push_fail_stack): Accept a single node rather than an array
	of two epsilon destinations.
	(build_sifted_states): Only walk non-epsilon nodes.
	(check_arrival): Don't pass epsilon nodes to
	check_arrival_add_next_nodes.
	(check_arrival_add_next_nodes) [DEBUG]: Abort if an epsilon node is
	found.
	(check_node_accept): Do expensive checks later.
	(add_epsilon_src_nodes): Cache result of merging the inveclosures.
	* posix/regex_internal.h (re_dfastate_t): Add non_eps_nodes and
	inveclosure.
	(re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at,
	re_string_context_at, re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains):
	Declare as pure.
	* posix/regex_internal.c (create_newstate_common): Remove.
	(register_state): Move part of it here.  Initialize non_eps_nodes.
	(free_state): Free inveclosure and non_eps_nodes.
	(create_cd_newstate, create_ci_newstate): Allocate the new
	re_dfastate_t here.
2004-12-10 04:37:58 +00:00
Ulrich Drepper
20dc2f79f7 Update.
2004-11-23  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regcomp.c (analyze_tree): Always call calc_epsdest.
	(calc_inveclosure): Use re_node_set_insert_last.
	(parse_dup_op): Lower X{1,5} to (X(X(X(XX?)?)?)?)?
	rather than X?X?X?X?X?.
	* posix/regex_internal.h (re_node_set_insert_last): New declaration.
	* posix/regex_internal.c (re_node_set_insert_last): New function.
	* posix/PCRE.tests: Add testcases.
2004-11-25 22:32:18 +00:00
Ulrich Drepper
bb677c9581 Update.
2004-11-09  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (transit_state): Remove the check for
	out-of-bounds buffers.
	(check_matching): Check here for out-of-bounds buffers.
	(re_search_internal): Store into match_kind a set of bits
	indicating which incantation of fastmap scanning must be
	used.  Use a switch statement instead of multiple ifs.
	Exit the final "for (;;)" with goto free_return unless
	the match succeeded, thus simplifying some conditionals.

	* posix/regex_internal.c (re_string_reconstruct,
	re_string_context_at): Add several branch predictions for
	case-sensitive matching and no transition table being used.

2004-11-10  Ulrich Drepper  <drepper@redhat.com>

	* posix/tst-waitid.c: Don't use error to print error message, they
	won't end up in the .out file.

	* nscd/nscd_getgr_r.c: Likewise.  Make map externally visible.
	* nscd/nscd_gethst_r.c: Likewise.
2004-11-10 15:48:06 +00:00
Ulrich Drepper
e40a38b383 Update.
2004-11-08  Ulrich Drepper  <drepper@redhat.com>

	* posix/regcomp.c (utf8_sb_map): Define.
	(free_dfa_content): Don't free dfa->sb_char if it's a pointer to
	utf8_sb_map.
	(init_dfa): Use utf8_sb_map instead of initializing memory when the
	encoding is UTF-8.

	* posix/regcomp.c (init_dfa): Get the codeset name outside glibc as
	well.  Check if it is spelled UTF8 as well as UTF-8, and check
	case-insensitively.  Set dfa->map_notascii manually when outside
	glibc.
	* posix/regex_internal.c (build_wcs_upper_buffer) [!_LIBC]: Enable
	optimizations based on map_notascii.
	* posix/regex_internal.h [HAVE_LANGINFO_H || HAVE_LANGINFO_CODESET
	|| _LIBC]: Include langinfo.h.
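
	The codeset test described above might look roughly like this
	outside glibc.  It is a sketch under assumptions: codeset_is_utf8
	and locale_is_utf8 are hypothetical names, while nl_langinfo,
	CODESET and strcasecmp are the standard POSIX interfaces.

```c
#include <langinfo.h>
#include <strings.h>

/* Hypothetical helper: accepts both spellings, in any letter case.  */
static int
codeset_is_utf8 (const char *codeset)
{
  return strcasecmp (codeset, "UTF-8") == 0
         || strcasecmp (codeset, "UTF8") == 0;
}

/* Hypothetical helper: asks the current locale for its codeset name.  */
static int
locale_is_utf8 (void)
{
  return codeset_is_utf8 (nl_langinfo (CODESET));
}
```

	The result of locale_is_utf8 depends on the locale selected with
	setlocale, so it is only meaningful after the program has set one.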

	* posix/regex_internal.h (struct re_backref_cache_entry): Add "more"
	field.
	* posix/regexec.c (check_dst_limits): Hoist computation of the source
	and destination bkref_idx out of the loop.  Pass it to
	check_dst_limits_calc_pos.
	(check_dst_limits_calc_pos_1): New function, containing the recursive
	loop of check_dst_limits_calc_pos; uses the "more" field of
	struct re_backref_cache to control the loop.
	(check_dst_limits_calc_pos): Store into "boundaries" the position
	relative to lim's start and end positions.  Do not accept eclosures,
	accept bkref_idx instead.  Call check_dst_limits_calc_pos_1 to do the
	work.
	(sift_states_bkref): Use the "more" field of struct re_backref_cache
	to control the loop.  A big "if" was turned into a continue and the
	function was reindented.
	(get_subexp): Use the "more" field of struct re_backref_cache
	to control the loop.
	(match_ctx_add_entry): Initialize the bkref_ents' "more" field.
	(search_cur_bkref_entry): Return -1 if out of bounds.

	* posix/regexec.c (empty_set): Remove.
	(sift_states_backward): Remove cur_src variable.  Move inner loop
	to build_sifted_states.
	(build_sifted_states): Extract from sift_states_backward.  Do not
	use empty_set.
	(update_cur_sifted_state): Do not use empty_set.  Special case
	dest_nodes->nelem == 0.
2004-11-08 22:49:44 +00:00
Ulrich Drepper
d40eb37aad [BZ #40]
Update.
2004-05-15  Petter Reinholdtsen  <pere@hungry.com>

	* locale/iso-3166.def: Remove YUGOSLAVIA and insert "SERBIA AND
	MONTENEGRO" which have taken over the code 819.  Patch from
	Danilo Segan. [BZ #40]

2004-05-15  Jakub Jelinek  <jakub@redhat.com>

	* sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h
	(SYSCALL_ERROR_HANDLER): Rename __sparc.get_pic.l7 to
	__sparc_get_pic_l7.

2004-05-15  Joseph S. Myers  <jsm@polyomino.org.uk>

	* catgets/gencat.c: Update bug reporting instructions.
	* csu/version.c: Likewise.
	* debug/catchsegv.sh: Likewise.
	* debug/pcprofiledump.c: Likewise.
	* debug/xtrace.sh: Likewise.
	* elf/ldd.bash.in: Likewise.
	* iconv/iconv_prog.c: Likewise.
	* iconv/iconvconfig.c: Likewise.
	* locale/programs/locale.c: Likewise.
	* locale/programs/localedef.c: Likewise.
	* login/programs/pt_chown.c: Likewise.
	* malloc/memusage.sh: Likewise.
	* malloc/memusagestat.c: Likewise.
	* malloc/mtrace.pl: Likewise.
	* manual/crypt.texi: Likewise.
	* manual/install.texi: Likewise.
	* nss/makedb.c: Likewise.

2004-05-14  Jakub Jelinek  <jakub@redhat.com>

	* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Only
	CHECK_STATIC_TLS if sym != NULL.
	* sysdeps/sh/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/i386/dl-machine.h (elf_machine_rela): Likewise.

2004-05-12  Andreas Schwab  <schwab@suse.de>

	* posix/regex_internal.c (build_wcs_buffer): Also set pstr->mbs
	when translating.

2004-05-13  H.J. Lu  <hongjiu.lu@intel.com>

	* Rules (xtests): Depend on tests.
2004-05-17 18:59:35 +00:00
Ulrich Drepper
1756853774 Update.
2004-02-24  Arnold D. Robbins  <arnold@skeeve.com>

	* posix/regex_internal.c (build_wcs_upper_buffer): Enclose
	`offsets_needed' label in `#ifdef _LIBC' to silence `unused label'
	compiler warning.

2004-02-24  Nelson H.F. Beebe  <beebe@math.utah.edu>

	* posix/regex_internal.c (build_wcs_buffer): Add cast to char* in call
	to `wcrtomb'.
	* posix/regex_internal.h (bitset_not, bitset_merge, bitset_not_merge,
	bitset_mask, re_string_char_size_at, re_string_wchar_at,
	re_string_elem_size_at): Change to use prototypes.
	(re_string_char_size_at, re_string_wchar_at,
	re_string_elem_size_at): Declare as `internal_function'.
2004-02-26 01:32:44 +00:00
Ulrich Drepper
f39eef7b5d Update.
2004-01-05  Jakub Jelinek  <jakub@redhat.com>

	* posix/regcomp.c (regcomp): Fix comment typo.
	(regfree): Free preg->translate, clear buffer, allocated, fastmap
	and translate fields.

	* posix/regcomp.c (build_charclass, build_charclass_op): Change first
	argument to unsigned RE_TRANSLATE_TYPE.
	* posix/regex_internal.h (re_string_t): Change trans type to
	unsigned RE_TRANSLATE_TYPE.
	* posix/regex_internal.c (re_string_construct_common): Cast
	trans to unsigned RE_TRANSLATE_TYPE.
	(re_string_peek_byte_case, re_string_fetch_byte_case): Avoid fast
	path if pstr->trans.  Never translate the character through
	pstr->trans.
	* posix/Makefile (tests): Add bug-regex22.
	(bug-regex22-ENV): Set.
	* posix/bug-regex22.c: New test.
2004-01-06 22:12:27 +00:00
Ulrich Drepper
59e7ebcc20 Update.
2004-01-02  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.c (re_node_set_add_intersect,
	re_node_set_merge): Rewritten.
	(re_node_set_insert, re_node_set_remove_at):
	Avoid memmove, we know what direction we should copy and that we
	are copying 32-bit words.
	(re_node_set_compare): Iterate backwards.

	* posix/regex_internal.h (re_match_context_t): Add dfa member.
	* posix/regexec.c (match_ctx_free_subtops, search_cur_bkref_entry,
	match_ctx_add_sublast, sift_ctx_init, acquire_init_state_context,
	prune_impossible_nodes, check_halt_state_context, proceed_next_node,
	sift_states_backward, update_cur_sifted_state, check_dst_limits,
	check_dst_limits_calc_pos, sift_states_bkref, transit_state,
	check_subexp_matching_top, transit_state_sb, transit_state_mb,
	transit_state_bkref, get_subexp, get_subexp_sub, check_arrival,
	check_arrival_add_next_nodes, expand_bkref_cache, check_node_accept):
	Remove dfa parameter.  Get dfa from mctxt.  Adjust callers.
	(re_search_internal): Initialize mctxt.dfa.
2004-01-03 06:56:35 +00:00
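The idea of avoiding memmove when inserting into a sorted node set can be sketched as follows. This is an illustration, not the actual re_node_set_insert: sorted_insert is a hypothetical function, and it assumes the caller has reserved room for one more element and that the element is not already in the set.

```c
#include <stddef.h>

/* Hypothetical helper: inserts ELEM into the ascending array
   SET[0..*NELEM).  The caller guarantees capacity for one more
   element and that ELEM is not already present.  */
static void
sorted_insert (int *set, size_t *nelem, int elem)
{
  size_t i = *nelem;
  /* Copy backwards from the tail, one word at a time.  Source and
     destination overlap, which is why a generic call would have to be
     memmove; here the direction is known, so a plain loop suffices.  */
  while (i > 0 && set[i - 1] > elem)
    {
      set[i] = set[i - 1];
      --i;
    }
  set[i] = elem;
  ++*nelem;
}
```

The loop finds the insertion point and shifts the larger elements in the same pass, so no separate search is needed.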
Ulrich Drepper
56b168be5d Update.
2004-01-02  Jakub Jelinek  <jakub@redhat.com>

	* posix/regex_internal.c (re_node_set_insert): Remove unused variables.

	* posix/regex_internal.h (re_dfa_t): Add syntax field.
	* posix/regcomp.c (parse): Initialize dfa->syntax.
	* posix/regexec.c (acquire_init_state_context,
	prune_impossible_nodes, check_matching, check_halt_state_context,
	proceed_next_node, sift_states_iter_mb, sift_states_backward,
	update_cur_sifted_state, sift_states_bkref, transit_state,
	transit_state_sb, transit_state_mb, transit_state_bkref,
	get_subexp, get_subexp_sub, check_arrival, expand_bkref_cache,
	build_trtable): Remove preg argument, add dfa argument instead
	and remove dfa = preg->buffer initialization in the body.
	Adjust all callers.
	(check_node_accept_bytes, group_nodes_into_DFAstates,
	check_node_accept): Likewise.  Use dfa->syntax instead of
	preg->syntax.
	(check_arrival_add_next_nodes): Remove preg argument.

	* posix/regex_internal.h (re_match_context_t): Make input
	re_string_t instead of a pointer to it.
	* posix/regex_internal.c (re_string_construct_common): Don't clear
	pstr here...
	(re_string_construct): ... but only here.
	* posix/regexec.c (match_ctx_init): Remove input argument.  Don't
	initialize fields to zero.
	(re_search_internal): Move input into mctx.input.
	(acquire_init_state_context, check_matching,
	check_halt_state_context, proceed_next_node,
	clean_state_log_if_needed, sift_states_bkref, sift_states_iter_mb,
	transit_state, transit_state_sb, transit_state_mb,
	transit_state_bkref, get_subexp, check_arrival,
	check_arrival_add_next_nodes, check_node_accept, extend_buffers):
	Change mctx->input into &mctx->input and mctx->input->field into
	mctx->input.field.

2004-01-02  Jakub Jelinek  <jakub@redhat.com>
	    Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.h (re_const_bitset_ptr_t): New type.
	(re_string_t): Add newline_anchor, word_char and word_ops_used fields.
	(re_dfa_t): Change word_char type to bitset.  Add word_ops_used field.
	(re_string_context_at, re_string_reconstruct): Remove last argument.
	* posix/regex_internal.c (re_string_allocate): Initialize
	pstr->word_char and pstr->word_ops_used.
	(re_string_context_at): Remove newline_anchor argument.
	Use input->newline_anchor instead, swap && conditions.
	Only use IS_WIDE_WORD_CHAR if input->word_ops_used != 0.
	Use input->word_char bitmap instead of IS_WORD_CHAR.
	(re_string_reconstruct): Likewise.
	Adjust re_string_context_at caller.
	* posix/regexec.c (acquire_init_state_context,
	check_halt_state_context, transit_state, transit_state_sb,
	transit_state_mb, transit_state_bkref, check_arrival,
	check_node_accept): Adjust re_string_context_at and
	re_string_reconstruct callers.
	(re_search_internal): Likewise.  Set input.newline_anchor.
	(build_trtable): Use dfa->word_char bitmap instead of IS_WORD_CHAR.
	* posix/regcomp.c (init_word_char): Change return type to void.
	Set dfa->word_ops_used.
	(free_dfa_content): Don't free dfa->word_char.
	(parse_expression): Remove error handling for init_word_char.
2004-01-02 21:20:51 +00:00
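
The entry above switches word-character tests from the IS_WORD_CHAR macro to a precomputed word_char bitmap. A minimal sketch of such a per-byte bitmap follows, assuming that a word character is an alphanumeric byte or an underscore. The names byte_bitset, init_word_bitset and is_word_byte are hypothetical and do not reproduce glibc's bitset type or init_word_char.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

#define BITS_PER_WORD ((int) (sizeof (unsigned int) * CHAR_BIT))
/* Enough words to hold one bit for every unsigned char value.  */
#define NWORDS ((UCHAR_MAX + BITS_PER_WORD) / BITS_PER_WORD)

/* Hypothetical bitmap type: one bit per possible byte value.  */
typedef unsigned int byte_bitset[NWORDS];

/* Fill SET once so that bit CH is set exactly when CH is a word
   character (assumed here to mean alphanumeric or '_').  */
static void
init_word_bitset (byte_bitset set)
{
  int ch;

  for (ch = 0; ch < NWORDS; ++ch)
    set[ch] = 0;
  for (ch = 0; ch <= UCHAR_MAX; ++ch)
    if (isalnum (ch) || ch == '_')
      set[ch / BITS_PER_WORD] |= 1u << (ch % BITS_PER_WORD);
}

/* Test a byte against the precomputed bitmap; no ctype call is made
   at lookup time.  */
static int
is_word_byte (const unsigned int *set, unsigned char ch)
{
  return (set[ch / BITS_PER_WORD] >> (ch % BITS_PER_WORD)) & 1u;
}
```

The table is built once and each later lookup is a shift and a mask, which is the usual reason for preferring a bitmap over a repeated classification call in a matching loop.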
Ulrich Drepper
8503c987b6 Update.
2004-01-01  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.h (re_dfastate_t): Fix size of the CONTEXT
	bitfield.

	* posix/regex_internal.c (re_node_set_insert): Rewrite.
2004-01-02 11:08:23 +00:00
Ulrich Drepper
6b6557e8b3 Update.
2003-12-23  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.c (re_dfa_add_node): Initialize opt_subexp.
	* posix/regex_internal.h (re_token_type_t): Put OP_DUP_PLUS
	among the tokens, rather than among the epsilon-transiting nodes.
	(re_token_t): Add the opt_subexp flag.
	* posix/regcomp.c (optimize_utf8, calc_first,
	calc_next, calc_epsdest): Don't consider OP_DUP_PLUS.
	(mark_opt_subexp, mark_opt_subexp_iter): New functions.
	(parse_dup_op): Mostly rewritten, lowering OP_DUP_PLUS to
	OP_DUP_ASTERISK and marking optional subexpressions
	as such using mark_opt_subexp.
	* posix/regexec.c (set_regs): Initialize PREV_INDEX_MATCH
	and pass it to update_regs.
	(update_regs): Use the PREV_INDEX_MATCH parameter, together
	with the opt_subexp flag, in order to discard a final empty
	match of a repeated subexpression.
	* posix/BOOST.tests: Adjust test vectors.
	* posix/PCRE.tests: Likewise.
	* posix/rxspencer/tests: Likewise.

2003-12-17  Paolo Bonzini  <bonzini@gnu.org>
2003-12-16  Paolo Bonzini  <bonzini@gnu.org>
2003-12-17  Paolo Bonzini  <bonzini@gnu.org>
2003-12-16  Jakub Jelinek  <jakub@redhat.com>
2003-04-06  Kaz Kojima  <kkojima@rr.iij4u.or.jp>
2003-02-20  Paolo Bonzini  <bonzini@gnu.org>
2003-01-12  Franz Sirl  <Franz.Sirl-kernel@lauterbach.com>
2003-01-09  Richard Henderson  <rth@redhat.com>
2003-01-09  Richard Henderson  <rth@redhat.com>
2003-01-03  Paul Eggert  <eggert@twinsun.com>
2003-12-27 23:40:06 +00:00
Ulrich Drepper
8cae99dba5 Update.
2003-12-22  Jakub Jelinek  <jakub@redhat.com>

	* posix/regcomp.c: Remove C99-ism.
	* posix/tst-rxspencer.c: Likewise.
	Based on a patch by Alex Davis <alex14641@yahoo.com>.

2002-12-17  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.h [!_LIBC] (internal_function): Define.
	(re_string_allocate, re_string_construct, re_string_reconstruct,
	re_string_realloc_buffers, build_wcs_buffer,
	build_wcs_upper_buffer, build_upper_buffer,
	re_string_translate_buffer, re_string_destruct,
	re_string_elem_size_at, re_string_char_size_at,
	re_string_wchar_at, re_string_context_at,
	re_node_set_alloc, re_node_set_init_1,
	re_node_set_init_2, re_node_set_init_copy,
	re_node_set_add_intersect, re_node_set_init_union,
	re_node_set_merge, re_node_set_insert,
	re_node_set_compare, re_node_set_contains,
	re_node_set_remove_at, re_dfa_add_node,
	re_acquire_state, re_acquire_state_context,
	free_state): Add internal_function to declaration.

	* posix/regexec.c (match_ctx_init, match_ctx_clean,
	match_ctx_free, match_ctx_free_subtops,
	match_ctx_add_entry, search_cur_bkref_entry,
	match_ctx_clear_flag, match_ctx_add_subtop,
	match_ctx_add_sublast, sift_ctx_init,
	re_search_internal, re_search_2_stub, re_search_stub,
	re_copy_regs, acquire_init_state_context,
	prune_impossible_nodes, check_matching,
	check_halt_node_context, check_halt_state_context,
	update_regs, proceed_next_node, push_fail_stack,
	pop_fail_stack, set_regs, free_fail_stack_return,
	sift_states_iter_mb, sift_states_backward,
	update_cur_sifted_state, add_epsilon_src_nodes,
	sub_epsilon_src_nodes, check_dst_limits,
	check_dst_limits_calc_pos, check_subexp_limits,
	sift_states_bkref, clean_state_log_if_need,
	merge_state_array, transit_state,
	check_subexp_matching_top, transit_state_sb,
	transit_state_mb, transit_state_bkref,
	get_subexp, get_subexp_sub, find_subexp_node,
	check_arrival, check_arrival_add_next_nodes,
	find_collation_sequence_value, check_arrival_expand_ecl,
	check_arrival_expand_ecl_sub, expand_bkref_cache,
	build_trtable, check_node_accept_bytes, extend_buffers,
	group_nodes_into_DFAstates, check_node_accept): Likewise.

	* posix/regex_internal.c (re_string_construct_common,
	re_string_skip_chars, create_newstate_common,
	register_state, create_ci_newstate, create_cd_newstate,
	calc_state_hash): Likewise.
	(re_string_peek_byte_case, re_fetch_byte_case): Change
	declaration from ANSI to K&R.

2002-12-16  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (build_trtable): Don't allocate the trtable
	until state->word_trtable is known.  Don't hardcode UINT_BITS
	iterations on each bitset item.
2003-12-23 02:29:44 +00:00
Ulrich Drepper
c0d5034ed1 Update.
	* posix/regexec.c (check_arrival): Remove duplicate test.

2003-12-15  Ulrich Drepper  <drepper@redhat.com>

	* posix/regcomp.c: Make !RE_ENABLE_I18N work again.
	* posix/regex_internal.c: Likewise.
	* posix/regexec.c: Likewise.
	Patch by Paolo Bonzini.

2003-12-14  Paolo Bonzini  <bonzini@gnu.org>
2003-12-16 06:16:27 +00:00
Ulrich Drepper
a0a8461cf9 Update.
2003-12-14  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.c (re_acquire_state_context):
	Compare the node sets after all the other comparisons.

2003-12-13  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (find_subexp_node, check_arrival,
	check_arrival_add_next_nodes, check_arrival_expand_ecl,
	check_arrival_expand_ecl_sub, expand_bkref_cache):
	Rename the FL_OPEN parameter to TYPE, which is either
	OP_OPEN_SUBEXP or OP_CLOSE_SUBEXP.  Callers adjusted.

	* Makeconfig (gnulib): If have-cc-with-libunwind is "yes", also
2003-11-12  David Mosberger  <davidm@hpl.hp.com>
2003-12-15 00:56:30 +00:00