Commit Graph

92 Commits

Author SHA1 Message Date
Paul Eggert
f615e3fced Remove dead regex code
* posix/regex_internal.c (re_node_set_insert):
Remove unnecessary assignment.  Reported by Tim Rühsen in:
https://lists.gnu.org/r/bug-gnulib/2019-08/msg00026.html
2019-08-21 11:02:19 -07:00
Paul Eggert
8a80ee5e2b Fix bad pointer / leak in regex code
This was found by Coverity (CID 1484201).  [BZ#24844]
* posix/regex_internal.c (create_cd_newstate): Fix use of bad
pointer and/or memory leak when storage is exhausted.
2019-08-21 11:02:19 -07:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Paul Eggert
f4efbdfb44 regex: __builtin_expect → __glibc_unlikely
[BZ#23744]
This refactoring was prompted by a problem when the regex code is
used as part of Gnulib and when the builder’s compiler does not grok
__builtin_expect.  Problem reported for Gawk by Nelson H.F. Beebe in:
https://lists.gnu.org/r/bug-gnulib/2018-09/msg00137.html
Although this refactoring does not fix the problem directly,
we might as well have Gawk use the now-preferred glibc style for when
__builtin_expect is unavailable.
* posix/regex_internal.h (BE): Remove.
All uses replaced by __glibc_unlikely or __glibc_likely.
2018-10-14 23:36:55 -05:00
Paul Eggert
bc680b3369 regex: fix uninitialized memory access
I introduced this bug into gnulib in commit
8335a4d6c7b4448cd0bcb6d0bebf1d456bcfdb17 dated 2006-04-10;
eventually it was merged into glibc.  The bug was found by
project-repo <bugs@feusi.co> and reported here:
https://lists.gnu.org/r/sed-devel/2018-08/msg00017.html
Diagnosis and draft fix reported by Assaf Gordon here:
https://lists.gnu.org/r/bug-gnulib/2018-08/msg00071.html
https://lists.gnu.org/r/bug-gnulib/2018-08/msg00142.html
* posix/regex_internal.c (build_wcs_upper_buffer):
Fix bug when mbrtowc returns 0.
2018-08-25 20:34:34 -07:00
Adhemerval Zanella
eb04c21373 posix: Sync gnulib regex implementation
This patch syncs the regex implementation with gnulib (commit 0ee5212).
Only two changes in GLIBC regex testing are required:

  1. posix/bug-regex28.c: as previously discussed [1] the change of
     expected results on the pattern should be safe.

  2. posix/PCRE.tests: the ERE (a)|\1 is malformed (in the sense that
     the \1 doesn't mean anything) and although current GLIBC accepts
     it has undefined behavior.  This patch removes the specific test.

This sync contains some patches from thread 'Regex: Make libc regex
more usable outside GLIBC.' [2] which have been pushed upstream in
gnulib.  This patches also fixes some regex issues (BZ #23233,
BZ #21163, BZ #18986, BZ #13762) and I did not add testcases for
both #23233 and #13762 because I couldn't think a simple way to
trigger the expected failure path to trigger them.

Checked on x86_64-linux-gnu and i686-linux-gnu.

	[BZ #23233]
	[BZ #21163]
	[BZ #18986]
	[BZ #13762]
	* posix/Makefile (tests): Add bug-regex37 and bug-regex38.
	* posix/PCRE.tests: Remove invalid test.
	* posix/bug-regex28.c: Fix expected values for used syntax.
	* posix/bug-regex37.c: New file.
	* posix/bug-regex38.c: Likewise.
	* posix/regcomp.c: Sync with gnulib.
	* posix/regex.c: Likewise.
	* posix/regex.h: Likewise.
	* posix/regex_internal.c: Likewise.
	* posix/regex_internal.h: Likewise.
	* posix/regexec.c: Likewise.

[1] https://sourceware.org/ml/libc-alpha/2017-12/msg00807.html
[2] https://sourceware.org/ml/libc-alpha/2017-12/msg00237.html
2018-07-04 09:54:45 -03:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Eric Blake
ed8ae46bed Avoid gcc warnings on cygwin
* posix/regex_internal.c (re_string_reconstruct) [!RE_ENABLE_I18N]:
* posix/regexec.c (check_arrival_add_next_nodes) [!RE_ENABLE_I18N]:
Avoid unused variable.
2017-12-22 08:01:27 -08:00
Arnold Robbins
5069ff3284 regex: Fix spelling in comments.
Fix the spelling in various comments throughout the
regex implementation. These changes are also present
in gnulib and will be integrated there also, see:
https://sourceware.org/ml/libc-alpha/2017-12/msg00688.html
2017-12-19 19:28:21 -08:00
Florian Weimer
b41bd5bc83 posix: Remove internal_function attribute 2017-08-31 18:52:00 +02:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Joseph Myers
a5f0adb39b Fix regex wcrtomb namespace (bug 18496).
The regex code brings in references to wcrtomb, which isn't in all the
standards that contain regex.  This patch makes it call __wcrtomb
instead (in fact some places already called __wcrtomb, so this patch
makes it internally consistent about which name is used).

Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.

	[BZ #18496]
	* posix/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb
	instead of wcrtomb.
2015-06-05 21:31:39 +00:00
Joseph Myers
9dd6b7799a Fix regex wctype namespace (bug 18495).
regcomp brings in references to various wctype functions that aren't
in all the standards including regcomp.  This patch fixes this in the
usual way by using the __* versions of these functions (which already
exist, but some didn't have libc_hidden_proto / libc_hidden_def
before).

Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).  (Other wide character
function references from the regex code mean that this patch by itself
doesn't fix any XFAILed linknamespace test failures; further patches
will be needed for that.)

	[BZ #18495]
	* wctype/wcfuncs.c (__iswalnum): Use libc_hidden_def.
	(__iswlower): Likewise.
	* include/wctype.h (__iswalnum): Declare.  Use libc_hidden_proto.
	(__iswlower): Likewise.
	* posix/regcomp.c (re_compile_fastmap_iter): Call __towlower
	instead of towlower.
	* posix/regex_internal.c (build_wcs_upper_buffer): Call __iswlower
	instead of iswlower.  Call __towupper instead of towupper.
	* posix/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum
	instead of iswalnum.
2015-06-05 20:04:47 +00:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Siddhesh Poyarekar
78dd658a02 Check if DEBUG is defined in regex_internal.c
The DEBUG macro is checked for its value in one place and if it is
defined in another.  Make this consistent across the two cases and use
the same style that we did in mktime.c, which is to check if the macro
is defined and it is set.
2014-08-01 14:24:41 +05:30
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Stanislav Brabec
71b5d1c5d5 [BZ #13637]
* posix/regex_internal.c (re_string_skip_chars): Fix miscomputation
	of remain_len that may cause incomplete multi-byte character and
	false match.
	* posix/bug-regex33.c: New file.
	* posix/Makefile (tests): Add bug-regex33.
2012-02-28 16:16:45 +01:00
Paul Eggert
59ba27a63a Replace FSF snail mail address with URLs. 2012-02-09 23:18:22 +00:00
Jakub Jelinek
2ba92745c3 Fix up regcomp/regexec
The problem is that parse_bracket_symbol is miscompiled, and it turns
out it is because of an incorrect attribute on re_string_fetch_byte_case.
Unlike re_string_peek_byte_case, this one is really not pure, it modifies memory
(increments pstr->cur_idx), and with the pure attribute GCC assumed it doesn't
and it cached the presumed value of regexp->cur_idx in a variable across the
 for (;; ++i)
   {
     if (i >= BRACKET_NAME_BUF_SIZE)
       return REG_EBRACK;
     if (token->type == OP_OPEN_CHAR_CLASS)
       ch = re_string_fetch_byte_case (regexp);
     else
       ch = re_string_fetch_byte (regexp);
     if (re_string_eoi(regexp))
       return REG_EBRACK;
     if (ch == delim && re_string_peek_byte (regexp, 0) == ']')
       break;
     elem->opr.name[i] = ch;
   }
2011-12-30 17:13:56 -05:00
Ulrich Drepper
5e2b63c658 Fix warnings in regex 2011-11-12 01:23:45 -05:00
Ulrich Drepper
8887a920a4 Fix unnecessary overallocation due to incomplete character
When incomplete characters are found at the end of a string the
code ran amok and allocated lots of memory.  Stricter limits
are now in place.
2011-05-28 17:14:30 -04:00
Ulrich Drepper
5ddf954cf1 Simplify test in re_string_skip_chars. 2010-01-22 10:22:53 -08:00
Ulrich Drepper
4f08104cbf regex_internal.c: don't assume WEOF fits in wchar_t 2010-01-22 10:17:45 -08:00
Ulrich Drepper
0dae5d4ec1 regex_internal.c: remove useless variable and the code to set it. 2010-01-22 09:57:30 -08:00
Ulrich Drepper
2236464488 Extend overflow detection in re_dfa_add_node. 2010-01-22 09:48:35 -08:00
Ulrich Drepper
54dd0ab31f regex: avoid internal re_realloc overflow 2010-01-22 09:33:01 -08:00
Ulrich Drepper
2da42bc065 Fix a few more cases of ignored return values in regex. 2010-01-15 12:03:16 -08:00
Ulrich Drepper
b3918c7d7f * posix/regcomp.c (re_compile_fastmap_iter): Use __mbrtowc.
* posix/regex_internal.c (build_wcs_buffer, build_wcs_upper_buffer,
	re_string_skip_chars, re_string_reconstruct): Likewise.
	* posix/regex_internal.h [!_LIBC] (__mbrtowc): New #define.
2009-01-08 00:23:09 +00:00
Ulrich Drepper
0caca71ac9 * string/Makefile (distribute): Add str-two-way.h.
2008-03-29  Eric Blake	<ebb9@byu.net>

	Rewrite string searches to O(n) rather than O(n^2).
	* string/str-two-way.h: New file.  For linear fixed-allocation
	string searching.
	* string/memmem.c: New implementation.
	* string/strstr.c: New implementation.
	* string/strcasestr.c: New implementation.

	* sysdeps/posix/getaddrinfo.c (getaddrinfo): Call _res_hconf_init
2008-05-15 04:42:20 +00:00
Ulrich Drepper
ba40cc1540 [BZ #3155]
2006-09-07  Jakub Jelinek  <jakub@redhat.com>
	[BZ #3155]
	* sysdeps/powerpc/powerpc32/fpu/s_lrint.S (__lrint): Don't access
	stack below r1.

	* posix/regex_internal.c (re_string_reconstruct): Handle
	offset < pstr->valid_raw_len && pstr->offsets_needed case.
	Ensure no bytes read before raw_mbs array.  Pass a saved copy of
	pstr->valid_len - 1 rather than pstr->valid_raw_len - 1 to
	re_string_context_at.
	* posix/Makefile: Add rules to build and run bug-regex26 test.
	* posix/bug-regex26.c: New test.

	* dlfcn/Makefile (LDLIBS-bug-atexit3-lib.so): Add ld.so.
2006-09-07 13:50:31 +00:00
Ulrich Drepper
33e63e7993 * posix/regex_internal.c (re_string_skip_chars): If no character has
been converted at all, set *last_wc to WEOF.  If mbrtowc failed, set wc
	to the byte which couldn't be converted.
	(re_string_reconstruct): Don't clear valid_raw_len before calling
	re_string_skip_chars.  If wc is WEOF after re_string_skip_chars, set
	tip_context using re_string_context_at.
	* posix/Makefile: Add rules to build and run bug-regex25 test.
	* posix/bug-regex25.c: New test.
2006-06-04 04:59:36 +00:00
Andreas Jaeger
4f7e7f8e00 [BZ #1950, BZ #2153]
Update.
	[BZ #1950]
	* posix/regex_internal.c (re_string_reconstruct): Adjust for
	build_wcs_upper_buffer change.
	(build_wcs_upper_buffer): Change return type.

	[BZ #2153]
	* math/s_cacosh.c (__cacosh): Do not return a negative
	value. Patch by Wes Loewer <wjltemp-temp01@yahoo.com>.
	* math/s_cacoshl.c (__cacoshl): Likewise.
	* math/s_cacoshf.c (__cacoshf): Likewise.
	* math/libm-test.inc (cacosh_test): Adjust for change.

	* sysdeps/alpha/fpu/libm-test-ulps: Adopt for cacosh test change.
	* sysdeps/hppa/fpu/libm-test-ulps: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Likewise.
	* sysdeps/ia64/fpu/libm-test-ulps: Likewise.
	* sysdeps/m68k/fpu/libm-test-ulps: Likewise.
	* sysdeps/mips/fpu/libm-test-ulps: Likewise.
	* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
	* sysdeps/s390/fpu/libm-test-ulps: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
	* sysdeps/sh/sh4/fpu/libm-test-ulps: Likewise.
	* sysdeps/sparc/sparc32/fpu/libm-test-ulps: Likewise.
	* sysdeps/sparc/sparc64/fpu/libm-test-ulps: Likewise.
2006-01-15 17:59:52 +00:00
Ulrich Drepper
1ba81cea27 * posix/regexec.c: Finish prototyping of static functions.
* posix/regex_internal.c: Likewise.
2005-10-15 15:23:33 +00:00
Ulrich Drepper
db26cb7576 [BZ #1248]
2005-08-26  Paul Eggert  <eggert@cs.ucla.edu>
	[BZ #1248]
	* posix/regex_internal.h (bitset_not, bitset_merge, bitset_not_merge,
	bitset_mask, re_string_allocate, re_string_construct,
	re_string_reconstruct, re_string_destruct, re_string_elem_size_at,
	re_string_char_size_at, re_string_wchar_at, re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_alloc, re_node_set_init_1,
	re_node_set_init_2, re_node_set_init_copy, re_node_set_add_intersect,
	re_node_set_init_union, re_node_set_merge, re_node_set_insert,
	re_node_set_insert_last, re_node_set_compare, re_node_set_contains,
	re_node_set_remove_at, re_dfa_add_node, re_acquire_state,
	re_acquire_state_context): Remove unnecessary forward decls.
	(re_string_char_size_at, re_string_wchar_at, re_string_elem_size_at):
	Put __attribute at function definition, now that the function decl
	has been removed.
	* posix/regex_internal.c (re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains):
	Likewise.
2005-10-13 21:10:24 +00:00
Ulrich Drepper
e2f5526407 [BZ #1231]
2005-08-23  Paul Eggert  <eggert@cs.ucla.edu>
	[BZ #1231]
	* posix/regex_internal.c (re_string_skip_chars, register_state,
	calc_state_hash): Remove forward decls.
	* posix/regexec.c (acquire_init_state_context, check_halt_node_context,
	proceed_next_node, pop_fail_stack, sub_epsilon_src_nodes,
	clean_state_log_if_needed): Likewise.

	* posix/regex.c: No need to use K&R definitions for static functions.
	* posix/regex_internal.c: Likewise.
2005-10-13 20:08:58 +00:00
Ulrich Drepper
997470b3e1 [BZ #281]
* posix/regex.h: Define RE_TRANSLATE_TYPE as unsigned char *.
	* posix/regcomp.c: Remove unnecessary uses of
	unsigned RE_TRANSLATE_TYPE.
	* posix/regex_internal.h: Likewise.
	* posix/regex_internal.c: Likewise.
	* posix/regexexec.c: Likewise.
	Based on a patch by Stepan Kasal <kasal@ucw.cz>.
2005-09-23 06:11:29 +00:00
Ulrich Drepper
01ed6ceb7c * posix/regex_internal.c (re_string_reconstruct): Avoid calling
mbrtowc for very simple UTF-8 case.

2005-09-01  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex_internal.c (build_wcs_upper_buffer): Fix portability
	bugs in int versus size_t comparisons.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex_internal.c (re_acquire_state): Make DFA pointer arg
	a pointer-to-const.
	(re_acquire_state_context): Likewise.
	* posix/regex_internal.h: Adjust prototypes.

2005-08-31  Jim Meyering  <jim@meyering.net>

	* posix/regcomp.c (search_duplicated_node): Make first pointer arg
	a pointer-to-const.
	* posix/regex_internal.c (create_ci_newstate, create_cd_newstate,
	register_state): Likewise.
	* posix/regexec.c (search_cur_bkref_entry, check_dst_limits):
	(check_dst_limits_calc_pos_1, check_dst_limits_calc_pos):
	(group_nodes_into_DFAstates): Likewise.

	* posix/regexec.c (re_search_internal): Simplify update of
	rm_so and rm_eo by replacing "if (A == B) A += C - B;"
	with the equivalent of "if (A == B) A = C;".

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regcomp.c (re_compile_internal): Change third parameter type
	to size_t.
	(init_dfa): Likewise.  Make sure that arithmetic on pat_len doesn't
	overflow.
	* posix/regex_internal.h (struct re_dfa_t): Change type of nodes_alloc
	and nodes_len to size_t.
	* posix/regex_internal.c (re_dfa_add_node): Use size_t as type for
	new_nodes_alloc.  Check for overflow.

2005-08-31  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (re_compile_fastmap_iter, init_dfa, init_word_char):
	(optimize_subexps, lower_subexp):
	Don't assume 1<<31 has defined behavior on hosts with 32-bit int,
	since the signed shift might overflow.  Use 1u<<31 instead.
	* posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain):
	Likewise.
	* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise.
	(check_subexp_matching_top): Likewise.
	* posix/regcomp.c (optimize_subexps, lower_subexp):
	Use CHAR_BIT rather than 8, for clarity.
	* posix/regexec.c (check_dst_limits_calc_pos_1):
	(check_subexp_matching_top): Likewise.
	* posix/regcomp.c (init_dfa): Make table_size unsigned, so that we
	don't have to worry about portability issues when shifting it left.
	Remove no-longer-needed test for table_size > 0.
	* posix/regcomp.c (parse_sub_exp): Do not shift more bits than there
	are in a word, as the resulting behavior is undefined.
	* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise;
	in one case, a <= should have been an <, and in another case the
	whole test was missing.
	* posix/regex_internal.h (BYTE_BITS): Remove.  All uses changed to
	the standard name CHAR_BIT.
2005-09-07 01:15:33 +00:00
Ulrich Drepper
2d87db5b53 * posix/regex_internal.h (re_sub_match_top_t): Remove unused member
next_last_offset.
	(struct re_dfa_t): Remove unused member states_alloc.
	* posix/regcomp.c (init_dfa): Don't initialize unused members.

2005-08-25  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (set_regs): Don't alloca with an unbounded size.

	alloca modernization/simplification for regex.
	* posix/regex.c: Remove portability cruft for alloca.  This no longer
	needs to be at the start of the file, and can be moved into
	regex_internal.h and simplified.
	* posix/regex_internal.h: Include <alloca.h>.
	(__libc_use_alloca) [!defined _LIBC]: New macro.
	* posix/regexec.c (build_trtable): Remove "#ifdef _LIBC",
	since the code now works outside glibc.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* include/regex.h: Remove use of _RE_ARGS.

2005-08-25  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (find_recover_state): Change "err" to "*err".

2005-08-24  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (regerror): Pointer args are 'restrict',
	as per POSIX.
	* posix/regex.h (regerror): Likewise.
	* manual/pattern.texi (POSIX Regexp Compilation): Likewise.
	Similarly for regcomp and regexec.  Also, first 2 args of regexec
	and 2nd arg of regerror are const.

	* posix/regex.c: Do not include <sys/types.h>, as POSIX no longer
	requires this.  (The code never needed it.)

2005-08-20  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (sift_states_bkref): re_node_set_insert returns
	int, not reg_errcode_t.

	* posix/regex_internal.c (calc_state_hash): Put 'inline' before type,
	since some broken compilers warn about it otherwise.

	* posix/regcomp.c (create_initial_state): Remove duplicate decl.

2005-08-20  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex.h (_RE_ARGS): Remove.  No longer needed, since we assume
	C89 or better.  All uses removed.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex.c: Prevent using C++ compilers.

2005-08-19  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (duplicate_node): Return new index, not an error
	code, and let the caller return REG_ESPACE if out of space.  This
	removes an uninitialied-variable warning with GCC 4.0.1, and also
	avoids taking the address of a local variable.  All callers
	changed.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* include/time.h (__strptime_internal): Rename parameter to avoid
	bogus compiler warning.

2005-08-19  Jim Meyering  <jim@meyering.net>

	* posix/regexec.c (proceed_next_node): Redo local variables to
	avoid GCC shadowing warnings.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex_internal.c (re_acquire_state): Minor code rearrangement.
	(re_acquire_state_context): Likewise.

2005-08-19  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex_internal.c (re_string_realloc_buffers):
	(re_node_set_insert, re_node_set_insert_last, re_dfa_add_node):
	Rename local variables to avoid GCC shadowing warnings.

2005-07-08  Eric Blake  <ebb9@byu.net>
            Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (init_dfa): Store __btowc value in wint_t, not
	wchar_t.  Remove now-unnecessary cast.
	(build_range_exp): Likewise.
2005-09-06 21:15:13 +00:00
Ulrich Drepper
ec73fd87da * posix/regex_internal.c (build_wcs_buffer): Use MB_LEN_MAX not
MB_CUR_MAX.
	(build_wcs_upper_buffer): Likewise.
2005-07-05 22:01:42 +00:00
Ulrich Drepper
88764ae26a [BZ #779]
2005-03-10  Jakub Jelinek  <jakub@redhat.com>
	* math/test-misc.c (main): Add some more tests.

2005-03-17  Jakub Jelinek  <jakub@redhat.com>

	* posix/regcomp.c (re_compile_fastmap_iter): Fix check for failed
	__wcrtomb.  Check return values of other __wcrtomb calls.
	* posix/regex_internal.c (build_wcs_buffer, re_string_skip_chars):
	Change mbclen type to size_t.
	(build_wcs_upper_buffer): Change mbclen and mbcdlen type to size_t.
	Handle mb chars whose upper case doesn't have multibyte representation
	in locale's charset.

2005-03-15  Jakub Jelinek  <jakub@redhat.com>

	* malloc/malloc.c (_int_icalloc, _int_icomalloc, iALLOc,
	public_iCALLOc, public_iCALLOc, public_iCOMALLOc): Protect with
	#ifndef _LIBC.

	[BZ #779]
	* malloc/malloc.c (public_mTRIm): Initialize malloc if not yet
	initialized.

2005-03-10  Jakub Jelinek  <jakub@redhat.com>

	* misc/sys/cdefs.h (__always_inline): Define.
	* posix/bits/unistd.h (read, pread, pread64, readlink, getcwd, getwd):
	Use __always_inline instead of __inline.
	* socket/bits/socket2.h (recv, recvfrom): Likewise.
	* libio/bits/stdio2.h (gets, fgets, fgets_unlocked): Likewise.
	* string/bits/string3.h (__memcpy_ichk, __memmove_ichk, __mempcpy_ichk,
	__memset_ichk, __strcpy_ichk, __stpcpy_ichk, __strncpy_ichk,
	__strcat_ichk, __strncat_ichk): Use __always_inline instead of
	__inline__ __attribute__ ((__always_inline__)).

2005-03-09  Jakub Jelinek  <jakub@redhat.com>

	* debug/tst-chk1.c: Include sys/socket.h and sys/un.h.
	(do_test): Add new tests for recv, recvfrom, getcwd, getwd and
	readlink.  Add some more tests for read, pread, pread64, fgets and
	fgets_unlocked.

	* posix/bits/unistd.h (read, pread, pread64, readlink,
	getcwd, getwd): Change macros into extern inline functions.
	(__read_alias, __pread_alias, __pread64_alias, __readlink_alias,
	__getcwd_alias, __getwd_alias): New prototypes.
	* socket/bits/socket2.h (recv, recvfrom): Change macros into
	extern inline functions.
	(__recv_alias, __recvfrom_alias): New prototypes.
	* libio/bits/stdio2.h (gets, fgets, fgets_unlocked): Change macros
	into extern inline functions.
	(__gets_alias, __fgets_alias, __fgets_unlocked_alias): New prototypes.

	* debug/pread_chk.c (__pread_chk): Fix order of arguments passed
	to __pread.
	* debug/pread64_chk.c (__pread64_chk): Fix order of arguments passed
	to __pread64.
2005-03-19 00:28:51 +00:00
Ulrich Drepper
1c99f950d1 * posix/regexec.c (check_node_accept_bytes): Correct cast to avoid
warning.
	* posix/regex_internal.c (re_string_reconstruct): Add cast to
	avoid warning.
	(build_wcs_upper_buffer): Change type of bug to plain char.
	* locale/weightwc.h (findidx): Add casts to avoid warnings.
	* time/mktime.c (ranged_convert): Initialize tm to make the
	compiler happy.
	* wcsmbs/mbsrtowcs_l.c (__mbsrtowcs_l): Add casts to avoid warnings.
	* wcsmbs/wcsnrtombs.c (__wcsnrtombs): Add casts to avoid warnings.
	* wcsmbs/mbsnrtowcs.c: Add casts to avoid warnings.
	* wcsmbs/wcsrtombs.c (__wcsrtombs): Add casts to avoid warnings.
	* wcsmbs/wcrtomb.c (__wcrtomb): Add casts to avoid warnings.
	* wcsmbs/mbrtowc.c (__mbrtowc): Use unsigned char for outbuf.
	* posix/regex_internal.c [_LIBC] (build_wcs_buffer): Avoid using
	dynamically sized array.
	(build_wcs_upper_buffer): Likewise.
2005-03-06 07:27:56 +00:00
Ulrich Drepper
963d8d782f [BZ #558]
Update.
2005-01-27  Paolo Bonzini  <bonzini@gnu.org>

	[BZ #558]
	* posix/regcomp.c (calc_inveclosure): Return reg_errcode_t.
	Initialize the node sets in dfa->inveclosures.
	(analyze): Initialize inveclosures only if it is needed.
	Check errors from calc_inveclosure.
	* posix/regex_internal.c (re_dfa_add_node): Do not initialize
	the inveclosure node set.
	* posix/regexec.c (re_search_internal): If nmatch includes unused
	subexpressions, reset them to { rm_so: -1, rm_eo: -1 } here.

	* posix/regcomp.c (parse_bracket_exp) [!RE_ENABLE_I18N]:
	Do build a SIMPLE_BRACKET token.

	* posix/regexec.c (transit_state_mb): Do not examine nodes
	where ACCEPT_MB is not set.
2005-01-27 19:08:10 +00:00
Ulrich Drepper
02f3550c8b [BZ #605, BZ #611]
Update.
2004-12-13  Paolo Bonzini  <bonzini@gnu.org>

	Separate parsing and creation of the NFA.  Avoided recursion on
	the (very unbalanced) parse tree.
	[BZ #611]
	* posix/regcomp.c (struct subexp_optimize, analyze_tree, calc_epsdest,
	re_dfa_add_tree_node, mark_opt_subexp_iter): Removed.
	(optimize_subexps, duplicate_tree, calc_first, calc_next,
	mark_opt_subexp): Rewritten.
	(preorder, postorder, lower_subexps, lower_subexp, link_nfa_nodes,
	create_token_tree, free_tree, free_token): New.
	(analyze): Accept a regex_t *.  Invoke the passes via the preorder and
	postorder generic visitors.  Do not initialize the fields in the
	re_dfa_t that represent the transitions.
	(free_dfa_content): Use free_token.
	(re_compile_internal): Analyze before UTF-8 optimizations.  Do not
	include optimization of subexpressions.
	(create_initial_state): Fetch the DFA node index from the first node's
	bin_tree_t *.
	(optimize_utf8): Abort on unexpected nodes, including OP_DUP_QUESTION.
	Return on COMPLEX_BRACKET.
	(duplicate_node_closure): Fix comment.
	(duplicate_node): Do not initialize the fields in the
	re_dfa_t that represent the transitions.
	(calc_eclosure, calc_inveclosure): Do not handle OP_DELETED_SUBEXP.
	(create_tree): Remove final argument.  All callers adjusted.  Rewritten
	to use create_token_tree.
	(parse_reg_exp, parse_branch, parse_expression, parse_bracket_exp,
	build_charclass_op): Use create_tree or create_token_tree instead
	of re_dfa_add_tree_node.
	(parse_dup_op): Likewise.  Also free the tree using free_tree for
	"<re>{0}", and lower OP_DUP_QUESTION to OP_ALT: "a?" is equivalent
	to "a|".  Adjust invocation of mark_opt_subexp.
	(parse_sub_exp): Create a single SUBEXP node.
	* posix/regex_internal.c (re_dfa_add_node): Remove last parameter,
	always perform as if it was 1.  Do not initialize OPT_SUBEXP and
	DUPLICATED, and initialize the DFA fields representing the transitions.
	* posix/regex_internal.h (re_dfa_add_node): Adjust prototype.
	(re_token_type_t): Move OP_DUP_PLUS and OP_DUP_QUESTION to the tokens
	section.  Add a tree-only code SUBEXP.  Remove OP_DELETED_SUBEXP.
	(bin_tree_t): Include a full re_token_t for TOKEN.  Turn FIRST and
	NEXT into pointers to trees.  Remove ECLOSURE.

2004-12-28  Paolo Bonzini  <bonzini@gnu.org >

	[BZ #605]
	* posix/regcomp.c (parse_bracket_exp): Do not modify DFA nodes
	that were already created.
	* posix/regex_internal.c (re_dfa_add_node): Set accept_mb field
	in the token if needed.
	(create_ci_newstate, create_cd_newstate): Set accept_mb field
	from the tokens' field.
	* posix/regex_internal.h (re_token_t): Add accept_mb field.
	(ACCEPT_MB_NODE): Removed.
	* posix/regexec.c (proceed_next_node, transit_states_mb,
	build_sifted_states, check_arrival_add_next_nodes): Use
	accept_mb instead of ACCEPT_MB_NODE.
2005-01-26 22:42:49 +00:00
Ulrich Drepper
5cf53cc208 Update.
2005-01-06  Ulrich Drepper  <drepper@redhat.com>

	* misc/sys/cdefs.h: Define __wur.
	* libio/stdio.h: Use __wur for a number of interfaces.
	* posix/unistd.h: Likewise.

	* posix/regex_internal.c (free_state): Free word_trtable.
2005-01-06 21:01:25 +00:00
Ulrich Drepper
a334319f65 (CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4. 2004-12-22 20:10:10 +00:00
Jakub Jelinek
0ecb606cb6 2.5-18.1 2007-07-12 18:26:36 +00:00
Ulrich Drepper
5cf1ec5256 Update.
2004-12-07  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (proceed_next_node): Simplify treatment of epsilon
	nodes.  Pass the pushed node to push_fail_stack.
	(push_fail_stack): Accept a single node rather than an array
	of two epsilon destinations.
	(build_sifted_states): Only walk non-epsilon nodes.
	(check_arrival): Don't pass epsilon nodes to
	check_arrival_add_next_nodes.
	(check_arrival_add_next_nodes) [DEBUG]: Abort if an epsilon node is
	found.
	(check_node_accept): Do expensive checks later.
	(add_epsilon_src_nodes): Cache result of merging the inveclosures.
	* posix/regex_internal.h (re_dfastate_t): Add non_eps_nodes and
	inveclosure.
	(re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at,
	re_string_context_at, re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains):
	Declare as pure.
	* posix/regex_internal.c (create_newstate_common): Remove.
	(register_state): Move part of it here.  Initialize non_eps_nodes.
	(free_state): Free inveclosure and non_eps_nodes.
	(create_cd_newstate, create_ci_newstate): Allocate the new
	re_dfastate_t here.
2004-12-10 04:37:58 +00:00
Ulrich Drepper
20dc2f79f7 Update.
2004-11-23  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regcomp.c (analyze_tree): Always call calc_epsdest.
	(calc_inveclosure): Use re_node_set_insert_last.
	(parse_dup_op): Lower X{1,5} to (X(X(X(XX?)?)?)?)?
	rather than X?X?X?X?X?.
	* posix/regex_internal.h (re_node_set_insert_last): New declaration.
	* posix/regex_internal.c (re_node_set_insert_last): New function.
	* posix/PCRE.tests: Add testcases.
2004-11-25 22:32:18 +00:00