2004-01-13 Ulrich Drepper <drepper@redhat.com>
* posix/regex.c: Support crappy compilers and platforms which have
problems with alloca.
* posix/regex_internal.h: Likewise.
Patch by Paolo Bonzini.
2004-01-12 Paolo Bonzini <bonzini@gnu.org>
* posix/regcomp.c [_LIBC && !RE_ENABLE_I18N]:
Drop code to support this, it is never true.
(build_range_exp) [!_LIBC]: Do not create a range
in MBCSET for a single-byte character set.
(build_range_exp) [_LIBC]: Do not create a range
in MBCSET for a single-byte character set without
collation elements.
(init_dfa): Do not conditionalize on _LIBC, it
just makes the code less clear.
(parse_bracket_exp): Use NON_MATCH variable in
addition to "mbcset->non_match", not as an
alternative.
(build_charclass_op): Rename NOT parameter to
NON_MATCH, use it instead of declaring a variable.
(parse_bracket_exp) [!_LIBC]: Pass NULL for MBCSET
if the character set is single-byte.
2004-01-14 Jakub Jelinek <jakub@redhat.com>
* posix/regcomp.c (peek_token_bracket): Check remaining
string length before re_string_peek_byte (x, 1).
(parse_bracket_symbol): Likewise.
* posix/regex_internal.h (re_string_is_single_byte_char): Return
true at last byte in the string.
* posix/bug-regex22.c (main): Add new test.
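Note on the 2004-01-14 entry above: the fix is to check how many bytes remain before looking one byte ahead. A minimal sketch of that pattern follows. The names pattern_cursor and cursor_peek are hypothetical and are not the real re_string_t or re_string_peek_byte; only the bounds check itself is taken from the entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a pattern string; not the real re_string_t.  */
struct pattern_cursor
{
  const unsigned char *buf;
  size_t idx;			/* Current read position.  */
  size_t len;			/* Number of valid bytes in buf.  */
};

/* Store the byte at IDX + OFF in *OUT and return 1 if that position
   lies inside the buffer.  Return 0 without reading memory otherwise.  */
static int
cursor_peek (const struct pattern_cursor *c, size_t off, unsigned char *out)
{
  assert (c != NULL && out != NULL);
  /* Compare against the remaining length, written so that the
     unsigned subtraction cannot wrap around.  */
  if (c->idx > c->len || c->len - c->idx <= off)
    return 0;
  *out = c->buf[c->idx + off];
  return 1;
}
```

A caller that wants to recognize a two-byte token such as "[:" would call cursor_peek (c, 1, &b) and treat a return value of 0 as "no second byte", rather than reading past the end of the pattern.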
2004-01-05 Jakub Jelinek <jakub@redhat.com>
* posix/regcomp.c (regcomp): Fix comment typo.
(regfree): Free preg->translate, clear buffer, allocated, fastmap
and translate fields.
* posix/regcomp.c (build_charclass, build_charclass_op): Change first
argument to unsigned RE_TRANSLATE_TYPE.
* posix/regex_internal.h (re_string_t): Change trans type to
unsigned RE_TRANSLATE_TYPE.
* posix/regex_internal.c (re_string_construct_common): Cast
trans to unsigned RE_TRANSLATE_TYPE.
(re_string_peek_byte_case, re_string_fetch_byte_case): Avoid fast
path if pstr->trans. Never translate the character through
pstr->trans.
* posix/Makefile (tests): Add bug-regex22.
(bug-regex22-ENV): Set.
* posix/bug-regex22.c: New test.
* posix/regexec.c (get_subexp): Only set bkref_str after the first
loop, use buf + bkref_str_off in the loop instead.
* posix/bug-regex11.c (tests): Add 3 new tests.
* posix/regexec.c (clean_state_log_if_need): Rename to...
(clean_state_log_if_needed): ...this.
(transit_state_mb, get_subexp_sub): Adjust callers.
* posix/regexec.c (re_copy_regs): Allocate start and end array in
one block.
(push_fail_stack): Add missing check for failed memory allocation.
_IO_peekc_unlocked, _IO_ptc_unlocked, _IO_getwc_unlocked, and
overflow for 0 as argument. Raise Invalid exception for negative args.
2003-12-23 Paolo Bonzini <bonzini@gnu.org>
* posix/regex_internal.c (re_dfa_add_node): Initialize opt_subexp.
* posix/regex_internal.h (re_token_type_t): Put OP_DUP_PLUS
among the tokens, rather than among the epsilon-transiting nodes.
(re_token_t): Add the opt_subexp flag.
* posix/regcomp.c (optimize_utf8, calc_first,
calc_next, calc_epsdest): Don't consider OP_DUP_PLUS.
(mark_opt_subexp, mark_opt_subexp_iter): New functions.
(parse_dup_op): Mostly rewritten, lowering OP_DUP_PLUS to
OP_DUP_ASTERISK and marking optional subexpressions
as such using mark_opt_subexp.
* posix/regexec.c (set_regs): Initialize PREV_INDEX_MATCH
and pass it to update_regs.
(update_regs): Use the PREV_INDEX_MATCH parameter, together
with the opt_subexp flag, in order to discard a final empty
match of a repeated subexpression.
* posix/BOOST.tests: Adjust test vectors.
* posix/PCRE.tests: Likewise.
* posix/rxspencer/tests: Likewise.
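Note on the 2003-12-23 entry above: lowering OP_DUP_PLUS to OP_DUP_ASTERISK relies on "x+" accepting the same strings as "xx*". The sketch below checks that equivalence for the overall match only, through the public POSIX interface (regcomp, regexec, regfree). The helper names first_match and same_first_match are hypothetical. It does not look at subexpression registers, which is where the two forms can differ and, as far as the entry says, what the opt_subexp flag is for.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Find the first match of the ERE PATTERN in TEXT.  Return 0 and store
   the match offsets on success, 1 if there is no match, and -1 if the
   pattern does not compile.  */
static int
first_match (const char *pattern, const char *text,
	     regoff_t *so, regoff_t *eo)
{
  regex_t re;
  regmatch_t m[1];
  int rc;

  assert (pattern != NULL && text != NULL && so != NULL && eo != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  rc = regexec (&re, text, 1, m, 0);
  if (rc == 0)
    {
      *so = m[0].rm_so;
      *eo = m[0].rm_eo;
    }
  regfree (&re);
  return rc == 0 ? 0 : 1;
}

/* Return 1 if P1 and P2 both compile and agree on the first match in
   TEXT (same offsets, or no match for either), 0 otherwise.  */
static int
same_first_match (const char *p1, const char *p2, const char *text)
{
  regoff_t so1 = -1, eo1 = -1, so2 = -1, eo2 = -1;
  int r1 = first_match (p1, text, &so1, &eo1);
  int r2 = first_match (p2, text, &so2, &eo2);

  return r1 >= 0 && r1 == r2 && so1 == so2 && eo1 == eo2;
}
```

For example, same_first_match ("a+", "aa*", "caaat") compares the two spellings on one input. Comparing pmatch entries for groups inside the repetition would need a larger regmatch_t array and is left out here.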
2003-12-17 Paolo Bonzini <bonzini@gnu.org>
2003-12-16 Paolo Bonzini <bonzini@gnu.org>
2003-12-17 Paolo Bonzini <bonzini@gnu.org>
2003-12-16 Jakub Jelinek <jakub@redhat.com>
2003-04-06 Kaz Kojima <kkojima@rr.iij4u.or.jp>
2003-02-20 Paolo Bonzini <bonzini@gnu.org>
2003-01-12 Franz Sirl <Franz.Sirl-kernel@lauterbach.com>
2003-01-09 Richard Henderson <rth@redhat.com>
2003-01-03 Paul Eggert <eggert@twinsun.com>
2003-12-16 Petter Reinholdtsen <pere@hungry.com>
* posix/regex_internal.h: Make sure the regex code compiles
with non-GCC compilers by hiding attributes.
2002-12-16 Jakub Jelinek <jakub@redhat.com>
Paolo Bonzini <bonzini@gnu.org>
* posix/regexec.c (group_nodes_into_DFAstates): Never produce
dests_ch items that are empty.
2003-12-14 Paolo Bonzini <bonzini@gnu.org>
* posix/regex_internal.c (re_acquire_state_context):
Compare the node sets after all the other comparisons.
2003-12-13 Paolo Bonzini <bonzini@gnu.org>
* posix/regexec.c (find_subexp_node, check_arrival,
check_arrival_add_next_nodes, check_arrival_expand_ecl,
check_arrival_expand_ecl_sub, expand_bkref_cache):
Rename the FL_OPEN parameter to TYPE, which is either
OP_OPEN_SUBEXP or OP_CLOSE_SUBEXP. Callers adjusted.
* Makeconfig (gnulib): If have-cc-with-libunwind is "yes", also
2003-11-12 David Mosberger <davidm@hpl.hp.com>
2003-11-28 Ulrich Drepper <drepper@redhat.com>
* sysdeps/x86_64/fpu/libm-test-ulps: Add some more minor changes
to compensate other setup.
2003-11-27 Andreas Jaeger <aj@suse.de>
* sysdeps/x86_64/fpu/libm-test-ulps: Add ulps for new atan2 test.
* math/libm-test.inc (atan2_test): Add test that ran infinitely.
Reported by "Willus" <etc231etc231@willus.com>.
2003-11-27 Michael Matz <matz@suse.de>
* sysdeps/ieee754/dbl-64/mpsqrt.c (fastiroot): Fix 64-bit problem
with wrong types.
2003-11-28 Jakub Jelinek <jakub@redhat.com>
* posix/regexec.c (acquire_init_state_context): Make inline.
Add always_inline attribute.
(check_matching): Add BE macro. Move if (cur_state->has_backref)
into if (dfa->nbackref).
(sift_states_backward): Fix comment.
(transit_state): Add BE macro. Move if (next_state->has_backref)
into if (dfa->nbackref && next_state). Don't check for next_state
!= NULL twice.
* posix/regcomp.c (peek_token): Use opr.ctx_type instead of opr.idx
for ANCHOR.
(parse_expression): Only call init_word_char if word context will be
needed.
* posix/bug-regex11.c (tests): Add new tests.
* posix/tst-regex.c: Include getopt.h.
(timing): New variable.
(main): Set timing to 1 if --timing argument is present.
Add 2 new tests.
(run_test, run_test_backwards): Handle timing.
2003-11-27 Jakub Jelinek <jakub@redhat.com>
* posix/regex_internal.h (re_string_t): Remove mbs_case field.
Add offsets, valid_raw_len, raw_len, raw_stop, mbs_allocated and
offsets_needed fields. Change icase, is_utf8 and map_notascii
type from int bitfield to unsigned char.
(MBS_ALLOCATED, MBS_CASE_ALLOCATED): Remove.
(build_wcs_upper_buffer): Change prototype to return int.
(re_string_peek_byte_case, re_string_fetch_byte_case): Remove
defines, add prototypes.
* posix/regex_internal.c (re_string_allocate): Don't initialize
stop here. Don't initialize mbs_case. Set valid_raw_len.
Use mbs_allocated instead of MBS_* macros.
(re_string_construct): Don't initialize stop and valid_len here.
Don't initialize mbs_case. Use mbs_allocated instead of MBS_*
macros. Reallocate buffers if build_wcs_upper_buffer converted
too few bytes. Set valid_len to bufs_len only for single byte
no translation and set in that case valid_raw_len as well.
(re_string_realloc_buffers): Reallocate offsets if not NULL.
Use mbs_allocated instead of MBS_ALLOCATED. Don't reallocate
mbs_case.
(re_string_construct_common): Initialize raw_len, mbs_allocated,
stop and raw_stop.
(build_wcs_buffer): Apply pstr->trans before mbrtowc instead of
after it. Set valid_raw_len. Don't set mbs_case.
(build_wcs_upper_buffer): Return REG_NOERROR or REG_ESPACE.
Only use the fast path if !pstr->offsets_needed. Apply pstr->trans
before mbrtowc instead of after it. If upper case character
uses different number of bytes than lower case, go to the
slow path. Don't call towupper unnecessarily twice. Set
valid_raw_len as well. Handle in the slow path the case if
lower and upper case use different number of characters.
Don't set mbs_case.
(re_string_skip_chars): Use valid_raw_len instead of valid_len.
(build_upper_buffer): Don't set mbs_case. Add BE macro. Set
valid_raw_len.
(re_string_translate_buffer): Set mbs instead of mbs_case. Set
valid_raw_len.
(re_string_reconstruct): Use raw_len/raw_stop to initialize
len/stop. Clear valid_raw_len and offsets_needed when clearing
valid_len. Use mbs_allocated instead of MBS_* macros.
Check original offset against valid_raw_len instead of valid_len.
Remove mbs_case handling. Adjust valid_raw_len together with
valid_len. If is_utf8 and looking for tip context, apply
pstr->trans first. If buffers start with partial multi-byte
character, initialize mbs array as well if mbs_allocated.
Check return value of build_wcs_upper_buffer.
(re_string_peek_byte_case): New function.
(re_string_fetch_byte_case): New function.
(re_string_destruct): Use mbs_allocated instead of MBS_ALLOCATED.
Don't free mbs_case. Free offsets.
* posix/regcomp.c (init_dfa): Only check if charset name is UTF-8
if mb_cur_max == 6.
* posix/regexec.c (re_search_internal): Initialize input.raw_stop
as well. Use valid_raw_len instead of valid_len when looking
through fastmap. Adjust registers through input.offsets.
(extend_buffers): Allow build_wcs_upper_buffer to fail.
* posix/bug-regex18.c (tests): Enable #ifdefed out tests. Add new
tests.
2003-11-26 Jakub Jelinek <jakub@redhat.com>
* posix/regexec.c (check_subexp_limits): Only check close
subexpression limitation if one is found. Formatting.
(sift_states_backward, check_arrival, check_arrival_add_next_nodes):
Formatting.
* posix/bug-regex11.c (tests): Enable most #ifdefed out tests.
Add new test.
2003-11-25 Ulrich Drepper <drepper@redhat.com>
* posix/runptests.c (main): Make errors fatal.
* posix/PTESTS: One test each in GA135 and GA136 checks functionality
which does not seem to be guaranteed.
2003-11-25 Jakub Jelinek <jakub@redhat.com>
* posix/regexec.c (re_search_internal): If prune_impossible_nodes
returned REG_NOMATCH, set match_last to -1. Don't initialize
pmatch[0] needlessly. Fix comment.
(prune_impossible_nodes): Don't segfault on NULL state_log entry.
(set_regs): Fix comment.
* posix/regcomp.c (parse_bracket_exp): Only set has_plural_match
if adding both SIMPLE_BRACKET and COMPLEX_BRACKET.
(build_charclass_op): Set has_plural_match if adding both
SIMPLE_BRACKET and COMPLEX_BRACKET.
* posix/bug-regex11.c (tests): Fix register values for one commented
out test. Add new tests.
* posix/regex_internal.c (re_string_allocate): Make sure init_len
is at least dfa->mb_cur_max.
(re_string_reconstruct): If is_utf8, don't fall back into
re_string_skip_chars just because idx points into the middle of a
valid UTF-8 character. Instead, set the wcs bytes which correspond
to the partial character bytes to WEOF.
* posix/regexec.c (re_search_internal): Allocate input.bufs_len + 1
instead of dfa->nodes_len + 1 state_log entries initially.
* posix/bug-regex20.c (main): Uncomment backwards case insensitive
tests.
2003-11-24 Jakub Jelinek <jakub@redhat.com>
* posix/regex_internal.h (re_token_t): Add word_char bit. Add
comment.
(re_dfa_t): Add sb_char field.
(bitset_mask): New function.
* posix/regcomp.c (free_dfa_content): Free sb_char.
(init_dfa): Don't initialize word_char unnecessarily.
Initialize sb_char.
(duplicate_node): Don't duplicate !word_char CHARACTERs with
NEXT_WORD_CONSTRAINT constraint or word_char CHARACTERs with
NEXT_NOTWORD_CONSTRAINT. Return -1 in *new_idx instead.
(duplicate_node_closure): Handle clone_dest == -1 from
duplicate_node.
(peek_token): Initialize word_char bit.
(parse_expression, parse_dup_op): Add comments.
(parse_bracket_exp): Don't set bitmask bits for multi-byte char
starting bytes here at the beginning. Mask off the bits right
before creating SIMPLE_BRACKET.
(build_charclass_op): Likewise.
* posix/regexec.c (group_nodes_into_DFAstates) <case OP_PERIOD>: Only
set accept bits for single-byte characters.
(group_nodes_into_DFAstates): Don't rely on characters 0 .. 127
being single byte encoded and the rest multi-byte.
* posix/bug-regex19.c (tests): Add new tests.
(do_mb_tests): Initialize t to *test.
(main): Fail even on do_mb_tests errors.
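Note on the 2003-11-24 entry above: the bracket code now masks a byte bitmap right before creating SIMPLE_BRACKET, so that only bytes forming complete single-byte characters stay set. A minimal sketch of such a masking step follows. The type byte_set and the three helpers are hypothetical stand-ins, and the assumption that bitset_mask is a word-wise AND against a single-byte map like sb_char is a reading of the entry, not something it states.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SET_BITS 256
#define WORD_BITS (sizeof (unsigned int) * CHAR_BIT)
#define SET_WORDS (SET_BITS / WORD_BITS)

/* One bit per possible byte value; hypothetical stand-in type.  */
typedef unsigned int byte_set[SET_WORDS];

/* Mark byte CH as a member of S.  */
static void
byte_set_add (byte_set s, unsigned char ch)
{
  assert (s != NULL);
  s[ch / WORD_BITS] |= 1u << (ch % WORD_BITS);
}

/* Return 1 if byte CH is a member of S, 0 otherwise.  */
static int
byte_set_has (const byte_set s, unsigned char ch)
{
  assert (s != NULL);
  return (int) ((s[ch / WORD_BITS] >> (ch % WORD_BITS)) & 1u);
}

/* Keep in DEST only the members that are also in SRC (assumed
   behaviour of a mask operation: DEST &= SRC, word by word).  */
static void
byte_set_mask (byte_set dest, const byte_set src)
{
  size_t i;

  assert (dest != NULL && src != NULL);
  for (i = 0; i < SET_WORDS; ++i)
    dest[i] &= src[i];
}
```

With a map that marks only single-byte characters, masking a bracket bitmap drops any lead bytes of multi-byte characters, which then have to be matched by the multi-byte path instead.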
2003-11-23 Ulrich Drepper <drepper@redhat.com>
* posix/regexec.c: Add const in a number of places.
* posix/regex_internal.h: Make EPSILON_BIT a macro to help
debugging. Its value isn't important.
2003-11-22 Ulrich Drepper <drepper@redhat.com>
* posix/PTESTS: Fix first test of GA143.
* posix/regex_internal.c (re_dfa_add_node): Add BE, reallocation
isn't likely.
2003-11-21 Ulrich Drepper <drepper@redhat.com>
* posix/regcomp.c (fetch_token): Change interface to match
peek_token. This avoids some copying and reduces code size.
2003-11-20 Ulrich Drepper <drepper@redhat.com>
* posix/PTESTS: Fix first test in GA143.
2003-11-20 Jakub Jelinek <jakub@redhat.com>
* posix/regex_internal.h (re_dfastate_t): Remove trtable_search.
Add word_trtable.
* posix/regex_internal.c (create_newstate_common, free_state):
Don't free trtable_search.
* posix/regexec.c (check_matching): Remove fl_search argument.
(transit_state_sb): Likewise. #ifdef out as unused.
(build_trtable): Remove fl_search argument. Set state->word_trtable
and state->trtable. Build separate word and non-word tables if
multi-byte and they differ for some character.
(transit_state): Remove fl_search argument. Don't update
state->trtable here. Handle state->word_trtable.
#ifdef out unused call to transit_state_sb.
(re_search_internal): Update check_matching caller.
(group_nodes_into_DFAstates): Don't clear non-ascii chars in accepts
bitmask for multi-byte locales.
* posix/bug-regex19.c (tests): Enable some commented out tests, add
2 new tests.
* posix/tst-rxspencer.c (mb_tests): Don't test [[=b=]] for now as
multi-byte. Don't run identical multi-byte tests multiple times
unnecessarily.
(main): Check setlocale return value.
* posix/Makefile (tst-rxspencer-ARGS): Add --utf8 argument.
(tst-rxspencer-ENV): Remove MALLOC_TRACE, add LOCPATH.
($(objpfx)tst-rxspencer-mem): Run another tst-rxspencer test
here, without --utf8 argument but with MALLOC_TRACE.
2003-11-19 Jakub Jelinek <jakub@redhat.com>
* posix/regexec.c (extend_buffers): Don't allocate
twice as big state_log as needed. Don't modify pstr->valid_len
for mb_cur_max == 1 !icase !trans.
* posix/regcomp.c (free_bin_tree): Removed.
(create_tree): Add dfa argument. Don't call re_malloc for
each tree, instead allocate from str_tree_storage.
(re_dfa_add_tree_node): New function.
(free_dfa_content): Handle freeing if dfa->nodes == NULL
or dfa->state_table == NULL.
(re_compile_internal): Call free_dfa_content if init_dfa
fails. Call free_workarea_compile, re_string_destruct
and free_dfa_content for most of the other failure paths.
(init_dfa): Initialize str_tree_storage_idx.
Don't clear any fields on allocation failure.
(free_workarea_compile): Free str_tree_storage chunks
instead of free_bin_tree (dfa->str_tree).
(parse): Call re_dfa_add_tree_node instead of re_dfa_add_node
followed by create_tree. Add dfa argument to remaining
create_tree calls. Remove new_idx variable. Remove calls
to free_bin_tree.
(parse_reg_exp, parse_branch, parse_expression, parse_sub_exp,
parse_dup_op, parse_bracket_exp, build_charclass_op): Likewise.
(duplicate_tree): Remove calls to free_bin_tree, add dfa
argument to create_tree.
* posix/regex_internal.h (BIN_TREE_STORAGE_SIZE): Define.
(bin_tree_storage_t): New type.
(re_dfa_t): Add str_tree_storage and str_tree_storage_idx
fields.
* posix/Makefile (tests): Add bug-regex21.
(generated): Add bug-regex21-mem, bug-regex21.mtrace,
tst-rxspencer-mem and tst-rxspencer.mtrace.
(tests): Depend on $(objpfx)bug-regex21-mem
and $(objpfx)tst-rxspencer-mem.
(bug-regex21-ENV, tst-rxspencer-ENV): Set.
($(objpfx)bug-regex21-mem, $(objpfx)tst-rxspencer-mem): New.
* posix/tst-rxspencer.c (main): Add call to mtrace.
Free line at the end.
* posix/bug-regex21.c: New test.
* posix/regexec.c (get_subexp): After calling get_subexp_sub
* posix/regex_internal.h (re_token_type_t): Remove unused ALT,
END_OF_RE_TOKEN_T and SUBEXP. Reorder values. Add OP_UTF8_PERIOD
and EPSILON_BIT.
(IS_EPSILON_NODE): Just test if EPSILON_BIT is set.
(ACCEPT_MB_NODE): Return 1 for OP_UTF8_PERIOD as well.
* posix/regex_internal.c (create_ci_newstate, create_cd_newstate):
Handle OP_UTF8_PERIOD.
(re_string_reconstruct): Set valid_len for single byte char searching
with no translation and case sensitivity.
* posix/regcomp.c (re_compile_fastmap_iter, calc_first): Handle
OP_UTF8_PERIOD.
(re_compile_internal): Don't call optimize_utf8 if preg->translate
!= NULL.
(optimize_utf8): Remove BACK_SLASH case.
Transform OP_PERIOD into OP_UTF8_PERIOD if the searching can be
optimized.
(parse_bracket_exp): Don't create SIMPLE_BRACKET if it doesn't have
any bits set and COMPLEX_BRACKET is used.
* posix/regexec.c (transit_state_mb): Fix comment typo.
(group_nodes_into_DFAstates, check_node_accept): Handle
OP_UTF8_PERIOD.
(check_node_accept_bytes): Likewise. Reorder slightly so that
re_string_char_size_at and re_string_elem_size_at are called
only when needed.
* posix/bug-regex20.c (BRE, ERE): Define.
(tests): Use them to make lines shorter. Expect . to be
optimized. Add lots of new tests.
(main): Run (ATM just case sensitive) test with backwards searching
as well.
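Note on the 2003-11-19 entry above: create_tree stops calling re_malloc for each tree and takes nodes from str_tree_storage chunks, which free_workarea_compile later releases. A minimal sketch of that kind of chunked allocation follows. All names here (tree_node, node_chunk, node_pool, CHUNK_NODES and the pool_* functions) are hypothetical, and the chunk size is kept tiny on purpose; the real layout behind BIN_TREE_STORAGE_SIZE and bin_tree_storage_t may differ.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_NODES 4		/* Hypothetical; deliberately tiny.  */

struct tree_node
{
  struct tree_node *left, *right;
  int token;
};

/* A block of nodes obtained with a single malloc call.  */
struct node_chunk
{
  struct node_chunk *next;
  struct tree_node nodes[CHUNK_NODES];
};

struct node_pool
{
  struct node_chunk *head;	/* Most recently allocated chunk.  */
  int used;			/* Slots already handed out of HEAD.  */
};

static void
pool_init (struct node_pool *p)
{
  assert (p != NULL);
  p->head = NULL;
  p->used = CHUNK_NODES;	/* Forces a chunk on first use.  */
}

/* Hand out the next free node, allocating a new chunk only when the
   current one is full.  Return NULL if malloc fails.  */
static struct tree_node *
pool_new_node (struct node_pool *p, struct tree_node *left,
	       struct tree_node *right, int token)
{
  struct tree_node *n;

  assert (p != NULL);
  if (p->used == CHUNK_NODES)
    {
      struct node_chunk *c = malloc (sizeof *c);
      if (c == NULL)
	return NULL;
      c->next = p->head;
      p->head = c;
      p->used = 0;
    }
  n = &p->head->nodes[p->used++];
  n->left = left;
  n->right = right;
  n->token = token;
  return n;
}

/* Release every chunk at once and return how many were freed.  */
static int
pool_free (struct node_pool *p)
{
  int chunks = 0;

  assert (p != NULL);
  while (p->head != NULL)
    {
      struct node_chunk *next = p->head->next;
      free (p->head);
      p->head = next;
      ++chunks;
    }
  p->used = CHUNK_NODES;
  return chunks;
}
```

The trade-off is that single nodes can no longer be freed individually, which fits the entry's removal of free_bin_tree: error paths only need to drop the whole pool.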
2003-11-18 Jakub Jelinek <jakub@redhat.com>