erik.corry@gmail.com
5b8c63f9d5
Avoids allocating a JSArray of capture information on each non-global
...
regular expression match.
Also moves all last-match information into one place where it can be
updated from C++ code (this will be used in another change).
Review URL: http://codereview.chromium.org/28184
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1383 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-27 10:04:34 +00:00
erik.corry@gmail.com
bbc2a73f31
Remove JSCRE
...
Review URL: http://codereview.chromium.org/21504
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1355 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-25 08:08:01 +00:00
erik.corry@gmail.com
b1fbed8cca
A little peephole optimization for the Irregexp bytecode interpreter.
...
Review URL: http://codereview.chromium.org/21481
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1311 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-19 10:03:27 +00:00
erik.corry@gmail.com
9c608b2c5a
Limit how many places we generate code to flush the same actions. This gives a
...
13% code size reduction in the php regexp with no discernible performance loss.
Review URL: http://codereview.chromium.org/20457
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1309 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-19 08:24:28 +00:00
erik.corry@gmail.com
3f962f0f9c
Irregexp:
...
* Fix UC16 character classes on ASCII subjects.
* Fix sign problem in Irregexp interpreter.
* Make passes over text nodes more readable.
Review URL: http://codereview.chromium.org/21450
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1304 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-18 16:07:03 +00:00
lrn@chromium.org
80bb2cc546
Missing handle check. Triggers bug if the runtime stack overflows and it is detected by a global regexp.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1263 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-13 09:40:15 +00:00
lrn@chromium.org
c621bbbe45
Issue 227 Fixed. Properly handles non-ASCII characters in quick-check on ASCII strings.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1248 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-11 11:54:30 +00:00
erik.corry@gmail.com
a5e55c4584
Fix the not-at-start optimization to trigger on the V8 regexp benchmark.
...
Review URL: http://codereview.chromium.org/20040
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1225 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-04 13:05:40 +00:00
lrn@chromium.org
78ec586391
RegExp: Small bugfix in debug mode.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1219 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-03 13:39:25 +00:00
lrn@chromium.org
cf1e1b1b98
Trace contains information about whether we know that we are at the start of input.
...
Choice nodes may know that they are never at the start of input.
This can remove start_of_input assertions in cases where they are statically known to fail.
The initial loop is unrolled once if the regexp might check for the start of input. Only the first iteration may be at the start, the following loop knows that it isn't.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1217 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-03 11:43:55 +00:00
erik.corry@gmail.com
7f6afa5bf4
The optimizations performed by Irregexp could possibly hide bugs or
...
could themselves be a source of bugs. Add a flag to switch off
optimizations (--noirregexp-optimization) to aid testing.
Review URL: http://codereview.chromium.org/19538
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1210 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-02 16:27:31 +00:00
erik.corry@gmail.com
b88dbfee5c
Fix http://code.google.com/p/chromium/issues/detail?id=7258 crash in IsFlat.
...
You can't keep a StringShape across things that can cause GC.
Review URL: http://codereview.chromium.org/19749
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1199 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-02 08:23:42 +00:00
erik.corry@gmail.com
e091488b3e
Lint error
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1167 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-26 20:55:31 +00:00
erik.corry@gmail.com
260cd876d1
Eliminate the code that handles fallback to JSCRE. The only way to get
...
JSCRE now is to use the --noirregexp flag. Also add code to check that
we react sensibly to some very large regexps.
Review URL: http://codereview.chromium.org/18587
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1166 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-26 20:09:35 +00:00
erik.corry@gmail.com
34b47563ff
Reduce work done in EatsAtLeast to a sane level.
...
Review URL: http://codereview.chromium.org/18753
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1165 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-26 19:38:26 +00:00
lrn@chromium.org
2de5de495f
Irregexp: Backtracking past look-aheads now works correctly.
...
Allows backtracking to clear registers instead of pushing and popping
them to restore state.
Redo of 1135 with bug fixed.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1156 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-26 14:38:17 +00:00
erik.corry@gmail.com
c956219ef4
* Remember to check for end of string even where we
...
know the character class must match.
Thanks to Mads and Christian for finding this bug
Review URL: http://codereview.chromium.org/18750
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1150 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-26 13:04:49 +00:00
erik.corry@gmail.com
50e5ad72cb
Fix bug where strings were not flattened before regexp.
...
Review URL: http://codereview.chromium.org/18552
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1142 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-26 08:35:41 +00:00
erik.corry@gmail.com
f6c3ef2d2a
Reverting r1136 due to crashes
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1138 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-23 14:33:19 +00:00
lrn@chromium.org
18c2d3ef4e
Clears captures of look-aheads on backtrack.
...
Reduces number of pushes when flushing a trace. Some are converted to clears
in the undo-code instead, and some just ignored if they have no value worth restoring.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1136 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-23 13:34:51 +00:00
christian.plesner.hansen@gmail.com
031e72ce99
review
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1130 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-23 07:46:44 +00:00
erik.corry@gmail.com
585e36b40e
Optimization: The quick check should ignore the negative lookahead instead of
...
insisting that it should match.
Review URL: http://codereview.chromium.org/18360
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1106 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-20 11:36:28 +00:00
erik.corry@gmail.com
2b77e718fa
Add support for \b and ^ and $ in multiline mode, completing Irregexp
...
features. Switch on Irregexp by default.
Review URL: http://codereview.chromium.org/18193
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1104 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-19 18:56:47 +00:00
deanm@chromium.org
b7c1200462
Fix a bunch of spelling mistakes :\
...
Review URL: http://codereview.chromium.org/18094
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1088 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-15 19:08:34 +00:00
erik.corry@gmail.com
43e9e343dd
No one really liked the name "GenerationVariant" so here it gets renamed
...
to "Trace".
Review URL: http://codereview.chromium.org/18091
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1080 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-15 12:45:48 +00:00
christian.plesner.hansen@gmail.com
d6e6508bd7
Added clearing of captures before entering the body of a loop. This
...
also revealed a bug or two that had to be fixed.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1070 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-14 11:32:23 +00:00
lrn@chromium.org
21d2865757
Separately growing stack for irregexp ia32 backtrack stack.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1053 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-12 13:05:23 +00:00
christian.plesner.hansen@gmail.com
4a16e4928a
Added check that bails out of a repetition when the body is empty.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1047 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-08 12:40:47 +00:00
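The empty-body check in the commit above can be illustrated with a small sketch. This is not the Irregexp code; `match_star` and `match_a_star` are hypothetical names, and the matcher is simplified to a greedy loop without backtracking. It only shows why a repetition such as `(a*)*` needs the check: without it, a body that matches the empty string would iterate forever at the same position.

```python
def match_star(subject, start, match_body):
    """Greedy body* without backtracking; returns the end position."""
    pos = start
    while True:
        end = match_body(subject, pos)
        if end is None:
            # The body failed, so the repetition is over.
            break
        if end == pos:
            # The body matched the empty string. Another iteration would
            # start from the same position and never terminate, so bail out.
            break
        pos = end
    return pos


def match_a_star(subject, pos):
    """Hypothetical body for the example: a*, which may match nothing."""
    while pos < len(subject) and subject[pos] == "a":
        pos += 1
    return pos
```

Calling `match_star("aab", 0, match_a_star)` consumes the two leading `a` characters on the first iteration; the second iteration matches nothing, so the loop exits instead of spinning.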
christian.plesner.hansen@gmail.com
afcc36a417
Added runtime call to the logging infrastructure. Made some changes
...
to the way regexps are being logged.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1028 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-06 13:24:52 +00:00
lrn@chromium.org
74b7d4ad00
Recognizes character classes like whitespace and non-newline and generates more efficient code.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1024 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-01-02 12:23:17 +00:00
erik.corry@gmail.com
16852b987d
Some irregexp optimizations around keeping track of when the current character
...
register contains the next n characters.
Review URL: http://codereview.chromium.org/16410
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1014 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-22 12:48:14 +00:00
erik.corry@gmail.com
3dc722b555
Fix ARM build.
...
Review URL: http://codereview.chromium.org/14887
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1008 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-19 12:13:11 +00:00
erik.corry@gmail.com
ab2d4bc9bf
* Generate quick checks based on mask and compare for
...
the alternatives in a choice node. The quick checks
are conservative in the sense that they only detect
failure with certainty. Checks can do 2 or 4 characters
at a time.
* Inline the quick checks to allow the alternatives to
be checked without branching in the common case where
they fail.
Review URL: http://codereview.chromium.org/14194
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1005 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-19 12:02:34 +00:00
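A rough sketch of the mask-and-compare idea from the commit above, assuming 8-bit characters packed little-endian into one integer. The function names are illustrative and do not come from V8. For each position, the mask keeps only the bits that are identical in every allowed character, so a failed comparison proves the alternative cannot match, while a passed comparison only means it might.

```python
def build_quick_check(char_sets):
    """char_sets holds one non-empty set of code points (< 256) per position.

    Returns (mask, value) covering all positions packed into one integer.
    """
    mask = 0
    value = 0
    for i, allowed in enumerate(char_sets):
        assert allowed, "each position needs at least one allowed character"
        common_ones = 0xFF   # bits set in every allowed character
        common_zeros = 0xFF  # bits clear in every allowed character
        for c in allowed:
            common_ones &= c
            common_zeros &= ~c & 0xFF
        mask |= (common_ones | common_zeros) << (8 * i)
        value |= common_ones << (8 * i)
    return mask, value


def pack(subject, start, count):
    """Pack `count` characters into one integer, first character lowest."""
    packed = 0
    for i in range(count):
        packed |= ord(subject[start + i]) << (8 * i)
    return packed


def quick_check_may_match(subject, start, mask, value, count):
    """False means certain failure; True means a full match must still run."""
    if start + count > len(subject):
        return False  # assumes the alternative needs at least `count` chars
    return (pack(subject, start, count) & mask) == value
```

For the two-character prefix `[fF]o`, the check accepts `foo` and `Fox` and rejects `bar` with a single comparison. It is conservative: for the class `[ad]` it also lets `e` through, which the full matcher would then reject.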
erik.corry@gmail.com
00b0b67c03
Unroll + and ? to reduce loop-related work.
...
Review URL: http://codereview.chromium.org/14836
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1003 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-18 15:17:24 +00:00
christian.plesner.hansen@gmail.com
e5270bd6e4
Removed propagation of information about preceding nodes by expanding
...
following nodes. Found a better solution.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1000 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-18 14:30:53 +00:00
christian.plesner.hansen@gmail.com
5d3cc28967
Fixed bug in interest propagation caused by following the loop edge
...
out of a loop choice node before the continuation edge.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@990 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-17 13:16:38 +00:00
lrn@chromium.org
00122b76d0
Each RegExpTree node can now report the min and max size of strings it can match.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@988 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-17 10:59:14 +00:00
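The min/max computation mentioned in the commit above might look roughly like the following. The node classes here are a hypothetical miniature syntax tree, not the V8 RegExpTree hierarchy, and `match_bounds` is an invented name. Unbounded quantifiers are modelled with `math.inf`.

```python
import math
from dataclasses import dataclass


@dataclass
class Atom:
    text: str


@dataclass
class Sequence:
    parts: list


@dataclass
class Alternation:
    options: list  # assumed non-empty


@dataclass
class Repeat:
    body: object
    min_count: int
    max_count: float  # math.inf for an unbounded quantifier


def match_bounds(node):
    """Return (shortest, longest) length of any string the node can match."""
    if isinstance(node, Atom):
        return len(node.text), len(node.text)
    if isinstance(node, Sequence):
        bounds = [match_bounds(part) for part in node.parts]
        return sum(lo for lo, _ in bounds), sum(hi for _, hi in bounds)
    if isinstance(node, Alternation):
        bounds = [match_bounds(option) for option in node.options]
        return min(lo for lo, _ in bounds), max(hi for _, hi in bounds)
    if isinstance(node, Repeat):
        lo, hi = match_bounds(node.body)
        # Guard against 0 * inf, which would produce NaN.
        longest = 0 if hi == 0 or node.max_count == 0 else hi * node.max_count
        return lo * node.min_count, longest
    raise TypeError(f"unknown node type: {type(node).__name__}")
```

With these classes, `ab(c|de)*` has bounds `(2, inf)` and `a(bc)?` has bounds `(1, 3)`. A matcher can use the minimum to reject subjects that are too short before doing any real work.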
lrn@chromium.org
3b968e0207
Preemption code for irregexp-native-ia32. Regexps can not only succeed or
...
fail, but also report a thrown exception.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@974 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-12 10:49:00 +00:00
lrn@chromium.org
09e3c76137
Quantified look-aheads are sometimes removed entirely, leaving only a
...
single atom node. A flag was not set in this case, leading the wrapper
code to think the pattern was equal to the atom and use the pattern
in the indexOf operation.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@971 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-12 10:22:56 +00:00
ager@chromium.org
2a84fa4128
Fix lint issue.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@969 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-11 13:04:22 +00:00
christian.plesner.hansen@gmail.com
ff3e30ae11
- Added lookbehind propagation for the initial node; now, if the
...
initial node is interested in what precedes it the automaton is
given an initial all-consuming character class that determines it.
- Added verification of some node information invariants. We now
check that if a node expresses interest in what precedes it that
information is available to it after assertion expansion.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@964 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-11 11:13:13 +00:00
erik.corry@gmail.com
df727ffd43
Fix build (someone tell gcc you can't take the address of a static
...
const int and someone tell MSVC it's OK to define a static const int
in a .cc file).
Review URL: http://codereview.chromium.org/13656
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@942 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-09 09:17:41 +00:00
erik.corry@gmail.com
7b4b4959c8
* Have an ASCII and a UC16 interpreter for Irregexp bytecodes -
...
never have to convert an ASCII string to UC16 for Irregexp.
* Generate slightly different code when we know the subject string
is ASCII.
Review URL: http://codereview.chromium.org/13247
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@941 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-09 08:30:49 +00:00
lrn@chromium.org
67c26a869f
Minor presentation changes
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@938 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-08 13:33:24 +00:00
lrn@chromium.org
5178af89fa
Irregexp is specialized on subject character type.
...
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@937 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-08 12:43:01 +00:00
erik.corry@gmail.com
ba09ec5e89
Irregexp:
...
* Facility for generating a node several ways. This allows
code to be generated for a node knowing where it is trying
to match relative to the 'current position' and it allows
code to be generated that knows where to backtrack to. Both
allow dramatic reductions in the amount of popping and pushing
on the stack and the number of indirect jumps.
* Generate special backtracking for greedy quantifiers on
constant-length atoms. This allows .* to run in constant
space relative to input string size.
* When we are checking a long sequence of characters or character
classes in the input then we do them right to left and only the
first (rightmost) needs to check for end-of-string.
* Record the pattern in the profile instead of just <CompiledRegExp>
* Nodes no longer contain an on_failure_ node. This was only used
for lookaheads and they are now handled with a choice node instead.
Review URL: http://codereview.chromium.org/12900
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@930 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-08 09:22:12 +00:00
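The constant-space claim for greedy quantifiers on constant-length atoms, from the commit above, can be sketched as follows. This is an assumption-laden illustration rather than the generated code: `match_greedy_star` and the callback signatures are made up. Because every iteration of the atom consumes the same number of characters, backtracking only has to subtract that length from the current position; no per-iteration position needs to be pushed on a stack.

```python
def match_greedy_star(subject, start, atom_len, atom_matches, continuation):
    """Match atom* followed by `continuation`, for an atom of fixed length.

    atom_matches(subject, pos) -> bool
    continuation(subject, pos) -> end position, or None on failure
    """
    assert atom_len > 0
    pos = start
    # Greedy phase: consume as many atoms as possible.
    while pos + atom_len <= len(subject) and atom_matches(subject, pos):
        pos += atom_len
    # Backtracking phase: give back one atom at a time by arithmetic alone.
    while True:
        end = continuation(subject, pos)
        if end is not None:
            return end
        if pos == start:
            return None
        pos -= atom_len


def is_a(subject, pos):
    """Hypothetical atom: the single character 'a'."""
    return subject[pos] == "a"


def then_ab(subject, pos):
    """Hypothetical continuation: the literal 'ab'."""
    return pos + 2 if subject.startswith("ab", pos) else None
```

For the pattern `a*ab` on `aaab`, the greedy phase consumes three characters, the continuation fails at position 3, and one step back to position 2 succeeds. The only state kept is `pos`, regardless of how long the subject is.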
lrn@chromium.org
dd9be4ef58
Matching a back-reference must handle unbound start-register (but can assume that if start register is bound, then end register is bound too).
...
After matching a back-reference, the character position is advanced past
the match.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@908 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-03 13:24:34 +00:00
christian.plesner.hansen@gmail.com
12774ab2d8
Fixed issue where regexps were parsed without having set up a zone
...
scope, leading to zone exhaustion. Added assertion that a zone scope
exists on zone allocation.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@898 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-02 14:00:24 +00:00
christian.plesner.hansen@gmail.com
cc3e472843
- Fixed regexp logging issue.
...
- Removed use of std::set.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@883 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-02 08:16:12 +00:00
christian.plesner.hansen@gmail.com
917e91d1f2
- Added some expansion of assertions.
...
- Splitting of character classes into word and non-word parts.
- A bunch of refactorings.
- Made dispatch table construction lazy.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@880 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2008-12-01 15:42:35 +00:00