Commit Graph

90 Commits

Author SHA1 Message Date
erik.corry@gmail.com
83cbc638dc Remove some unused stuff from regexp implementation.
Review URL: https://chromiumcodereview.appspot.com/10205010

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11423 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-04-24 09:34:13 +00:00
erik.corry@gmail.com
6a1b70ccbc Regexp: Unify the lookahead code for the BoyerMoore-like skipping
and the lookahead code for simplifying \b.  Also cache lookahead
results on the nodes, fix a memory leak and remove some character
class code we are no longer using.
Review URL: https://chromiumcodereview.appspot.com/9961009

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11345 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-04-17 08:06:53 +00:00
erikcorry
f14b93a508 Regexp: Improve the speed that we scan for an initial point where a non-anchored
regexp can match by using a Boyer-Moore-like table.  This is done by identifying
non-greedy non-capturing loops in the nodes that eat any character one at a time.
For example in the middle of the regexp /foo[\s\S]*?bar/ we find such a loop.
There is also such a loop implicitly inserted at the start of any non-anchored
regexp.

When we have found such a loop we look ahead in the nodes to find the set of
characters that can come at given distances.  For example for the regexp
/.?foo/ we know that there are at least 3 characters ahead of us, and the sets
of characters that can occur are [any, [f, o], [o]].  We find a range in the
lookahead info where the set of characters is reasonably constrained.  In our
example this is from index 1 to 2 (0 is not constrained).  We can now look 3
characters ahead and if we don't find one of [f, o] (the union of [f, o] and
[o]) then we can skip forwards by the range size (in this case 2).

For Unicode input strings we do the same, but modulo 128.

We also look at the first string fed to the regexp and use that to get a hint
of the character frequencies in the inputs.  This affects the assessment of
whether the set of characters is 'reasonably constrained'.

We still have the old lookahead mechanism, which uses a wide load of multiple
characters followed by a mask and compare to determine whether a match is
possible at this point.
Review URL: http://codereview.chromium.org/9965010

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11204 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-04-02 09:38:07 +00:00
svenpanne@chromium.org
3df99e7eb7 Thread the current isolate through a few places, avoiding Isolate::Current().
This removes approx. 12k calls of Isolate::Current() in string-tagcloud.

Review URL: https://chromiumcodereview.appspot.com/9490004

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10856 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-02-28 10:32:02 +00:00
kmillikin@chromium.org
cb876c25a4 Include what you use for allocation, api, assembler, and ast.
R=fschneider@chromium.org
BUG=
TEST=

Review URL: https://chromiumcodereview.appspot.com/9288011

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10505 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-01-25 16:31:25 +00:00
erik.corry@gmail.com
70da367f6b More spelling changes.
Review URL: http://codereview.chromium.org/9231009

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10407 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2012-01-16 12:38:59 +00:00
erik.corry@gmail.com
338ab857b9 Remove a static initializer that could potentially slow down startup time.
BUG=1753
Review URL: http://codereview.chromium.org/8198005

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9551 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-10-07 14:41:08 +00:00
kmillikin@chromium.org
ceee9d535a Remove #include "isolate-inl.h" from v8.h.
Include it only in the .cc files where it's needed.

R=fschneider@chromium.org
BUG=
TEST=

Review URL: http://codereview.chromium.org/8117001

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9506 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-10-03 11:13:20 +00:00
ricow@chromium.org
b8cbe08fcc Fix presubmit errors caused by updated depot tools
This is all blank line before/after linting errors.
Review URL: http://codereview.chromium.org/7754022

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9204 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-09-08 19:57:14 +00:00
jkummerow@chromium.org
1db6be7f2b Fix a few clang warnings (which -Werror treats as errors)
Review URL: http://codereview.chromium.org/7779033

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@9141 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-09-06 07:41:45 +00:00
svenpanne@chromium.org
c71cf782e8 Drastically reduce the transitive dependencies of jsregexp.h, making it (almost)
architecture-independent.

jsregexp.h is itself included transitively quite a lot, and by getting rid of 19
of its dependencies (which even included things like src/cpu.h, the various
assemblers, etc.), the recompilation behaviour is a bit less funny than it was.
Review URL: http://codereview.chromium.org/7331014

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8589 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-07-11 09:12:17 +00:00
erik.corry@gmail.com
291781ed3c Limit the generation of regexp code with large inlined constants.
Review URL: http://codereview.chromium.org/6997015

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7845 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-05-11 06:39:27 +00:00
svenpanne@chromium.org
5cd715cbc3 A tiny contribution for the IWYU day: Include allocation.h in every
header which uses BASE_EMBEDDED and/or AllStatic. Note that still only
45 out of 135 headers in src/ can be used stand-alone, but at least
this is a little bit more than before...
Review URL: http://codereview.chromium.org/6931031

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7798 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-05-06 06:50:20 +00:00
mikhail.naganov@gmail.com
690093effe Mark single-argument inline constructors as 'explicit'.
There is currently a bug in cpplint.py hiding this problem.

R=sgjesse@chromium.org
BUG=1304
TEST=none

Review URL: http://codereview.chromium.org/6820028

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7567 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-04-11 11:38:34 +00:00
vitalyr@chromium.org
7976ca2cbc Merge isolates to bleeding_edge.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7271 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 20:35:07 +00:00
vitalyr@chromium.org
76e226f832 Revert r7268: it borked the history.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7269 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 19:41:05 +00:00
vitalyr@chromium.org
6ff7fdebd3 Merge isolates to bleeding_edge.
Review URL: http://codereview.chromium.org/6685088

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7268 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-03-18 18:49:56 +00:00
erik.corry@gmail.com
7846721d96 Fixes needed to compile on gcc-4.4.1 on ARM. It is still necessary
to add -fno-strict-aliasing.
Review URL: http://codereview.chromium.org/6123007

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@6281 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2011-01-12 11:56:41 +00:00
erik.corry@gmail.com
c5c852f64a Irregexp: Preload more characters when we are not at the
start of the input and some alternations in the disjunction
are anchored.
Review URL: http://codereview.chromium.org/5524006

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5915 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-12-03 09:54:06 +00:00
lrn@chromium.org
1d24f5f56b Updated unicode library.
Added Nl category to letters predicate (as requried for JS identifiers).
Changed/simplified representation of canonicalization ranges.
Truncated tables to code points in the BMP (all that is used by JS).
Reformatted tables to avoid excessively long lines.
Removed duplicate entries from multi-character mapping result tables.

Review URL: http://codereview.chromium.org/3030026

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5155 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-07-30 07:10:22 +00:00
erik.corry@gmail.com
e1b3b92a2c Make not sucking at regexp the default
(remove V8_NATIVE_REGEXP flag, add
V8_INTERPRETED_REGEXP flag).
Review URL: http://codereview.chromium.org/1635001

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4443 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-04-19 19:30:11 +00:00
lrn@chromium.org
4db15f1235 Refactoring of RegExp interface to better support calling several times in a row.
Review URL: http://codereview.chromium.org/1114001

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4190 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-03-19 12:01:17 +00:00
ricow@chromium.org
2fa30a8e34 Added zone-inl.h to jsregexp.h since it relies on calling new ZoneList which again relies on calling the static new method on Zone (defined in zone-inl.h but declared in zone.h).
Review URL: http://codereview.chromium.org/719001

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4060 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-03-09 09:15:28 +00:00
kmillikin@chromium.org
3817a7ba6e Small simplification of #include dependencies.
Remove messages.h from v8.h and include it explicitly in only the few places
it is needed.  Many files relied on getting handles-inl.h implicitly from
messages.h through v8.h, so include handles-inl.h explicitly in v8.h
instead.

Remove zone-inl.h from header files where it is not needed, can be replaced
by a forward declaration, or can be replaced by zone.h (specifically,
factory.h and heap.h).  Include zone.h or zone-inl.h in header files where
it was implicitly included via heap.h or factory.h.  Prefer zone.h over
zone-inl.h in header files where possible by including zone-inl.h in .cc
files.

Review URL: http://codereview.chromium.org/668248

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4058 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-03-09 06:38:33 +00:00
lrn@chromium.org
46504c1557 Attempt to make \b\w+ faster. Slight performance increase on, e.g., string unpacking.
Review URL: http://codereview.chromium.org/507051


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3563 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-01-07 19:01:23 +00:00
sgjesse@chromium.org
429f3cf9f2 Direct call to native RegExp code from JavaScript.
Calls to RegExp no longer have to be via a call to the runtime system. A new stub have been added which can handle this call in generated code. The stub checks all the parameters and creates RegExp entry frame in the same way as it is created by the runtime system. Bailout to the runtime system is done whenever an uncommon situation is encountered or when the static data used is not initialized. After running the native RegExp code the last match info is updated like in the runtime system.

Currently only ASCII strings are handled.

Added another argument to the RegExp entry frame. It indicated whether the call is direct from JavaScript code or through the runtime system. This information is used when RegExp execution is interrupted. If an interruption happens when RegExp code is called directly a retry is issued causing the interruption to be handled via the runtime system. The reason for this is that the direct call to RegExp code does not support garbage collection.
Review URL: http://codereview.chromium.org/521028

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3542 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-01-06 11:09:30 +00:00
kasperl@chromium.org
88ba93d9db Remove unused function and function declaration.
Review URL: http://codereview.chromium.org/523036

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3529 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2010-01-04 11:24:03 +00:00
erik.corry@gmail.com
b068a9f755 * Fix regexp benchmark regression where we were doing work to
make standard regexps like \s and . case independent.
* Make use of the fact that the subject string is ASCII only
when making character classes case independent.
* Avoid spending time making large ideogram or punctuation
ranges case independent when there is no case mapping anyway.
Review URL: http://codereview.chromium.org/378024

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3243 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-11-09 10:01:23 +00:00
kmillikin@chromium.org
6f5ed57683 Untangle some #include dependencies.
"jsregexp.h" and "jump-target.h" required "macro-assembler.h" to
always be included first.  Instead the include of "macro-assembler.h"
has moved into those header files.

"dateparser-inl.h" required "dateparser.h" to always be included
first.  Instead the include of "dateparser.h" has moved into
"dateparser-inl.h".

Review URL: http://codereview.chromium.org/267117

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3074 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-10-15 15:01:36 +00:00
mikhail.naganov@gmail.com
9e8216ef22 Introduce first approximation of constructor heap profile for JS objects.
It is activated with '--log-gc' flag.

JS object size is calculated as its size + size of 'properties' and 'elements' arrays, if they are non-empty. This doesn't take maps, strings, heap numbers, and other shared objects into account.

As Soeren suggested, I've moved ZoneSplayTree from jsregexp to zone, and removed now empty jsregexp-inl header file.

Review URL: http://codereview.chromium.org/159504

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2570 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-07-29 08:10:19 +00:00
lrn@chromium.org
72de7ab74e Separate native and interpreted regexp by compile time flag, not runtime.
Clean-up of RegExp code.

Review URL: http://codereview.chromium.org/155085


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2366 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-07-07 08:11:19 +00:00
lrn@chromium.org
2e37ebe1ed Added stack overflow check for RegExp analysis phase.
A very long regexp graph can overflow the stack with recursive calls.

Review URL: http://codereview.chromium.org/113894


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2064 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-05-27 11:23:26 +00:00
mikhail.naganov@gmail.com
30a0a7de43 Split nested namespaces declaration in two lines in accordance with C++ Style Guide.
This issue was raised by Brett Wilson while reviewing my changelist for readability. Craig Silverstein (one of C++ SG maintainers) confirmed that we should declare one namespace per line. Our way of namespaces closing seems not violating style guides (there is no clear agreement on it), so I left it intact.

Review URL: http://codereview.chromium.org/115756


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2038 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-05-25 10:05:56 +00:00
deanm@chromium.org
2b56660a8b Introduce two separate classes of processor detection:
- TARGET, the architecture we will generate code for.
  This is brought it from the build system.
- HOST, the architecture our C++ compiler is building for.
  This is detected automatically based on compiler defines.

This adds macros for 32 or 64 bit, and cleans up some
include conditionals, etc.

Review URL: http://codereview.chromium.org/99355


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1864 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-05-05 12:06:20 +00:00
lrn@chromium.org
ea56336518 Create build structure for X64.
Possible to attempt to build for X64.
Build will be unsuccessful, since all x64 source files are
missing and pointers are reinterpreted as integers everywhere.

Review URL: http://codereview.chromium.org/99186


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1817 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-04-29 13:11:48 +00:00
erik.corry@gmail.com
22d6cc9d4b Lint.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1798 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-04-27 09:27:18 +00:00
lrn@chromium.org
bd8816efb0 Moved String.prototype.match implementation to C++.
Some extra runtime assertions added.


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1608 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-25 12:14:10 +00:00
lrn@chromium.org
6fa2f4f0c9 RegExps now restart if their input string changes representation during preemption.
Cleaned up the handling of strings moving, so strings moved by GC and strings changing representation are handled equivalently.


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1562 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-20 13:26:16 +00:00
lrn@chromium.org
eb656c723b Moved subject and index before matches in RegExp lastMatchInfo.
Some minor changes, and removed the new handlescope in the inner loop of replace. Only really affects replaces on extremely long strings.

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1524 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-17 12:44:20 +00:00
erik.corry@gmail.com
608a99a90c Remove all uses of StringShape variables, since that has proven
to be error-prone and of little benefit in terms of performance.
Review URL: http://codereview.chromium.org/45010

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1521 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-17 09:33:06 +00:00
lrn@chromium.org
e2af4529c3 String.replace implemented in C++.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1506 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-13 10:22:38 +00:00
erik.corry@gmail.com
912c8eb03a * Reapply revisions 1383, 1384, 1391, 1398, 1401, 1402,
1418, and 1419 from bleeding_edge, reverted in 1429.
* Fix of $1 accessor on sliced strings.
* Fix of lastParen method when last parenthesis did not match.
Review URL: http://codereview.chromium.org/43075

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1491 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-11 14:00:55 +00:00
kasperl@chromium.org
e9e8628380 Revert revisions 1383, 1384, 1391, 1398, 1401, 1402,
1418, and 1419 from bleeding_edge until we have a fix
for the crashers we see on the distributed test infra-
structure.

We know that revision 1383 is causing issues, but I 
had to revert some of the other recent RegExp changes
in order to get this part out.
Review URL: http://codereview.chromium.org/39186

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1429 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-05 15:23:17 +00:00
lrn@chromium.org
265715d90c Optimized regexp.test. No longer creates an intermediate string array.
Removed some handler code.


git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1402 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-03 10:54:12 +00:00
lrn@chromium.org
50e042dfcd All RegExp data are set on a single FixedArray instead of nesting them three deep.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1398 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-03-02 13:58:37 +00:00
erik.corry@gmail.com
5b8c63f9d5 Avoids allocating a JSArray of capture information on each non-global
regular expression match.
Also moves all last-match information into one place where it can be
updated from C++ code (this will be used in another afsnit).
Review URL: http://codereview.chromium.org/28184

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1383 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-27 10:04:34 +00:00
erik.corry@gmail.com
bbc2a73f31 Remove JSCRE
Review URL: http://codereview.chromium.org/21504

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1355 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-25 08:08:01 +00:00
erik.corry@gmail.com
9c608b2c5a Limit how many places we generate code to flush the same actions. This gives a
13% code size reduction in the php regexp with no discernable performance loss.
Review URL: http://codereview.chromium.org/20457

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1309 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-19 08:24:28 +00:00
erik.corry@gmail.com
3f962f0f9c Irregexp:
* Fix UC16 character classes on ASCII subjects.
* Fix sign problem in Irregexp interpreter.
* Make passes over text nodes more readable.
Review URL: http://codereview.chromium.org/21450

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1304 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-18 16:07:03 +00:00
lrn@chromium.org
80bb2cc546 Missing handle check. Triggers bug if the runtime stack overflows and it is detected by a global regexp.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1263 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2009-02-13 09:40:15 +00:00