In the RegExpUnparser::VisitText(RegExpText* that, void* data) function always RegExpUnparser::VisitAtom function called via that->elements()->at(i).data.u_atom->Accept(this, data); even if the type of the object is RegExpCharacterClass.
The problem shows using g++ 4.7(.2, .3) since r16232, since GCC optimizes virtual method calls to direct calls based on __final/final hints. Tested on MIPS and x64:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000588928 in v8::internal::RegExpUnparser::VisitAtom(v8::internal::RegExpAtom*, void*) ()
This cleans up the TextElement class to avoid the unsafe+unchecked union access, that caused the crash.
TEST=cctest/test-regexp/ParserRegression
R=jkummerow@chromium.org
Review URL: https://codereview.chromium.org/22815033
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16289 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The CompilationInfo record now saves a Zone, and the compiler pipeline
allocates memory from the Zone in the CompilationInfo. Before
compiling a function, we create a Zone on the stack and save a pointer
to that Zone to the CompilationInfo; which then gets picked up and
allocated from.
BUG=
TEST=
Review URL: https://chromiumcodereview.appspot.com/10534139
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11877 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
regexp can match by using a Boyer-Moore-like table. This is done by identifying
non-greedy non-capturing loops in the nodes that eat any character one at a time.
For example in the middle of the regexp /foo[\s\S]*?bar/ we find such a loop.
There is also such a loop implicitly inserted at the start of any non-anchored
regexp.
When we have found such a loop we look ahead in the nodes to find the set of
characters that can come at given distances. For example for the regexp
/.?foo/ we know that there are at least 3 characters ahead of us, and the sets
of characters that can occur are [any, [f, o], [o]]. We find a range in the
lookahead info where the set of characters is reasonably constrained. In our
example this is from index 1 to 2 (0 is not constrained). We can now look 3
characters ahead and if we don't find one of [f, o] (the union of [f, o] and
[o]) then we can skip forwards by the range size (in this case 2).
For Unicode input strings we do the same, but modulo 128.
We also look at the first string fed to the regexp and use that to get a hint
of the character frequencies in the inputs. This affects the assessment of
whether the set of characters is 'reasonably constrained'.
We still have the old lookahead mechanism, which uses a wide load of multiple
characters followed by a mask and compare to determine whether a match is
possible at this point.
Review URL: http://codereview.chromium.org/9965010
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11204 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
architecture-independent.
jsregexp.h is itself included transitively quite a lot, and by getting rid of 19
of its dependencies (which even included things like src/cpu.h, the various
assemblers, etc.), the recompilation behaviour is a bit less funny than it was.
Review URL: http://codereview.chromium.org/7331014
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8589 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Added Nl category to letters predicate (as requried for JS identifiers).
Changed/simplified representation of canonicalization ranges.
Truncated tables to code points in the BMP (all that is used by JS).
Reformatted tables to avoid excessively long lines.
Removed duplicate entries from multi-character mapping result tables.
Review URL: http://codereview.chromium.org/3030026
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5155 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Remove messages.h from v8.h and include it explicitly in only the few places
it is needed. Many files relied on getting handles-inl.h implicitly from
messages.h through v8.h, so include handles-inl.h explicitly in v8.h
instead.
Remove zone-inl.h from header files where it is not needed, can be replaced
by a forward declaration, or can be replaced by zone.h (specifically,
factory.h and heap.h). Include zone.h or zone-inl.h in header files where
it was implicitly included via heap.h or factory.h. Prefer zone.h over
zone-inl.h in header files where possible by including zone-inl.h in .cc
files.
Review URL: http://codereview.chromium.org/668248
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@4058 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Calls to RegExp no longer have to be via a call to the runtime system. A new stub have been added which can handle this call in generated code. The stub checks all the parameters and creates RegExp entry frame in the same way as it is created by the runtime system. Bailout to the runtime system is done whenever an uncommon situation is encountered or when the static data used is not initialized. After running the native RegExp code the last match info is updated like in the runtime system.
Currently only ASCII strings are handled.
Added another argument to the RegExp entry frame. It indicated whether the call is direct from JavaScript code or through the runtime system. This information is used when RegExp execution is interrupted. If an interruption happens when RegExp code is called directly a retry is issued causing the interruption to be handled via the runtime system. The reason for this is that the direct call to RegExp code does not support garbage collection.
Review URL: http://codereview.chromium.org/521028
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3542 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
make standard regexps like \s and . case independent.
* Make use of the fact that the subject string is ASCII only
when making character classes case independent.
* Avoid spending time making large ideogram or punctuation
ranges case independent when there is no case mapping anyway.
Review URL: http://codereview.chromium.org/378024
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3243 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
"jsregexp.h" and "jump-target.h" required "macro-assembler.h" to
always be included first. Instead the include of "macro-assembler.h"
has moved into those header files.
"dateparser-inl.h" required "dateparser.h" to always be included
first. Instead the include of "dateparser.h" has moved into
"dateparser-inl.h".
Review URL: http://codereview.chromium.org/267117
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3074 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
It is activated with '--log-gc' flag.
JS object size is calculated as its size + size of 'properties' and 'elements' arrays, if they are non-empty. This doesn't take maps, strings, heap numbers, and other shared objects into account.
As Soeren suggested, I've moved ZoneSplayTree from jsregexp to zone, and removed now empty jsregexp-inl header file.
Review URL: http://codereview.chromium.org/159504
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2570 ce2b1a6d-e550-0410-aec6-3dcde31c8c00