Named capture groups may be specified using the /(?<name>pattern)/u
syntax, with named backreferences specified as /\k<name>/u. They're
hidden behind the --harmony-regexp-named-captures flag, and are only
enabled for unicode regexps.
R=yangguo@chromium.org
BUG=
Review-Url: https://codereview.chromium.org/2050343002
Cr-Commit-Position: refs/heads/master@{#36986}
Short external strings do not cache the resource data, and may be used
for compressible strings. The assumptions about their lengths is
invalid and may lead to oob reads.
R=jkummerow@chromium.org
BUG=v8:4923,chromium:604897
LOG=N
Review URL: https://codereview.chromium.org/1901573003
Cr-Commit-Position: refs/heads/master@{#35660}
- RegExp.prototype.toString() doesn't have any special handling of
RegExp instances and simply calls the source and flags getters
- Use the original values of global and sticky, rather than based
on the current flag getters, as specified in
https://github.com/tc39/ecma262/pull/494R=yangguo@chromium.org,adamk
LOG=Y
BUG=v8:4602
Review URL: https://codereview.chromium.org/1846303002
Cr-Commit-Position: refs/heads/master@{#35225}
We expect that the majority of malloc'd memory held by V8 is allocated
in Zone objects. Introduce an Allocator class that is used by Zones to
manage memory, and allows for querying the current usage.
BUG=none
R=titzer@chromium.org,bmeurer@chromium.org,jarin@chromium.org
LOG=n
TBR=rossberg@chromium.org
Review URL: https://codereview.chromium.org/1847543002
Cr-Commit-Position: refs/heads/master@{#35196}
It's been on since M49. Also moved tests from harmony -> es6,
one of which was merged with another test of the same name.
While moving stuff over to regexp.js, I also noticed that there
were unused calls to %FunctionSetName and %SetNativeFlag (those
calls are already handled by InstallGetter()).
Review URL: https://codereview.chromium.org/1838563003
Cr-Commit-Position: refs/heads/master@{#35076}
The CharacterRange constructor checks the input for validity. However,
CharacterRange::Singleton also uses the constructor and may have
kEndMarker as input, causing the check to fail.
The solution is to move the check to CharacterRange::Range and
consistently use it across the code base.
R=jkummerow@chromium.org
BUG=chromium:593282
LOG=N
Review URL: https://codereview.chromium.org/1776013003
Cr-Commit-Position: refs/heads/master@{#34626}
We divide character ranges into
- BMP, matched normally.
- non-BMP, matched as alternatives of surrogate pair ranges.
- lone surrogates, matched with lookaround assertion that its indeed lone.
R=erik.corry@gmail.com
BUG=v8:2952
LOG=N
Review URL: https://codereview.chromium.org/1578253005
Cr-Commit-Position: refs/heads/master@{#33432}
Unexpectedly, websites depend on doing feature testing with
RegExp.prototype.sticky and browser testing with RegExp.prototype.toString().
ES2015 newly throws exceptions for both of these. In order to enable shipping
new ES2015 semantics, this patch puts in narrow workarounds for those two
cases, keeping their old behavior. UseCounters are added for how often
those particular cases come up, so we can see if it can be deprecated.
This reland replaces problematic legacy const usage with var, to
avoid issues with nosnap builds.
R=yangguo
CC=bmeurer
BUG=v8:4637,v8:4617
LOG=Y
CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_chromium_rel_ng;tryserver.blink:linux_blink_rel
Review URL: https://codereview.chromium.org/1545633002
Cr-Commit-Position: refs/heads/master@{#33002}
Reason for revert:
Breaks nosnap: http://build.chromium.org/p/client.v8/builders/V8%20Linux%20-%20nosnap/builds/5883
Original issue's description:
> Add web compat workarounds for ES2015 RegExp semantics
>
> Unexpectedly, websites depend on doing feature testing with
> RegExp.prototype.sticky and browser testing with RegExp.prototype.toString().
> ES2015 newly throws exceptions for both of these. In order to enable shipping
> new ES2015 semantics, this patch puts in narrow workarounds for those two
> cases, keeping their old behavior. UseCounters are added for how often
> those particular cases come up, so we can see if it can be deprecated.
>
> R=yangguo
> BUG=v8:4637,v8:4617
> LOG=Y
> CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_chromium_rel_ng;tryserver.blink:linux_blink_rel
>
> Committed: https://crrev.com/98f819c3e0c92d54a306cdacadda73cf96d21b52
> Cr-Commit-Position: refs/heads/master@{#32997}
TBR=yangguo@google.com,yangguo@chromium.org,littledan@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:4637,v8:4617
Review URL: https://codereview.chromium.org/1546493003
Cr-Commit-Position: refs/heads/master@{#32999}
Unexpectedly, websites depend on doing feature testing with
RegExp.prototype.sticky and browser testing with RegExp.prototype.toString().
ES2015 newly throws exceptions for both of these. In order to enable shipping
new ES2015 semantics, this patch puts in narrow workarounds for those two
cases, keeping their old behavior. UseCounters are added for how often
those particular cases come up, so we can see if it can be deprecated.
R=yangguo
BUG=v8:4637,v8:4617
LOG=Y
CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_chromium_rel_ng;tryserver.blink:linux_blink_rel
Review URL: https://codereview.chromium.org/1543723002
Cr-Commit-Position: refs/heads/master@{#32997}
The test expectations should fail consistently in both release and debug
builds. DCHECK is only meant for debug-only checks in production code.
R=yangguo@chromium.org
Review URL: https://codereview.chromium.org/1506753002
Cr-Commit-Position: refs/heads/master@{#32639}
Moves all files related to AST and scopes into ast/,
and all files related to scanner & parser to parsing/.
Also eliminates a couple of spurious dependencies.
R=mstarzinger@chromium.org
BUG=
Review URL: https://codereview.chromium.org/1481613002
Cr-Commit-Position: refs/heads/master@{#32351}
This tries to remove includes of "-inl.h" headers from normal ".h"
headers, thereby reducing the chance of any cyclic dependencies and
decreasing the average size of our compilation units.
Note that this change still leaves 7 violations of that rule in the
code. However there now is the "tools/check-inline-includes.sh" tool
detecting such violations.
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/1283033003
Cr-Commit-Position: refs/heads/master@{#30125}
Along the way:
- Thread isolate parameter explicitly through code that used to
rely on getting it from the zone.
- Canonicalize the parameter position of isolate and zone for
affected code
- Change Hydrogen New<> instruction templates to automatically
pass isolate
R=mstarzinger@chromium.org
LOG=N
Review URL: https://codereview.chromium.org/868883002
Cr-Commit-Position: refs/heads/master@{#26252}
> We also initialize the Isolate on creation.
>
> This should allow for getting rid of the last remaining default isolate
> traces. Also, it'll speed up several isolate related operations that no
> longer require locks.
>
> Embedders that relied on v8::Isolate to return an uninitialized Isolate
> (so they can set ResourceConstraints for example, or set flags that
> modify the way the isolate is created) should either do the setup before
> creating the isolate, or use the recently added CreateParams to pass e.g.
> ResourceConstraints.
>
> BUG=none
> LOG=y
> R=svenpanne@chromium.org
>
> Review URL: https://codereview.chromium.org/469783002
BUG=none
LOG=y
TBR=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/583153002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24067 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
We also initialize the Isolate on creation.
This should allow for getting rid of the last remaining default isolate
traces. Also, it'll speed up several isolate related operations that no
longer require locks.
Embedders that relied on v8::Isolate to return an uninitialized Isolate
(so they can set ResourceConstraints for example, or set flags that
modify the way the isolate is created) should either do the setup before
creating the isolate, or use the recently added CreateParams to pass e.g.
ResourceConstraints.
BUG=none
LOG=y
R=svenpanne@chromium.org
Review URL: https://codereview.chromium.org/469783002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24052 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Causes a flaky failure on buildbots. Here is the (deterministic) repro step (thanks to Michael Stanton):
first go to flag-definitions.h and set this to false.
DEFINE_BOOL(enable_sse4_1, false,
"enable use of SSE4.1 instructions if available")
Run the following and it should fail:
tools/run-tests.py --arch=ia32 --mode=release cctest/test-api/Regress2107
R=yangguo@chromium.org
BUG=
Review URL: https://codereview.chromium.org/580123002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24045 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Made operator* return reference to the raw type, not pointer. New method 'get()' should be used when raw pointer is needed.
Also removed useless inline modifier from the SmaprtPointer methods and added const modifier to the methods that don't change smart pointer.
Made ~SmartPointerBase protected to avoid accidental calls of the non-virtual base class's destructor.
drive-by: fixed use after free in src/factory.cc
BUG=None
LOG=N
R=alph@chromium.org, svenpanne@chromium.org
Review URL: https://codereview.chromium.org/101763003
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@18275 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
I tried to limit the use of v8::Isolate::GetCurrent() and v8::internal::Isolate::Current() as much as possible, but sometimes this would have involved restructuring tests quite a bit, which is better left for a separate CL.
BUG=v8:2487
Review URL: https://codereview.chromium.org/12716010
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13953 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The reason this test fails on ARM hardware but not on Intel hardware
(including the ARM simulator) is this:
'\xa0' is interpreted as a negative signed byte number. Casting it to
uc16 sign-extends it. The resulting string does not fit into a one-byte
string, thus a two-byte string is allocated.
For some reason the code compiled for ARM does not sign-extend, and 0xa0
fits into a one-byte string. Thus a one-byte string is allocated. Trying
to cast it to two-byte causes assertion failure.
BUG=
Review URL: https://chromiumcodereview.appspot.com/12319111
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13729 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The CompilationInfo record now saves a Zone, and the compiler pipeline
allocates memory from the Zone in the CompilationInfo. Before
compiling a function, we create a Zone on the stack and save a pointer
to that Zone to the CompilationInfo; which then gets picked up and
allocated from.
BUG=
TEST=
Review URL: https://chromiumcodereview.appspot.com/10534139
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11877 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
regexp can match by using a Boyer-Moore-like table. This is done by identifying
non-greedy non-capturing loops in the nodes that eat any character one at a time.
For example in the middle of the regexp /foo[\s\S]*?bar/ we find such a loop.
There is also such a loop implicitly inserted at the start of any non-anchored
regexp.
When we have found such a loop we look ahead in the nodes to find the set of
characters that can come at given distances. For example for the regexp
/.?foo/ we know that there are at least 3 characters ahead of us, and the sets
of characters that can occur are [any, [f, o], [o]]. We find a range in the
lookahead info where the set of characters is reasonably constrained. In our
example this is from index 1 to 2 (0 is not constrained). We can now look 3
characters ahead and if we don't find one of [f, o] (the union of [f, o] and
[o]) then we can skip forwards by the range size (in this case 2).
For Unicode input strings we do the same, but modulo 128.
We also look at the first string fed to the regexp and use that to get a hint
of the character frequencies in the inputs. This affects the assessment of
whether the set of characters is 'reasonably constrained'.
We still have the old lookahead mechanism, which uses a wide load of multiple
characters followed by a mask and compare to determine whether a match is
possible at this point.
Review URL: http://codereview.chromium.org/9965010
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11204 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
("narrowing conversion ... inside { } is ill-formed in C++11").
* src/mksnapshot.cc: Cast "char" to "unsigned char" when outputting snapshot.
* test/cctest/test-regexp.cc: Use static_cast to uc16 as the char
literal is signed.
Review URL: http://codereview.chromium.org/8825003
Patch from Tobias Burnus <burnus@net-b.de>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10241 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
architecture-independent.
jsregexp.h is itself included transitively quite a lot, and by getting rid of 19
of its dependencies (which even included things like src/cpu.h, the various
assemblers, etc.), the recompilation behaviour is a bit less funny than it was.
Review URL: http://codereview.chromium.org/7331014
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8589 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This commit adds current working versions of assembler, macro-assembler,
disassembler, and simulator.
All other mips arch files are replaced with stubbed-out versions that
will build.
Arch independent files are updated as needed to support building and
running mips.
The only test is cctest/test-assembler-mips, and this passes on the
simulator and on mips hardware.
TEST=none
BUG=none
Patch by Paul Lind from MIPS.
Review URL: http://codereview.chromium.org/6730029/
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7388 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Added Nl category to letters predicate (as requried for JS identifiers).
Changed/simplified representation of canonicalization ranges.
Truncated tables to code points in the BMP (all that is used by JS).
Reformatted tables to avoid excessively long lines.
Removed duplicate entries from multi-character mapping result tables.
Review URL: http://codereview.chromium.org/3030026
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5155 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
As the start index is already passed it is easy to calculate the "at start" boolean in generated code. Also as direct entry has been implemented this needs to be done in generated code anyway, and therefore might as well be moved to the generated code for RegExp. The "at start" value is now calcualted as a local variable on the native RegExp frame based on the value of the start index argument.
The x64 version have been tested on both Linux and 64-bit Windows Vista.
For ARM I have tested cctest/test-regexp on ARM hardware, but the rest of the tests have only been run on the ARM simulator.
Review URL: http://codereview.chromium.org/554078
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3709 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
make standard regexps like \s and . case independent.
* Make use of the fact that the subject string is ASCII only
when making character classes case independent.
* Avoid spending time making large ideogram or punctuation
ranges case independent when there is no case mapping anyway.
Review URL: http://codereview.chromium.org/378024
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3243 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
It is activated with '--log-gc' flag.
JS object size is calculated as its size + size of 'properties' and 'elements' arrays, if they are non-empty. This doesn't take maps, strings, heap numbers, and other shared objects into account.
As Soeren suggested, I've moved ZoneSplayTree from jsregexp to zone, and removed now empty jsregexp-inl header file.
Review URL: http://codereview.chromium.org/159504
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2570 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- TARGET, the architecture we will generate code for.
This is brought it from the build system.
- HOST, the architecture our C++ compiler is building for.
This is detected automatically based on compiler defines.
This adds macros for 32 or 64 bit, and cleans up some
include conditionals, etc.
Review URL: http://codereview.chromium.org/99355
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1864 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Cleaned up the handling of strings moving, so strings moved by GC and strings changing representation are handled equivalently.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1562 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Some minor changes, and removed the new handlescope in the inner loop of replace. Only really affects replaces on extremely long strings.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1524 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
decimal escape be accepted as a capture index.
We introduce a limit on the nubmer of allowed captures in a regexp, and break off
parsing of the decimal escape at that point.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1189 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
single atom node. A flag was not set in this case, leading the wrapper
code to think the pattern was equal to the atom and use the pattern
in the indexOf operation.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@971 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
initial node is interested in what precedes it the automaton is
given an initial all-consuming character class that determines it.
- Added verification of some node information invariants. We now
check that if a node expresses interest in what precedes it that
information is available to it after assertion expansion.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@964 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
* Facility for generating a node several ways. This allows
code to be generated for a node knowing where it is trying
to match relative to the 'current position' and it allows
code to be generated that knows where to backtrack to. Both
allow dramatic reductions in the amount of popping and pushing
on the stack and the number of indirect jumps.
* Generate special backtracking for greedy quantifiers on
constant-length atoms. This allows .* to run in constant
space relative to input string size.
* When we are checking a long sequence of characters or character
classes in the input then we do them right to left and only the
first (rightmost) needs to check for end-of-string.
* Record the pattern in the profile instead of just <CompiledRegExp>
* Nodes no longer contain an on_failure_ node. This was only used
for lookaheads and they are now handled with a choice node instead.
Review URL: http://codereview.chromium.org/12900
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@930 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- Splitting of character classes into word and non-word parts.
- A bunch of refactorings.
- Made dispatch table construction lazy.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@880 ce2b1a6d-e550-0410-aec6-3dcde31c8c00