This CL implements early SyntaxErrors for regular expressions. Early
errors are thrown when a malformed pattern is parsed, rather than when
the code first runs.
We do this by having the JS parser call into the regexp parser when
a regexp pattern is found. Regexps are expected to be relatively
rare, small, and cheap to parse - that's why we currently accept that
the regexp parser does unnecessary work (e.g. creating the AST
structures).
If needed, we can optimize in the future. Ideas:
- Split up the regexp parser to avoid useless work for syntax validation.
- Preserve parser results to avoid reparsing later.
Bug: v8:896
Change-Id: I3d1ec18c980ba94439576ac3764138552418b85d
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3106647
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Patrick Thier <pthier@chromium.org>
Cr-Commit-Position: refs/heads/main@{#76502}
This is a reland of 7d1f95d6e4
The reland fixes a performance issue in that we incorrectly marked
every pattern containing a backslash as needing to be escaped,
resulting in a new string allocation instead of reusing the existing
string.
Original change's description:
> [regexp] Correctly escape a backslash-newline sequence
>
> When printing the source string, a backslash-newline sequence ('\\\n',
> '\\\r', '\\\u2028', '\\\u2029') should be formatted as '\n', '\r',
> '\u2028', '\u2029', respectively. Prior to this CL it was formatted as
> a backslash followed by the literal newline character.
>
> Bug: v8:8615
> Change-Id: Iac90195c56ea1707ea8469066b0cc967ea87fc73
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2016583
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Georg Neis <neis@chromium.org>
> Auto-Submit: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#65986}
Bug: v8:8615,chromium:1046678
Change-Id: I5d75904f1ea543ec679649668e54749821116442
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2074159
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66476}
This reverts commit 7d1f95d6e4.
Reason for revert: Speculative revert for https://crbug.com/1046678
Original change's description:
> [regexp] Correctly escape a backslash-newline sequence
>
> When printing the source string, a backslash-newline sequence ('\\\n',
> '\\\r', '\\\u2028', '\\\u2029') should be formatted as '\n', '\r',
> '\u2028', '\u2029', respectively. Prior to this CL it was formatted as
> a backslash followed by the literal newline character.
>
> Bug: v8:8615
> Change-Id: Iac90195c56ea1707ea8469066b0cc967ea87fc73
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2016583
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Georg Neis <neis@chromium.org>
> Auto-Submit: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#65986}
TBR=neis@chromium.org,jgruber@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: v8:8615,chromium:1046678
Change-Id: If28626a1c6868ed848310c0d30cf61a73326f2c1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2027452
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66022}
When printing the source string, a backslash-newline sequence ('\\\n',
'\\\r', '\\\u2028', '\\\u2029') should be formatted as '\n', '\r',
'\u2028', '\u2029', respectively. Prior to this CL it was formatted as
a backslash followed by the literal newline character.
Bug: v8:8615
Change-Id: Iac90195c56ea1707ea8469066b0cc967ea87fc73
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2016583
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#65986}
EscapeRegExpPattern should return a string representation of a
RegExp instance that in turn can be used to construct a new
RegExp instance with the same internal state as the original one.
Previous versions incorrectly escaped '/' also inside character classes
(e.g. /[/]/ returned "[\/]").
This patch properly escapes '/' when necessary and omits unnecessary
escapes.
Bug: v8:8615, v8:1982, v8:9446
Change-Id: I4ecb993dc69d6976f4637cedf43465cd0c32e427
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1688050
Commit-Queue: Patrick Thier <pthier@google.com>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62587}
This is a reland of 0e22ec737b
Original change's description:
> [regexp] Escape newlines when setting [[OriginalSource]]
>
> This escapes LineTerminator characters in a regexp pattern when
> creating the string that will be stored in the [[OriginalSource]] slot.
>
> As an example, the source property for all following objects will equal
> "\n" (a '\' character followed by 'n'):
>
> /\n/
> new RegExp("\n")
> new RegExp("\\n")
>
> Bug: v8:1982, chromium:855009
> Change-Id: I3b539497a0697e3d51ec969cae49308b0b312a19
> Reviewed-on: https://chromium-review.googlesource.com/c/1384316
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Reviewed-by: Mathias Bynens <mathias@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#58387}
Bug: v8:1982, chromium:855009
Change-Id: I1ba22395477ec37e8e8c944000f9beade1e3250b
Reviewed-on: https://chromium-review.googlesource.com/c/1386495
Reviewed-by: Yang Guo <yangguo@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#58419}
This reverts commit 0e22ec737b.
Reason for revert: Breaks layout tests:
https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8-Blink%20Linux%2064/28814
Original change's description:
> [regexp] Escape newlines when setting [[OriginalSource]]
>
> This escapes LineTerminator characters in a regexp pattern when
> creating the string that will be stored in the [[OriginalSource]] slot.
>
> As an example, the source property for all following objects will equal
> "\n" (a '\' character followed by 'n'):
>
> /\n/
> new RegExp("\n")
> new RegExp("\\n")
>
> Bug: v8:1982, chromium:855009
> Change-Id: I3b539497a0697e3d51ec969cae49308b0b312a19
> Reviewed-on: https://chromium-review.googlesource.com/c/1384316
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Reviewed-by: Mathias Bynens <mathias@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#58387}
TBR=yangguo@chromium.org,jgruber@chromium.org,mathias@chromium.org
Change-Id: I1db7e6a0c6cd1cd995fe9f499458108e88dc8cb9
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:1982, chromium:855009
Reviewed-on: https://chromium-review.googlesource.com/c/1386493
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Commit-Queue: Michael Achenbach <machenbach@chromium.org>
Cr-Commit-Position: refs/heads/master@{#58396}
This escapes LineTerminator characters in a regexp pattern when
creating the string that will be stored in the [[OriginalSource]] slot.
As an example, the source property for all following objects will equal
"\n" (a '\' character followed by 'n'):
/\n/
new RegExp("\n")
new RegExp("\\n")
Bug: v8:1982, chromium:855009
Change-Id: I3b539497a0697e3d51ec969cae49308b0b312a19
Reviewed-on: https://chromium-review.googlesource.com/c/1384316
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/master@{#58387}
The condition only applies in unicode mode, where any lone surrogates
are desugared into a character class (and will not be considered in this
optimization). Non-unicode mode treats lone surrogates exactly like
any other codepoint.
BUG=chromium:711092
Review-Url: https://codereview.chromium.org/2808403006
Cr-Commit-Position: refs/heads/master@{#44638}
The test case did not test anything in its original form. Fix it and add
documentation.
BUG=v8:5339
Review-Url: https://codereview.chromium.org/2481733002
Cr-Commit-Position: refs/heads/master@{#40794}
This moves the implementation of @@replace from regexp.js to builtins-regexp.cc
(the TurboFan fast path) and runtime-regexp.cc (slow path). The fast path
handles all cases in which the regexp itself is an unmodified JSRegExp
instance, the given 'replace' argument is not callable and does not contain any
'$' characters (i.e. we are doing a string replacement).
BUG=v8:5339
Review-Url: https://codereview.chromium.org/2398423002
Cr-Commit-Position: refs/heads/master@{#40253}
Encapsulate the helper functions in mjsunit.js.
Now only exposes the exception class and the assertXXX functions.
Make assertEquals use === instead of ==.
This prevents a lot of possiblefalse positives in tests, and avoids
having to do assertTrue(expected === actual) when you need it.
Fixed some tests that were either buggy or assuming == test.
Review URL: http://codereview.chromium.org/6869007
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7628 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
decimal escape be accepted as a capture index.
We introduce a limit on the nubmer of allowed captures in a regexp, and break off
parsing of the decimal escape at that point.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1189 ce2b1a6d-e550-0410-aec6-3dcde31c8c00