Commit Graph

24 Commits

Author SHA1 Message Date
jgruber
159236ec25 [regexp] Update semantics of GetSubstitution with named captures
The specced semantics of GetSubstitution are expected to change in the
case of malformed named references, or named references to nonexistent
named groups. The former will evaluate to the identity replacement of
'$<', while the latter will result in replacement by the empty string.

See also:
https://github.com/tc39/proposal-regexp-named-groups/issues/29

Bug: v8:5437, v8:6912
Cq-Include-Trybots: master.tryserver.v8:v8_linux_noi18n_rel_ng
Change-Id: I879288f775774cb0ec563f9d9129a99710efb77c
Reviewed-on: https://chromium-review.googlesource.com/708654
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#48426}
2017-10-10 11:37:29 +00:00
littledan
f918404590 Revert of [regexp] Support unicode capture names in non-unicode patterns (patchset #3 id:40001 of https://codereview.chromium.org/2791163003/ )
Reason for revert:
The decision for the specification was to not have this syntax, and instead the syntax before this patch.

Original issue's description:
> [regexp] Support unicode capture names in non-unicode patterns
>
> This ensures that capture names containing surrogate pairs are parsed
> correctly even in non-unicode RegExp patterns by introducing a new
> scanning mode which unconditionally combines surrogate pairs.
>
> BUG=v8:5437,v8:6192
>
> Review-Url: https://codereview.chromium.org/2791163003
> Cr-Commit-Position: refs/heads/master@{#44466}
> Committed: a8651c5671

R=yangguo@chromium.org,jgruber@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=v8:5437,v8:6192

Review-Url: https://codereview.chromium.org/2859933003
Cr-Commit-Position: refs/heads/master@{#45088}
2017-05-04 12:33:38 +00:00
jgruber
9372dd95d9 [regexp] Fix unicode escapes in test strings
Some of these tests pass the pattern as a string, and in this case
there's a subtle distinction between

"/\u{0041}/"  // Unicode escape interpreted in string literal.

and

"/\\u{0041}/"  // Unicode escape interpreted by regexp parser.

Extend these tests to check both cases.

Thanks littledan@ for pointing this out.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2839923002
Cr-Commit-Position: refs/heads/master@{#44840}
2017-04-25 11:20:34 +00:00
jgruber
f3b848fe5d [regexp] Updates for unicode escapes in capture names
Update docs and tests for recent changes in the spec for unicode escapes
in capture group names.

https://github.com/tc39/proposal-regexp-named-groups/issues/23

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2788423003
Cr-Commit-Position: refs/heads/master@{#44474}
2017-04-07 08:57:42 +00:00
jgruber
1329d15e99 [regexp] Throw on invalid capture group names in replacer string
References to invalid names (i.e. not specified as a named group in the
pattern) throw a SyntaxError. Unmatched groups are still replaced by the
empty string.

See https://github.com/tc39/proposal-regexp-named-groups/issues/14.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2791183002
Cr-Commit-Position: refs/heads/master@{#44471}
2017-04-07 08:32:46 +00:00
jgruber
a8651c5671 [regexp] Support unicode capture names in non-unicode patterns
This ensures that capture names containing surrogate pairs are parsed
correctly even in non-unicode RegExp patterns by introducing a new
scanning mode which unconditionally combines surrogate pairs.

BUG=v8:5437,v8:6192

Review-Url: https://codereview.chromium.org/2791163003
Cr-Commit-Position: refs/heads/master@{#44466}
2017-04-07 07:34:10 +00:00
jgruber
d890ec3261 [regexp] Disallow '\' in capture names
IdentifierStart::Is and IdentifierContinue::Is both return true for '\'.
The reason for this is lost to history.

Special-case '\' in the regexp parser to handle this.

BUG=v8:5437,v8:5868

Review-Url: https://codereview.chromium.org/2795093003
Cr-Commit-Position: refs/heads/master@{#44396}
2017-04-05 07:01:50 +00:00
jgruber
a3be9e78c1 [regexp] Allow named captures and back-references in non-unicode patterns
Previously, named captures (and related functionality) were restricted to
unicode-mode regexps.

This CL extends that support to non-unicode patterns. Named groups are
supported regardless of the mode, and named back-references are supported if
the regexp is in unicode mode or if it contains a named capture (otherwise '\k'
is treated as an identity escape).

BUG=v8:5437,v8:6192

Review-Url: https://codereview.chromium.org/2788873002
Cr-Commit-Position: refs/heads/master@{#44324}
2017-04-03 08:03:09 +00:00
jgruber
3f8b2aeb35 [regexp] Fix numbered reference before named capture
Numbered back-references that occur before the referenced capture
trigger an internal mini-parser that looks ahead in the pattern and
counts capturing groups.

This updates the mini-parser to correctly handle named captures.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2792523002
Cr-Commit-Position: refs/heads/master@{#44303}
2017-03-31 10:50:05 +00:00
jgruber
cb812f8e58 [regexp] Extend tests for named captures
Additional tests, mostly for interactions with lookbehind assertions.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2784813002
Cr-Commit-Position: refs/heads/master@{#44290}
2017-03-31 07:57:15 +00:00
jgruber
19f626f076 [regexp] Handle unmatched groups in callable replacers
BUG=v8:5437

Review-Url: https://codereview.chromium.org/2776263003
Cr-Commit-Position: refs/heads/master@{#44194}
2017-03-28 13:29:22 +00:00
jgruber
9403edfa83 [regexp] Named capture support for string replacements
This implements support for named captures in
RegExp.prototype[@@replace] for when the replaceValue is not callable.

Named captures can be referenced from replacement strings by using the
"$<name>" syntax. A couple of examples:

let re = /(?<fst>.)(?<snd>.)/u;
"abcd".replace(re, "$<snd>$<fst>")  // "bacd"
"abcd".replace(re, "$2$1")     // "bacd" (numbered refs work as always)
"abcd".replace(re, "$<snd")    // SyntaxError (unterminated named ref)
"abcd".replace(re, "$<42$1>")  // "cd" (invalid name)
"abcd".replace(re, "$<thd>")   // "cd" (non-existent name)
"abcd".replace(/(?<fst>.)|(?<snd>.)/u, "$<snd>")  // "cd" (non-matched capture)

Support is currently behind the --harmony-regexp-named-captures flag.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2775303002
Cr-Original-Commit-Position: refs/heads/master@{#44171}
Committed: 17f13863b6
Review-Url: https://codereview.chromium.org/2775303002
Cr-Commit-Position: refs/heads/master@{#44182}
2017-03-28 09:09:42 +00:00
jgruber
34ffdd6238 Revert of [regexp] Named capture support for string replacements (patchset #5 id:80001 of https://codereview.chromium.org/2775303002/ )
Reason for revert:
Invalid DCHECKs for non-matched groups.

Original issue's description:
> [regexp] Named capture support for string replacements
>
> This implements support for named captures in
> RegExp.prototype[@@replace] for when the replaceValue is not callable.
>
> Named captures can be referenced from replacement strings by using the
> "$<name>" syntax. A couple of examples:
>
> let re = /(?<fst>.)(?<snd>.)/u;
> "abcd".replace(re, "$<snd>$<fst>")  // "bacd"
> "abcd".replace(re, "$2$1")     // "bacd" (numbered refs work as always)
> "abcd".replace(re, "$<snd")    // SyntaxError (unterminated named ref)
> "abcd".replace(re, "$<42$1>")  // "cd" (invalid name)
> "abcd".replace(re, "$<thd>")   // "cd" (non-existent name)
> "abcd".replace(/(?<fst>.)|(?<snd>.)/u, "$<snd>")  // "cd" (non-matched capture)
>
> Support is currently behind the --harmony-regexp-named-captures flag.
>
> BUG=v8:5437
>
> Review-Url: https://codereview.chromium.org/2775303002
> Cr-Commit-Position: refs/heads/master@{#44171}
> Committed: 17f13863b6

TBR=yangguo@chromium.org,littledan@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5437

Review-Url: https://codereview.chromium.org/2776293003
Cr-Commit-Position: refs/heads/master@{#44180}
2017-03-28 09:02:14 +00:00
jgruber
17f13863b6 [regexp] Named capture support for string replacements
This implements support for named captures in
RegExp.prototype[@@replace] for when the replaceValue is not callable.

Named captures can be referenced from replacement strings by using the
"$<name>" syntax. A couple of examples:

let re = /(?<fst>.)(?<snd>.)/u;
"abcd".replace(re, "$<snd>$<fst>")  // "bacd"
"abcd".replace(re, "$2$1")     // "bacd" (numbered refs work as always)
"abcd".replace(re, "$<snd")    // SyntaxError (unterminated named ref)
"abcd".replace(re, "$<42$1>")  // "cd" (invalid name)
"abcd".replace(re, "$<thd>")   // "cd" (non-existent name)
"abcd".replace(/(?<fst>.)|(?<snd>.)/u, "$<snd>")  // "cd" (non-matched capture)

Support is currently behind the --harmony-regexp-named-captures flag.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2775303002
Cr-Commit-Position: refs/heads/master@{#44171}
2017-03-28 08:02:03 +00:00
jgruber
80879b8c26 [regexp] Named capture support for callable replacements
This implements support for named captures in
RegExp.prototype[@@replace] for when the replaceValue is callable.

In that case, the result.groups object is passed to the replacer
function as the last argument.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2764343004
Cr-Commit-Position: refs/heads/master@{#44142}
2017-03-27 11:18:31 +00:00
jgruber
8c0f2315fc [regexp] Rename result.group to result.groups
This is just an update to reflect the current spec proposal.
https://tc39.github.io/proposal-regexp-named-groups/

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2769143002
Cr-Commit-Position: refs/heads/master@{#44067}
2017-03-23 15:42:07 +00:00
jgruber
d52ec9e6cf [regexp] Store named captures on the regexp result
This implements storing named captures on the regexp result object.
For instance, /(?<a>.)/u.exec("b") will return a result such that:

result.group.a  // "b"

https://tc39.github.io/proposal-regexp-named-groups/

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2630233003
Cr-Original-Original-Original-Commit-Position: refs/heads/master@{#42532}
Committed: 70000946eb
Review-Url: https://codereview.chromium.org/2630233003
Cr-Original-Original-Commit-Position: refs/heads/master@{#42570}
Committed: ee94fa11ed
Review-Url: https://codereview.chromium.org/2630233003
Cr-Original-Commit-Position: refs/heads/master@{#42676}
Committed: 8bf52534f6
Review-Url: https://codereview.chromium.org/2630233003
Cr-Commit-Position: refs/heads/master@{#42838}
2017-02-01 08:54:38 +00:00
jgruber
25bfdf1b46 Revert of [regexp] Create property on result for each named capture (patchset #9 id:160001 of https://codereview.chromium.org/2630233003/ )
Reason for revert:
Some heap tests are broken leading to failures on nosnap builds:

https://build.chromium.org/p/client.v8.ports/builders/V8%20Linux%20-%20arm64%20-%20sim%20-%20nosnap%20-%20debug/builds/3677

Reverting again until tests are fixed to keep bots green.

Original issue's description:
> [regexp] Store named captures on the regexp result
>
> This implements storing named captures on the regexp result object.
> For instance, /(?<a>.)/u.exec("b") will return a result such that:
>
> result.group.a  // "b"
>
> https://tc39.github.io/proposal-regexp-named-groups/
>
> BUG=v8:5437
>
> Review-Url: https://codereview.chromium.org/2630233003
> Cr-Original-Original-Commit-Position: refs/heads/master@{#42532}
> Committed: 70000946eb
> Review-Url: https://codereview.chromium.org/2630233003
> Cr-Original-Commit-Position: refs/heads/master@{#42570}
> Committed: ee94fa11ed
> Review-Url: https://codereview.chromium.org/2630233003
> Cr-Commit-Position: refs/heads/master@{#42676}
> Committed: 8bf52534f6

TBR=yangguo@chromium.org,littledan@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5437

Review-Url: https://codereview.chromium.org/2654233002
Cr-Commit-Position: refs/heads/master@{#42681}
2017-01-26 09:31:08 +00:00
jgruber
8bf52534f6 [regexp] Store named captures on the regexp result
This implements storing named captures on the regexp result object.
For instance, /(?<a>.)/u.exec("b") will return a result such that:

result.group.a  // "b"

https://tc39.github.io/proposal-regexp-named-groups/

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2630233003
Cr-Original-Original-Commit-Position: refs/heads/master@{#42532}
Committed: 70000946eb
Review-Url: https://codereview.chromium.org/2630233003
Cr-Original-Commit-Position: refs/heads/master@{#42570}
Committed: ee94fa11ed
Review-Url: https://codereview.chromium.org/2630233003
Cr-Commit-Position: refs/heads/master@{#42676}
2017-01-26 07:59:21 +00:00
jgruber
50e0fe29bb Revert of [regexp] Create property on result for each named capture (patchset #7 id:120001 of https://codereview.chromium.org/2630233003/ )
Reason for revert:
Breaks arm64.

Original issue's description:
> [regexp] Store named captures on the regexp result
>
> This implements storing named captures on the regexp result object.
> For instance, /(?<a>.)/u.exec("b") will return a result such that:
>
> result.group.a  // "b"
>
> The spec proposal is not yet final, so this may still change in the future.
>
> BUG=v8:5437
>
> Review-Url: https://codereview.chromium.org/2630233003
> Cr-Original-Commit-Position: refs/heads/master@{#42532}
> Committed: 70000946eb
> Review-Url: https://codereview.chromium.org/2630233003
> Cr-Commit-Position: refs/heads/master@{#42570}
> Committed: ee94fa11ed

TBR=yangguo@chromium.org,littledan@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5437

Review-Url: https://codereview.chromium.org/2639403008
Cr-Commit-Position: refs/heads/master@{#42577}
2017-01-20 19:03:14 +00:00
jgruber
ee94fa11ed [regexp] Store named captures on the regexp result
This implements storing named captures on the regexp result object.
For instance, /(?<a>.)/u.exec("b") will return a result such that:

result.group.a  // "b"

The spec proposal is not yet final, so this may still change in the future.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2630233003
Cr-Original-Commit-Position: refs/heads/master@{#42532}
Committed: 70000946eb
Review-Url: https://codereview.chromium.org/2630233003
Cr-Commit-Position: refs/heads/master@{#42570}
2017-01-20 16:11:13 +00:00
jgruber
9c68654c39 Revert of [regexp] Create property on result for each named capture (patchset #5 id:80001 of https://codereview.chromium.org/2630233003/ )
Reason for revert:
Breaks no18n build: https://build.chromium.org/p/client.v8/builders/V8%20Linux%20-%20noi18n%20-%20debug/builds/11604

Original issue's description:
> [regexp] Store named captures on the regexp result
>
> This implements storing named captures on the regexp result object.
> For instance, /(?<a>.)/u.exec("b") will return a result such that:
>
> result.group.a  // "b"
>
> The spec proposal is not yet final, so this may still change in the future.
>
> BUG=v8:5437
>
> Review-Url: https://codereview.chromium.org/2630233003
> Cr-Commit-Position: refs/heads/master@{#42532}
> Committed: 70000946eb

TBR=yangguo@chromium.org,littledan@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5437

Review-Url: https://codereview.chromium.org/2643213002
Cr-Commit-Position: refs/heads/master@{#42534}
2017-01-20 08:42:03 +00:00
jgruber
70000946eb [regexp] Store named captures on the regexp result
This implements storing named captures on the regexp result object.
For instance, /(?<a>.)/u.exec("b") will return a result such that:

result.group.a  // "b"

The spec proposal is not yet final, so this may still change in the future.

BUG=v8:5437

Review-Url: https://codereview.chromium.org/2630233003
Cr-Commit-Position: refs/heads/master@{#42532}
2017-01-20 08:04:07 +00:00
jgruber
ae23436cbf [regexp] Experimental support for regexp named captures
Named capture groups may be specified using the /(?<name>pattern)/u
syntax, with named backreferences specified as /\k<name>/u. They're
hidden behind the --harmony-regexp-named-captures flag, and are only
enabled for unicode regexps.

R=yangguo@chromium.org
BUG=

Review-Url: https://codereview.chromium.org/2050343002
Cr-Commit-Position: refs/heads/master@{#36986}
2016-06-15 06:49:55 +00:00