v8/test/mjsunit/regexp-backtrack-limit.js

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

33 lines
1.1 KiB
JavaScript
Raw Normal View History

// Copyright 2019 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
[regexp] Support capture groups in experimental engine This commit adds support for capture groups (as in e.g. /x(123|abc)y/) in the experimental regexp engine. Now every InterpreterThread owns a register array containing (sub)match boundaries. There is a new instruction to record the current input index in some register. Submatches in quantifier bodies should be reported only if they occur during the last repetition. Thus we reset those registers before attempting to match the body of a quantifier. This is implemented with another new instruction. Because of concerns for the growing sizeof the NfaInterpreter object (which is allocated on the stack), this commit replaces the `SmallVector` members of the NfaInterpreter with zone-allocated arrays. Register arrays, which for a fixed regexp are all the same size, are allocated with a RecyclingZoneAllocator for cheap memory reclamation via a linked list of equally-sized free blocks. Possible optimizations for management of register array memory: 1. If there are few register per thread, then it is likely faster to store them inline in the InterpreterThread struct. 2. re2 implements copy-on-write: InterpreterThreads can share the same register array. If a thread attempts to write to shared register array, the register array is cloned first. 3. The register at index 1 contains the end of the match; this is only written to right before an ACCEPT statement. We could make ACCEPT equivalent to what's currently CAPTURE 1 followed by ACCEPT. We could then save the memory for register 1 for threads that haven't finished yet. This is particularly interesting if now optimization 1 kicks in. Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng Bug: v8:10765 Change-Id: I2c0503206ce331e13ac9912945bb66736d740197 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2390770 Commit-Queue: Martin Bidlingmaier <mbid@google.com> Reviewed-by: Jakob Gruber <jgruber@chromium.org> Cr-Commit-Position: refs/heads/master@{#69929}
2020-09-15 19:03:51 +00:00
// Flags: --allow-natives-syntax --no-enable-experimental-regexp-engine
// Flags: --no-enable-experimental-regexp-engine-on-excessive-backtracks
const kNoBacktrackLimit = 0; // To match JSRegExp::kNoBacktrackLimit.
const re0 = %NewRegExpWithBacktrackLimit("(\\d+)+x", "", kNoBacktrackLimit);
const re1 = %NewRegExpWithBacktrackLimit("(\\d+)+x", "", 50);
// Backtracks remain below the limit on this subject string.
{
let s = "3333ax3333x";
assertArrayEquals(["3333x", "3333"], re0.exec(s));
assertEquals(["3333x", "3333"], re1.exec(s));
}
// A longer subject exceeds the limit.
{
let s = "333333333ax3333x";
assertArrayEquals(["3333x", "3333"], re0.exec(s));
assertEquals(null, re1.exec(s));
}
// ATOM regexp construction with a limit; in this case the limit should just be
// ignored, ATOMs never backtrack.
{
const re = %NewRegExpWithBacktrackLimit("ax", "", 50);
let s = "3333ax3333x";
assertArrayEquals(["ax"], re.exec(s));
}