Previously to this commit only quantifiers of the form /<x>*/, i.e.
arbitrarily often greedy repetition, were implemented. Now a much
larger class is supported, e.g. + and ? and their non-greedy variants.
Because it came up repeatedly during the implementation, the commit also
adds the Label and DeferredLabel classes to patch JMP and FORK target
addresses more easily.
Still not supported are the following quantifiers:
- Possessive quantifiers, where I'm not entirely sure whether they could
be implemented in principle. Re2 doesn't support them.
- Quantifiers with large but finite numbers for min and max numbers of
repetitions, as in e.g. /<x>{9000, 90000}/. These are currently
limited to some small value. This is because the body of such
repetitions is unrolled explicitly, so the size of the bytecode is
linear in the number of repetitions.
Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng
Bug: v8:10765
Change-Id: Id04d893252588abb0f80c3cb33cfc707f6601ea0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2387575
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69759}
Previously we checked whether a thread's pc IsPcProcessed before pushing
to the stack of (postponed) active_threads_. This commit moves the
IsPcProcessed check and corresponding MarkPcProcessed call to when the
thread is actually processed, i.e. when it is popped from the
active_threads_ stack again.
This fixes two issues:
- Consider what used to happen in the following scenario:
1. An active thread t is postponed (e.g. because it is a fork) and
pushed on active_threads_. IsPcProcessed(t.pc) is false, so t is
not discarded and does actually end up on active_threads_.
2. Some other thread s is executed, and at some point s.pc == t.pc,
i.e. t.pc is marked as processed.
3. t is popped from active_threads_ for processing.
In 3 we don't want to continue execution of t: After all, its pc is
already marked as processed. But because previously we only checked
for IsPcProcessed in step 1 before pushing to active_threads_, we used
to continue execution in 3. I don't think this is a correctness
issue, but possibly a performance problem. In any case, this commit
moves the IsPcProcessed check from 1 to 3 and so fixes this.
- After flushing blocked_threads_, we push them to active_threads_
again. While doing so, we used to mark these thread's pcs as processed.
This meant that sometimes a (fork of a) high priority thread was
cancelled by the IsPcProcessed check even though its pc was only
marked as processed by a thread with lower priority during flushing.
We need it to be the other way round: The low priority thread should
be cancelled after its pc is processed by a thread with higher
priority.
With this commit we don't MarkPcProcessed during flushing, it's
postponed to when we're actually processing. This was a correctness
issue, and there's a new corresponding test case.
Bug: v8:10765
Change-Id: Ie12682cf3f8a04222d907edd8a3ad25baa69465a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2388112
Commit-Queue: Martin Bidlingmaier <mbid@google.com>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69668}
This CL adds support for disjunctions and some quantification in
EXPERIMENTAL regexp patterns. It is implemented using a new bytecode
format and an NFA-based breadth-first interpreter.
R=jgruber@chromium.org
Bug: v8:10765
Change-Id: Idd49a3bbc9a9fcc2be80d822c9d84a638e53e777
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2370634
Commit-Queue: Martin Bidlingmaier <mbid@google.com>
Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69621}