Commit Graph

254 Commits

Author SHA1 Message Date
Dan Elphick
44e73e0b78 Reland "[base] Move most of src/numbers into base"
This is a reland of 9701d4a420
with a small fix for some code landed in between the dry-run and
submission.

Original change's description:
> [base] Move most of src/numbers into base
>
> Moves all but conversions.*, hash-seed-inl.h and math-random.* into
> base, in preparation for moving the parts of conversions that don't
> access HeapObjects.
>
> Also moves uc16 and uc32 out of commons/globals.h into base/strings.h.
>
> Bug: v8:11917
> Change-Id: Ife359148bb0961a63833aff40d26331454b6afb6
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2979595
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Clemens Backes <clemensb@chromium.org>
> Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
> Auto-Submit: Dan Elphick <delphick@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#75354}

Bug: v8:11917
Change-Id: Ie1ec9032fe56646a7c7303185cecc70fce5694ae
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2982607
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Commit-Queue: Dan Elphick <delphick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75368}
2021-06-24 15:00:27 +00:00
Nico Hartmann
10f6151d7e Revert "[base] Move most of src/numbers into base"
This reverts commit 9701d4a420.

Reason for revert: https://ci.chromium.org/ui/p/v8/builders/ci/V8%20Mac64/40802/overview

Original change's description:
> [base] Move most of src/numbers into base
>
> Moves all but conversions.*, hash-seed-inl.h and math-random.* into
> base, in preparation for moving the parts of conversions that don't
> access HeapObjects.
>
> Also moves uc16 and uc32 out of commons/globals.h into base/strings.h.
>
> Bug: v8:11917
> Change-Id: Ife359148bb0961a63833aff40d26331454b6afb6
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2979595
> Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
> Reviewed-by: Clemens Backes <clemensb@chromium.org>
> Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
> Auto-Submit: Dan Elphick <delphick@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#75354}

Bug: v8:11917
Change-Id: Iacf796c95256016fa74f0a910c5bb1a86baa425a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2982605
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
Commit-Queue: Nico Hartmann <nicohartmann@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75356}
2021-06-24 11:14:24 +00:00
Dan Elphick
9701d4a420 [base] Move most of src/numbers into base
Moves all but conversions.*, hash-seed-inl.h and math-random.* into
base, in preparation for moving the parts of conversions that don't
access HeapObjects.

Also moves uc16 and uc32 out of commons/globals.h into base/strings.h.

Bug: v8:11917
Change-Id: Ife359148bb0961a63833aff40d26331454b6afb6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2979595
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Ross McIlroy <rmcilroy@chromium.org>
Auto-Submit: Dan Elphick <delphick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75354}
2021-06-24 11:01:23 +00:00
Dan Elphick
7f5383e8ad [base] Move utils/vector.h to base/vector.h
The adding of base:: was mostly prepared using git grep and sed:
git grep -l <pattern> | grep -v base/vector.h | \
  xargs sed -i 's/\b<pattern>\b/base::<pattern>/
with lots of manual clean-ups due to the resulting
v8::internal::base::Vectors.

#includes were fixed using:
git grep -l "src/utils/vector.h" | \
  axargs sed -i 's!src/utils/vector.h!src/base/vector.h!'

Bug: v8:11879
Change-Id: I3e6d622987fee4478089c40539724c19735bd625
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2968412
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Commit-Queue: Dan Elphick <delphick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75243}
2021-06-18 13:33:13 +00:00
Igor Sheludko
39c1f718b5 [ext-code-space] Migrate JSRegExp code fields to CodeT
Bug: v8:11880
Change-Id: Idf23521d6cb1885922f92e1050937daa2d29acd7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2968409
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Cr-Commit-Position: refs/heads/master@{#75225}
2021-06-17 17:37:01 +00:00
Brice Dobry
ffd9e82dd5 Add RISC-V backend
This very large changeset adds support for RISC-V.

Bug: v8:10991
Change-Id: Ic997c94cc12bba6881bc208e66526f423dd0679c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2571344
Commit-Queue: Brice Dobry <brice.dobry@futurewei.com>
Commit-Queue: Georg Neis <neis@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: Michael Stanton <mvstanton@chromium.org>
Cr-Commit-Position: refs/heads/master@{#72598}
2021-02-09 17:06:36 +00:00
Santiago Aboy Solanes
20876fcf98 [object] Remove FlatStringReader's vector constructor
This simplifies the logic since we can guarantee to have a
Handle<String>. The removed constructor was only used in tests.

Change-Id: I13519e474fe92892e9e8a39802d84cfab2c5b5ed
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2505711
Commit-Queue: Santiago Aboy Solanes <solanes@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70849}
2020-10-28 16:15:33 +00:00
Maya Lekova
95bb97bc02 [turbofan] Make OSR and stack slots compatible
Bug: chromium:1130844, v8:10973
Change-Id: I912f2cf6cedaf22dd50d456622880ea266b65dcd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2445509
Commit-Queue: Maya Lekova <mslekova@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70323}
2020-10-05 17:41:02 +00:00
Martin Bidlingmaier
e83511c260 [regexp] Support assertions in experimental engine
Assertions are implemented with the new ASSERTION instruction.  The nfa
interpreter evaluates the assertion based on the current context in the
subject string every time a thread executes ASSERTION.  This is
analogous to what re2 and rust/regex do.

Alternatives to this approach:
- The interpreter could calculate eagerly for all assertion types
  whether they are satisfied whenever the current input position is
  advanced.  This would make evaluating the ASSERTION instruction itself
  cheaper, but at the cost of making every advance in the input string
  more expensive.  I suspect this would be slower on average because
  assertions are not that common that we typically evaluate >= 2
  assertions at every input position.
- Assertions in a regexp could be desugared into CONSUME_RANGE
  instructions, so that no new instruction would be necessary.  For
  example, the word boundary assertion \b is satisfied at a given
  position/state if we have just consumed a word character and will
  consume a non-word character next, or vice-versa.  The tricky part
  about this is that the assertion itself should not consume input, so
  we'd have to split (automaton) states according to whether we've
  arrived at them via a word character or not.  The current compiler is
  not really equipped for this kind of transformation.  For {start,end}
  of {line,file} assertions, we'd need to introduce dummy characters
  indicating start/end of input (say, 0x10000 and 0x10001) which we feed
  to the interpreter before respectively after the actual input.
  I suspect that this approach wouldn't make much of a difference for
  NFA execution. It would likely speed up (lazy) DFA execution though
  because assertions would be dealt with in the fast path.

Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng
Bug: v8:10765
Change-Id: Ic2012c943e0ce54eb8662789fb3d4c1b6cd8d606
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2398644
Commit-Queue: Martin Bidlingmaier <mbid@google.com>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#70026}
2020-09-21 13:30:14 +00:00
Martin Bidlingmaier
46bf70a567 [regexp] Prototype new linear time EXPERIMENTAL regexp engine
This adds the new JsRegExp::Type EXPERIMENTAL, which should eventually
be implemented with the algorithm based on automata. Currently the new
engine deals with plain search strings only, i.e. regexps that do not
contain operators or escape sequences.

R=jgruber@chromium.org

Bug: v8:10765
Change-Id: I6a10d9cdf4605d219dbe7cc1989df3bfa7349ff8
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2339094
Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69442}
2020-08-18 05:51:24 +00:00
Jakob Gruber
c51041f454 [nci] Replace CompilationTarget with a new Code::Kind value
With the new Turbofan variants (NCI and Turboprop), we need a way to
distinguish between them both during and after compilation. We
initially introduced CompilationTarget to track the variant during
compilation, but decided to reuse the code kind as the canonical spot to
store this information instead.

Why? Because it is an established mechanism, already available in most
of the necessary spots (inside the pipeline, on Code objects, in
profiling traces).

This CL removes CompilationTarget and adds a new
NATIVE_CONTEXT_INDEPENDENT kind, plus helper functions to determine
various things about a given code kind (e.g.: does this code kind
deopt?).

As a (very large) drive-by, refactor both Code::Kind and
AbstractCode::Kind into a new CodeKind enum class.

Bug: v8:8888
Change-Id: Ie858b9a53311b0731630be35cf5cd108dee95b39
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2336793
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69244}
2020-08-05 12:27:22 +00:00
Igor Sheludko
921c247694 [zone] Cleanup zone allocations in src/regexp and tests
... by migrating old-style code
  MyObject* obj = new (zone) MyObject(...)

to the new style
  MyObject* obj = zone->New<MyObject>(...)

Bug: v8:10689
Change-Id: Icc60fdbf247ec05f9b5688b3d2d73d4fed06ea89
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2289770
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68784}
2020-07-10 10:30:05 +00:00
Iain Ireland
b65fcfe925 [regexp] Fix non-unicode ignore-case backreferences
https://crrev.com/c/2072858 rewrote the implementation of non-unicode
ignore-case matches to comply with the JS spec in some corner
cases. It fixed character matches and character class matches.

We missed a similar bug in the implementation of back references. This
CL fixes that bug.

The main change is in regexp-macro-assembler.cc, where
CaseInsensitiveCompareUC16 is split into CaseInsensitiveCompareUnicode
(which has the same semantics as before) and
CaseInsensitiveCompareNonUnicode (which has the semantics described
here: https://tc39.es/ecma262/#sec-runtime-semantics-canonicalize-ch).

Most of the rest of the patch undoes https://crrev.com/c/2081816 to
once again make the unicode flag available to the macroassembler, so
that we can decide which helper function to call.

The testcase is a version of test/intl/regress-10248.js, modified to
test backreferences.

Bug: v8:10573
Change-Id: I70ef7d134d37f99b1f75a5eba17020e82d59f1b9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2219284
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#68129}
2020-06-03 08:59:08 +00:00
Zhao Jiazhong
fc03e548b0 [regexp] Loosen limit in UnicodePropertyEscapeCodeSize test
The UnicodePropertyEscapeCodeSize test set the max code size as 150KB,
which is too strict for mips64. This CL loosen the limit to 200KB.

Bug: v8:10441
Change-Id: I8532d4d51eedd7713075d86e84c52a58d2412861
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2172927
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Cr-Commit-Position: refs/heads/master@{#67489}
2020-04-30 08:24:14 +00:00
Jakob Gruber
10842cad3c Reland "[regexp] Limit the size of inlined choice nodes"
This is a reland of 6a0e7224f3

Original change's description:
> [regexp] Limit the size of inlined choice nodes
>
> Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge
> code objects. This effect can be further magnified through inlining,
> leading to exponential code growth in the size of the pattern.
>
> This CL is a (fairly hacky) way to avoid exponential growth. We
> recognize choice nodes with 'many' choices and disable inlining for
> them. In the future we should fix this properly, either by using the
> code size budget correctly, or by improving codegen for property
> escapes.
>
> Bug: v8:10441
> Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67433}

Tbr: yangguo@chromium.org
Bug: v8:10441
Change-Id: I9a16cc9e8248cb46d3d16a4e2d250968cc1b7b39
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2172679
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67462}
2020-04-29 07:18:11 +00:00
Clemens Backes
f8b23009bf Revert "[regexp] Limit the size of inlined choice nodes"
This reverts commit 6a0e7224f3.

Reason for revert: Fails noi18n: https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20noi18n%20-%20debug/31513

Original change's description:
> [regexp] Limit the size of inlined choice nodes
> 
> Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge
> code objects. This effect can be further magnified through inlining,
> leading to exponential code growth in the size of the pattern.
> 
> This CL is a (fairly hacky) way to avoid exponential growth. We
> recognize choice nodes with 'many' choices and disable inlining for
> them. In the future we should fix this properly, either by using the
> code size budget correctly, or by improving codegen for property
> escapes.
> 
> Bug: v8:10441
> Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#67433}

TBR=yangguo@chromium.org,jgruber@chromium.org

Change-Id: I503b8b2be539468d86e4ec1ac13074cd1c06a5cb
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:10441
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2169101
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67436}
2020-04-28 13:35:20 +00:00
Jakob Gruber
6a0e7224f3 [regexp] Limit the size of inlined choice nodes
Codegen for unicode property escapes (e.g.: /\p{L}/u) can produce huge
code objects. This effect can be further magnified through inlining,
leading to exponential code growth in the size of the pattern.

This CL is a (fairly hacky) way to avoid exponential growth. We
recognize choice nodes with 'many' choices and disable inlining for
them. In the future we should fix this properly, either by using the
code size budget correctly, or by improving codegen for property
escapes.

Bug: v8:10441
Change-Id: I817f145251ec8b1b9906cc735c9e9bdb004c98ed
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2170229
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67433}
2020-04-28 12:55:48 +00:00
Jakob Gruber
fe60913945 [regexp] Consistent expectations for output registers
... between the interpreter and generated code.

Prior to this CL, pre- and post conditions on the output register
array differed between the interpreter and generated code.

Interpreter
Pre: `output` fits captures and temporary registers.
Post: None.

Generated code
Pre:  `output` fits capture registers.
Post: `output` is modified if and only if the match succeeded.

This CL changes the interpreter to match generated code pre- and
post conditions by allocating space for temporary registers inside
the interpreter.

Drive-by: Add MaxRegisterCount, RegistersForCaptureCount helpers.

Bug: chromium:1067270
Change-Id: I2900ef2f31207d817ec7ead3e0e2215b23b398f0
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2135642
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#67268}
2020-04-21 10:13:03 +00:00
Iain Ireland
560f2d8bb3 Reland "[regexp] Rewrite error handling"
This is a reland of e80ca24c80

Original change's description:
> [regexp] Rewrite error handling
>
> This patch modifies irregexp's error handling. Instead of representing
> errors as C strings, they are represented as an enumeration value
> (RegExpError), and only converted to strings when throwing the error
> object in regexp.cc. This makes it significantly easier to integrate
> into SpiderMonkey. A few notes:
>
> 1. Depending on whether the stack overflows during parsing or
>    analysis, the stack overflow message can vary ("Stack overflow" or
>    "Maximum call stack size exceeded"). I kept that behaviour in this
>    patch, under the assumption that stack overflow messages are
>    (sadly) the sorts of things that real world code ends up depending
>    on.
>
> 2. Depending on the point in code where the error was identified,
>    invalid unicode escapes could be reported as "Invalid Unicode
>    escape", "Invalid unicode escape", or "Invalid Unicode escape
>    sequence". I fervently hope that nobody depends on the specific
>    wording of a syntax error, so I standardized on the first one. (It
>    was both the most common, and the most consistent with other
>    "Invalid X escape" messages.)
>
> 3. In addition to changing the representation, this patch also adds an
>    error_pos field to RegExpParser and RegExpCompileData, which stores
>    the position at which an error occurred. This is used by
>    SpiderMonkey to provide more helpful messages about where a syntax
>    error occurred in large regular expressions.
>
> 4. This model is closer to V8's existing MessageTemplate
>    infrastructure. I considered trying to integrate it more closely
>    with MessageTemplate, but since one of our stated goals for this
>    project was to make it easier to use irregexp outside of V8, I
>    decided to hold off.
>
> R=jgruber@chromium.org
>
> Bug: v8:10303
> Change-Id: I62605fd2def2fc539f38a7e0eefa04d36e14bbde
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2091863
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#66784}

R=jgruber@chromium.org

Bug: v8:10303
Change-Id: Iad1f11a0e0b9e525d7499aacb56c27eff9e7c7b5
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2109952
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66798}
2020-03-19 16:59:43 +00:00
Leszek Swirski
2193f691da Revert "[regexp] Rewrite error handling"
This reverts commit e80ca24c80.

Reason for revert: Causes failures in the fast/regex/non-pattern-characters.html Blink web test (https://ci.chromium.org/p/v8/builders/ci/V8%20Blink%20Linux/3679)

Original change's description:
> [regexp] Rewrite error handling
> 
> This patch modifies irregexp's error handling. Instead of representing
> errors as C strings, they are represented as an enumeration value
> (RegExpError), and only converted to strings when throwing the error
> object in regexp.cc. This makes it significantly easier to integrate
> into SpiderMonkey. A few notes:
> 
> 1. Depending on whether the stack overflows during parsing or
>    analysis, the stack overflow message can vary ("Stack overflow" or
>    "Maximum call stack size exceeded"). I kept that behaviour in this
>    patch, under the assumption that stack overflow messages are
>    (sadly) the sorts of things that real world code ends up depending
>    on.
> 
> 2. Depending on the point in code where the error was identified,
>    invalid unicode escapes could be reported as "Invalid Unicode
>    escape", "Invalid unicode escape", or "Invalid Unicode escape
>    sequence". I fervently hope that nobody depends on the specific
>    wording of a syntax error, so I standardized on the first one. (It
>    was both the most common, and the most consistent with other
>    "Invalid X escape" messages.)
> 
> 3. In addition to changing the representation, this patch also adds an
>    error_pos field to RegExpParser and RegExpCompileData, which stores
>    the position at which an error occurred. This is used by
>    SpiderMonkey to provide more helpful messages about where a syntax
>    error occurred in large regular expressions.
> 
> 4. This model is closer to V8's existing MessageTemplate
>    infrastructure. I considered trying to integrate it more closely
>    with MessageTemplate, but since one of our stated goals for this
>    project was to make it easier to use irregexp outside of V8, I
>    decided to hold off.
> 
> R=​jgruber@chromium.org
> 
> Bug: v8:10303
> Change-Id: I62605fd2def2fc539f38a7e0eefa04d36e14bbde
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2091863
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#66784}

TBR=jgruber@chromium.org,iireland@mozilla.com

Change-Id: I9247635f3c5b17c943b9c4abaf82ebe7b2de165e
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:10303
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2108550
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66786}
2020-03-19 09:58:12 +00:00
Iain Ireland
e80ca24c80 [regexp] Rewrite error handling
This patch modifies irregexp's error handling. Instead of representing
errors as C strings, they are represented as an enumeration value
(RegExpError), and only converted to strings when throwing the error
object in regexp.cc. This makes it significantly easier to integrate
into SpiderMonkey. A few notes:

1. Depending on whether the stack overflows during parsing or
   analysis, the stack overflow message can vary ("Stack overflow" or
   "Maximum call stack size exceeded"). I kept that behaviour in this
   patch, under the assumption that stack overflow messages are
   (sadly) the sorts of things that real world code ends up depending
   on.

2. Depending on the point in code where the error was identified,
   invalid unicode escapes could be reported as "Invalid Unicode
   escape", "Invalid unicode escape", or "Invalid Unicode escape
   sequence". I fervently hope that nobody depends on the specific
   wording of a syntax error, so I standardized on the first one. (It
   was both the most common, and the most consistent with other
   "Invalid X escape" messages.)

3. In addition to changing the representation, this patch also adds an
   error_pos field to RegExpParser and RegExpCompileData, which stores
   the position at which an error occurred. This is used by
   SpiderMonkey to provide more helpful messages about where a syntax
   error occurred in large regular expressions.

4. This model is closer to V8's existing MessageTemplate
   infrastructure. I considered trying to integrate it more closely
   with MessageTemplate, but since one of our stated goals for this
   project was to make it easier to use irregexp outside of V8, I
   decided to hold off.

R=jgruber@chromium.org

Bug: v8:10303
Change-Id: I62605fd2def2fc539f38a7e0eefa04d36e14bbde
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2091863
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66784}
2020-03-19 08:51:32 +00:00
Georgia Kouveli
ea82d0311b [arm64] Use BTI instructions for forward CFI
Generate a BTI instruction at each target of an indirect branch
(BR/BLR). An indirect branch that doesn't jump to a BTI instruction
will generate an exception on a BTI-enabled core. On cores that do
not support the BTI extension, the BTI instruction is a NOP.

Targets of indirect branch instructions include, among other things,
function entrypoints, exception handlers and jump tables. Lazy deopt
exits can potentially be reached through an indirect branch when an
exception is thrown, so they also get an additional BTI instruction.

Bug: v8:10026
Change-Id: I0ebf51071f1b604f60f524096e013dfd64fcd7ff
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1967315
Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
Reviewed-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Georg Neis <neis@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66751}
2020-03-17 17:52:28 +00:00
Jakob Gruber
d303f4fba9 [regexp] Always pass the isolate to CaseInsensitiveCompareUC16
In the past we've used the isolate argument to signal whether we were
in unicode mode (nullptr) or not (the real isolate). This is no longer
needed, and in fact breaks no-i18n mode which always expects to have a
real isolate.

Bug: v8:10120
Change-Id: I2f848c4ff8c2ff0e9b84278cbcdf3c3670e44e58
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2081816
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66520}
2020-03-02 10:46:15 +00:00
Wouter Vermeiren
8199a7ac23 [ppc64][ppc] Split up ARCH_PPC and ARCH_PPC64
After support for ARCH_PPC was dropped, it became a subset of
ARCH_PPC64. If you compile for ppc64, then you set the ARCH_PPC64
define which also sets the ARCH_PPC define.
To be able to again support ppc (32 bit) those defines should be
split up again.

This commit only splits up the defines but does not introduce a
working ARCH_PPC variant.

Bug: v8:10102
Change-Id: I64e0749f8e5a7dc078ee7890d92e57b82706a849
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1989826
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: Milad Farazmand <miladfar@ca.ibm.com>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#66390}
2020-02-21 15:42:20 +00:00
Jakob Gruber
48756fcf74 [regexp] Add a backtracking limit in the interpreter
V8 uses a backtracking regexp engine, which has the caveat that some
regexp patterns can have exponential runtime behavior when excessive
backtracking is involved.

Especially when regexp patterns are user-controlled, it would be useful
to be able to set an upper limit for a single regexp execution. This CL
takes an initial step in that direction by adding a backtracking limit
(intended to approximate execution time):

- The limit is stored in the JSRegExp's data array.
- A limit can currently only be set through the %NewRegExpWithLimit
runtime function.
- The limit is applied during interpreter execution. When exceeded, the
interpreter stops execution and returns FAILURE (even if continued
execution would at some later point have resulted in SUCCESS).

In follow-up CLs, this mechanism will be extended to work in jitted
regexp code, and exposed through the V8 API.

Bug: v8:9695
Change-Id: Iadb5c100052f4a63b26f1ec49cf97c6713a66b9b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1864934
Commit-Queue: Ulan Degenbaev <ulan@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64417}
2019-10-21 12:48:15 +00:00
Mathias Bynens
65940f4369 [regexp] Remove UseCounter for matchAll with non-g RegExp
We've gathered sufficient data, so the use counter can now be removed
again.

The use counter was originally added here:

- V8 CL: https://chromium-review.googlesource.com/c/v8/v8/+/1718145
- Chromium CL: https://chromium-review.googlesource.com/c/chromium/src/+/1718367

The Chromium plumbing was removed here:
https://chromium-review.googlesource.com/c/chromium/src/+/1839851

BUG=v8:9551

Change-Id: I829a0fe34d9ebade1403cb4d1c0b9c997f125074
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1844774
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64124}
2019-10-07 09:04:23 +00:00
Jakob Gruber
282a74c7f0 Reland "[regexp] Bytecode peephole optimization"
This is a reland of 6612943010

Fixed: Unaligned reads, unspecified evaluation order.

Original change's description:
> [regexp] Bytecode peephole optimization
>
> Bytecodes used by the regular expression interpreter often occur in
> specific sequences. The number of dispatches in the interpreter can be
> reduced if those sequences are combined into a single bytecode.
>
> This CL adds a peephole optimization pass for regexp bytecodes.
> This pass checks the generated bytecode for pre-defined sequences that
> can be merged into a single bytecode.
>
> With the currently implemented bytecode sequences a speedup of 1.12x on
> regex-dna and octane-regexp is achieved.
>
> Bug: v8:9330
> Change-Id: I827f93273a5848e5963c7e3329daeb898995d151
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1813743
> Commit-Queue: Patrick Thier <pthier@google.com>
> Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#63992}

Cq-Include-Trybots: luci.v8.try:v8_linux64_ubsan_rel_ng
Cq-Include-Trybots: luci.v8.try:v8_linux_gcc_rel
Bug: v8:9330,chromium:1008502,chromium:1008631
Change-Id: Ib9fc395b6809aa1debdb54d9fba5b7f09a235e5b
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1828917
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#64064}
2019-10-01 12:50:24 +00:00
Clemens Backes [né Hammacher]
05eda1acc2 Revert "[regexp] Bytecode peephole optimization"
This reverts commit 6612943010.

Reason for revert: Fails on gcc: https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20gcc/3394

Original change's description:
> [regexp] Bytecode peephole optimization
> 
> Bytecodes used by the regular expression interpreter often occur in
> specific sequences. The number of dispatches in the interpreter can be
> reduced if those sequences are combined into a single bytecode.
> 
> This CL adds a peephole optimization pass for regexp bytecodes.
> This pass checks the generated bytecode for pre-defined sequences that
> can be merged into a single bytecode.
> 
> With the currently implemented bytecode sequences a speedup of 1.12x on
> regex-dna and octane-regexp is achieved.
> 
> Bug: v8:9330
> Change-Id: I827f93273a5848e5963c7e3329daeb898995d151
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1813743
> Commit-Queue: Patrick Thier <pthier@google.com>
> Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#63992}

TBR=jgruber@chromium.org,petermarshall@chromium.org,pthier@google.com

Change-Id: Ie526fe3691f6abdd16b51979000fdafb7afce8ef
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9330
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1826727
Reviewed-by: Clemens Backes [né Hammacher] <clemensb@chromium.org>
Commit-Queue: Clemens Backes [né Hammacher] <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63998}
2019-09-26 15:57:02 +00:00
Patrick Thier
6612943010 [regexp] Bytecode peephole optimization
Bytecodes used by the regular expression interpreter often occur in
specific sequences. The number of dispatches in the interpreter can be
reduced if those sequences are combined into a single bytecode.

This CL adds a peephole optimization pass for regexp bytecodes.
This pass checks the generated bytecode for pre-defined sequences that
can be merged into a single bytecode.

With the currently implemented bytecode sequences a speedup of 1.12x on
regex-dna and octane-regexp is achieved.

Bug: v8:9330
Change-Id: I827f93273a5848e5963c7e3329daeb898995d151
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1813743
Commit-Queue: Patrick Thier <pthier@google.com>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63992}
2019-09-26 14:57:37 +00:00
Patrick Thier
213504b9d7 [regexp] Consolidate calls to jitted irregexp and regexp interpreter
The code fields in a JSRegExp object now either contain irregexp
compiled code or a trampoline to the interpreter. This way the code
can be executed without explicitly checking if the regexp shall be
interpreted or executed natively.
In case of interpreted regexp the generated bytecode is now stored in
its own fields instead of the code fields for Latin1 and UC16
respectively.
The signatures of the jitted irregexp match and the regexp interpreter
have been equalized.

Bug: v8:9516
Change-Id: I30e3d86f4702a902d3387bccc1ee91dea501fe4e
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1762513
Commit-Queue: Patrick Thier <pthier@google.com>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#63457}
2019-08-29 15:19:58 +00:00
Ana Peško
71acda2252 [regexp] Naive tiering-up
This CL implements a naive tiering-up strategy where the interpreter
is used for the first execution for every regex, and the compiler is
used for every execution after that. The only exception is if a
global replace is being executed on a regex, we eagerly tier-up to
native code right away.

To use the tier-up logic --regexp-tier-up needs to be set. It is
currently disabled by default.

Bug v8:9566

Change-Id: Ib64ed77cbfcde10411161c0541dfa2501a0a93bd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1710661
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Commit-Queue: Ana Pesko <anapesko@google.com>
Cr-Commit-Position: refs/heads/master@{#63150}
2019-08-12 08:41:48 +00:00
Mathias Bynens
dd7190a979 [regexp] Add UseCounter for matchAll with non-g RegExp
Per the July TC39 meeting consensus, we'd like to make the
upcoming String.prototype.replaceAll proposal throw for
non-global RegExp searchValues. However,
String.prototype.matchAll currently does not throw in this
case, causing consistency concerns.

This patch adds a use counter for String.prototype.matchAll
with a non-global RegExp as the searchValue. Hopefully, this
pattern isn't too common in real-world code today, in which case
we can both a) change matchAll and b) proceed with the desired
replaceAll semantics.

https://github.com/tc39/proposal-string-replaceall/issues/16

V8 CL: https://chromium-review.googlesource.com/c/v8/v8/+/1718145
Chromium CL: https://chromium-review.googlesource.com/c/chromium/src/+/1718367

BUG=v8:9551

Change-Id: Ica660a0a6189d84c3d33398c98305d0bcb9f8c23
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1718145
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62913}
2019-07-25 12:53:02 +00:00
Patrick Thier
3a0f407d26 Reland "Reland "[regexp] Call the regexp interpreter without CEntry overhead""
This is a reland of c2ee4a7999

Original change's description:
> Reland "[regexp] Call the regexp interpreter without CEntry overhead"
> 
> This is a reland of d4d28b73cb
> 
> Original change's description:
> > [regexp] Call the regexp interpreter without CEntry overhead
> > 
> > Previously all RegExp calls went through Runtime_RegExpExec when --regexp-interpret-all was set.
> > 
> > This CL avoids the runtime overhead by calling into the interpreter directly from the RegExpExec Builtin when the regular expression subject was already compiled to ByteCode (i.e. after the first call).
> > 
> > Bug: v8:8954
> > Change-Id: Iae9dfcef3370b772a05b2942305335d592f6f15a
> > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1698391
> > Commit-Queue: Patrick Thier <pthier@google.com>
> > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> > Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#62753}
> 
> Bug: v8:8954
> Change-Id: I1f0b6de9c6da65bcb582ddb41a37419116a5c510
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1706053
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Commit-Queue: Patrick Thier <pthier@google.com>
> Cr-Commit-Position: refs/heads/master@{#62794}

Bug: v8:8954
Change-Id: Ice77c05240f1fabd36bf97b8e789dd4c25a9718f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1715451
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62904}
2019-07-24 17:20:15 +00:00
Sathya Gunasekaran
aa478cac4f Revert "Reland "[regexp] Call the regexp interpreter without CEntry overhead""
This reverts commit c2ee4a7999.

Reason for revert: webgl_conformance_tests deqp/data/gles2/shaders/conversions.html crashes on Android FYI Release (Nexus 9)
See https://bugs.chromium.org/p/chromium/issues/detail?id=985624

Original change's description:
> Reland "[regexp] Call the regexp interpreter without CEntry overhead"
>
> This is a reland of d4d28b73cb
>
> Original change's description:
> > [regexp] Call the regexp interpreter without CEntry overhead
> >
> > Previously all RegExp calls went through Runtime_RegExpExec when --regexp-interpret-all was set.
> >
> > This CL avoids the runtime overhead by calling into the interpreter directly from the RegExpExec Builtin when the regular expression subject was already compiled to ByteCode (i.e. after the first call).
> >
> > Bug: v8:8954
> > Change-Id: Iae9dfcef3370b772a05b2942305335d592f6f15a
> > Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1698391
> > Commit-Queue: Patrick Thier <pthier@google.com>
> > Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> > Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#62753}
>
> Bug: v8:8954
> Change-Id: I1f0b6de9c6da65bcb582ddb41a37419116a5c510
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1706053
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Commit-Queue: Patrick Thier <pthier@google.com>
> Cr-Commit-Position: refs/heads/master@{#62794}

TBR=jgruber@chromium.org,petermarshall@chromium.org,pthier@google.com

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: v8:8954, chromium:985624
Change-Id: I5bc2c397a09979f42f28670f80a5366f2a33d80f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1709411
Commit-Queue: Sathya Gunasekaran <gsathya@chromium.org>
Reviewed-by: Sathya Gunasekaran <gsathya@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62824}
2019-07-19 10:41:59 +00:00
Patrick Thier
c2ee4a7999 Reland "[regexp] Call the regexp interpreter without CEntry overhead"
This is a reland of d4d28b73cb

Original change's description:
> [regexp] Call the regexp interpreter without CEntry overhead
> 
> Previously all RegExp calls went through Runtime_RegExpExec when --regexp-interpret-all was set.
> 
> This CL avoids the runtime overhead by calling into the interpreter directly from the RegExpExec Builtin when the regular expression subject was already compiled to ByteCode (i.e. after the first call).
> 
> Bug: v8:8954
> Change-Id: Iae9dfcef3370b772a05b2942305335d592f6f15a
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1698391
> Commit-Queue: Patrick Thier <pthier@google.com>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#62753}

Bug: v8:8954
Change-Id: I1f0b6de9c6da65bcb582ddb41a37419116a5c510
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1706053
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Patrick Thier <pthier@google.com>
Cr-Commit-Position: refs/heads/master@{#62794}
2019-07-18 07:23:14 +00:00
Sathya Gunasekaran
95d4df3f16 Revert "[regexp] Call the regexp interpreter without CEntry overhead"
This reverts commit d4d28b73cb.

Reason for revert: breaks TSAN bot:
https://ci.chromium.org/p/v8/builders/ci/V8%20Linux64%20TSAN%20-%20concurrent%20marking/9526

Original change's description:
> [regexp] Call the regexp interpreter without CEntry overhead
> 
> Previously all RegExp calls went through Runtime_RegExpExec when --regexp-interpret-all was set.
> 
> This CL avoids the runtime overhead by calling into the interpreter directly from the RegExpExec Builtin when the regular expression subject was already compiled to ByteCode (i.e. after the first call).
> 
> Bug: v8:8954
> Change-Id: Iae9dfcef3370b772a05b2942305335d592f6f15a
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1698391
> Commit-Queue: Patrick Thier <pthier@google.com>
> Reviewed-by: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Peter Marshall <petermarshall@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#62753}

TBR=jgruber@chromium.org,petermarshall@chromium.org,pthier@google.com

Change-Id: I3257220c4359a3b801dd80e0eff6c4534d8badee
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:8954
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1706050
Reviewed-by: Sathya Gunasekaran <gsathya@chromium.org>
Commit-Queue: Sathya Gunasekaran <gsathya@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62757}
2019-07-17 08:23:48 +00:00
Patrick Thier
d4d28b73cb [regexp] Call the regexp interpreter without CEntry overhead
Previously all RegExp calls went through Runtime_RegExpExec when --regexp-interpret-all was set.

This CL avoids the runtime overhead by calling into the interpreter directly from the RegExpExec Builtin when the regular expression subject was already compiled to ByteCode (i.e. after the first call).

Bug: v8:8954
Change-Id: Iae9dfcef3370b772a05b2942305335d592f6f15a
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1698391
Commit-Queue: Patrick Thier <pthier@google.com>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62753}
2019-07-17 06:44:31 +00:00
Peter Marshall
6b2b60cb02 [cleanup] Rename RegExpMacroAssemblerIrregexp to RegExpBytecodeGenerator
This makes it clearer what this class does, and is more consistent with
the terminology used by ignition (BytecodeGenerator).

Change-Id: I9085f29f437cf15605a5ae971b1fc72d6c79feaa
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1692923
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62612}
2019-07-10 08:01:10 +00:00
Jakob Gruber
3663e83424 [regexp] Remove unused DispatchTable and ZoneSplayTree
Bug: v8:9359
Change-Id: I237f16324ff036f2cbfb7ca97b4ac208442b06cc
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1664056
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Sigurd Schneider <sigurds@chromium.org>
Reviewed-by: Sigurd Schneider <sigurds@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62268}
2019-06-19 07:19:38 +00:00
Jakob Gruber
83da1c2d4c [regexp] Simplify UnicodeRangeSplitter
This class used to be based on DispatchTable, which itself uses an
interval tree to both categorize and canonicalize ranges
(i.e. such that no overlap and all immediately adjacent ranges are
merged). The produced ranges were then entered into lists for
{bmp,lead_surrogate,trail_surrogate,non_bmp} splits.

With this CL, we simplify to a plain loop over all character range
kinds instead. The dispatch table (and ZoneSplayList, perhaps
SplayList) can be removed in follow-ups.

Bug: v8:9359
Change-Id: I9c6b72f3bc44d1557af7c74419709ae5662611f1
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1664053
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62260}
2019-06-18 20:18:18 +00:00
Jakob Gruber
6155b33a9c [regexp] Remove dead DispatchTableConstructor
Bug: v8:9359
Change-Id: I1b490c928ed884f4ad33e005699f98614be75233
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1662306
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62242}
2019-06-18 12:40:50 +00:00
Jakob Gruber
a8c62102e1 [regexp] Further narrow public API and restrict includes to regexp.h
This CL renames jsregexp.{h,cc} to regexp.{h,cc}, hides all non-public
functions of RegExpImpl in the .cc file, and renames the public parts
of RegExpImpl to just RegExp. Include directives from outside the
src/regexp directory are limited to regexp.h, regexp-stack.h, and
regexp-utils.h. We also expose all result codes that can be returned
by irregexp code (including RETRY) on the public header since they
are needed elsewhere, e.g. in builtins.

Bug: v8:9359
Change-Id: Iae1a01ac9f6e1e4dc168f3fbe8fe8679cb6b1259
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1662297
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Ulan Degenbaev <ulan@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62240}
2019-06-18 12:23:16 +00:00
Jakob Gruber
c7d57dd309 [regexp] Reduce public API surface
This further reduces the number of things declared in the public
regexp API file, currently still named jsregexp.h.

* Move JSRegExp::Flags convenience functions to regexp-compiler.h.
* Set RegExpImpl methods private if possible (these will later be
  moved to a new hidden impl class).
* Merge RegExpEngine::CompilationResult into RegExpCompileData.
* Move remaining RegExpEngine methods to RegExpImpl and delete
  RegExpEngine.
* Extract RegExpGlobalCache.
* Document a few data structures.

Upcoming CLs will rename RegExpImpl to RegExp and jsregexp.h to
regexp.h. This should then be the only header included from other
directories.

Bug: v8:9359
Change-Id: I78c8f4cca495a2b95735a48b6181583bc3310bdf
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1662294
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62218}
2019-06-17 15:10:24 +00:00
Jakob Gruber
c51e4f3c66 [regexp] Rewrite certain Assertion sequences
RegExp assertions (e.g.: '^', '$', '\b', ...) sequences have certain
properties that this rewriter exploits:

1. They are zero-width and order-independent, thus one can remove all
duplicate assertions.
2. If a subsequence is guaranteed to fail, the entire sequence fails.
Any sequence always known to fail (e.g. containing both '\b' and '\B')
can be rewritten to a single node that triggers failure.

This CL generalizes the previous optimization for repeated assertions
to be order-independent, i.e. assertions only have to be in the same
sequence but not next to each other.

Bug: v8:6515, v8:6126
Change-Id: I3f92f081ce8a55ad8c34c269a09a6686e3b008f3
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1657925
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62201}
2019-06-17 09:21:58 +00:00
Jakob Gruber
d61a558a23 Reland "[regexp] Move AST-to-Node code to a dedicated file"
This is a reland of 811bfbbc56

Original change's description:
> [regexp] Move AST-to-Node code to a dedicated file
>
> Prior to this CL, jsregexp contains a bunch of things that are slightly
> related but would be cleaner in separate files, including: AST-to-Node
> transformations, the compiler implementation, and a debugging printer.
>
> This CL extracts AST-to-Node transformations.
>
> Bug: v8:9359
> Change-Id: I030cfca5c40cfd72e3a7abe2188e4654cfe2277c
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1655303
> Auto-Submit: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#62148}

Tbr: yangguo@chromium.org
Bug: v8:9359
Change-Id: I68a16086dc56c9a059547033ca8bc1e9de1080db
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1658568
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62154}
2019-06-13 16:39:56 +00:00
Leszek Swirski
ee279dc223 Revert "[regexp] Move AST-to-Node code to a dedicated file"
This reverts commit 811bfbbc56.

Reason for revert: Breaks noi18n build (https://ci.chromium.org/p/v8/builders/ci/V8%20Linux%20-%20noi18n%20-%20debug/27201)

Original change's description:
> [regexp] Move AST-to-Node code to a dedicated file
> 
> Prior to this CL, jsregexp contains a bunch of things that are slightly
> related but would be cleaner in separate files, including: AST-to-Node
> transformations, the compiler implementation, and a debugging printer.
> 
> This CL extracts AST-to-Node transformations.
> 
> Bug: v8:9359
> Change-Id: I030cfca5c40cfd72e3a7abe2188e4654cfe2277c
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1655303
> Auto-Submit: Jakob Gruber <jgruber@chromium.org>
> Reviewed-by: Yang Guo <yangguo@chromium.org>
> Commit-Queue: Jakob Gruber <jgruber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#62148}

TBR=yangguo@chromium.org,jgruber@chromium.org,petermarshall@chromium.org

Change-Id: I079e15b02d73d81aef806992f324f08d7008e367
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:9359
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1658160
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62149}
2019-06-13 15:05:01 +00:00
Jakob Gruber
811bfbbc56 [regexp] Move AST-to-Node code to a dedicated file
Prior to this CL, jsregexp contains a bunch of things that are slightly
related but would be cleaner in separate files, including: AST-to-Node
transformations, the compiler implementation, and a debugging printer.

This CL extracts AST-to-Node transformations.

Bug: v8:9359
Change-Id: I030cfca5c40cfd72e3a7abe2188e4654cfe2277c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1655303
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Yang Guo <yangguo@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62148}
2019-06-13 14:40:08 +00:00
Jakob Gruber
b0899cf8ab [regexp] Add wrapper header for arch-specific files
This adds regexp-macro-assembler-arch.h which contains the arch-specific
include dispatch.

Change-Id: Ibc2be8059d54b57afeed9b7ce244229ce1bd79bc
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1655296
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62118}
2019-06-12 14:17:13 +00:00
Jakob Gruber
89ad50be1f [regexp] Rename interpreter files
bytecodes-irregexp.h -> regexp-bytecodes.h
interpreter-irregexp.{cc,h} -> regexp-interpreter.{cc,h}

Change-Id: I98ca9d5c3264ad0adbd280b93082aa3e01b45b67
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1655294
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#62113}
2019-06-12 11:57:58 +00:00
Clemens Hammacher
a335f2aeed [cleanup] Replace simple typedefs by using
This replaces all typedefs that define types and not functions by the
equivalent "using" declaration.

This was done mostly automatically using this command:
ag -l '\btypedef\b' src test | xargs -L1 \
     perl -i -p0e 's/typedef ([^*;{}]+) (\w+);/using \2 = \1;/sg'

Patchset 2 then adds some manual changes for typedefs for pointer types,
where the regular expression did not match.

R=mstarzinger@chromium.org
TBR=yangguo@chromium.org, jarin@chromium.org

Bug: v8:9183
Change-Id: I6f6ee28d1793b7ac34a58f980b94babc21874b78
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1631409
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Cr-Commit-Position: refs/heads/master@{#61849}
2019-05-27 12:39:49 +00:00