This is a reland of commit 491de34bcc
co-authors: Ji Qiu <qiuji@iscas.ac.cn>
Alvise De Faveri Tron <elvisilde@gmail.com>
Usman Zain <uszain@gmail.com>
Zheng Quan <vitalyankh@gmail.com>
Original change's description:
> [riscv32] Add RISCV32 backend
>
> This very large changeset adds support for RISCV32.
>
> Bug: v8:13025
> Change-Id: Ieacc857131e6620f0fcfd7daa88a0f8d77056aa9
> Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3736732
> Reviewed-by: Michael Achenbach <machenbach@chromium.org>
> Commit-Queue: Yahan Lu <yahan@iscas.ac.cn>
> Reviewed-by: ji qiu <qiuji@iscas.ac.cn>
> Reviewed-by: Andreas Haas <ahaas@chromium.org>
> Reviewed-by: Hannes Payer <hpayer@chromium.org>
> Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
> Cr-Commit-Position: refs/heads/main@{#82053}
Bug: v8:13025
Change-Id: I220fae4b8e2679bdc111724e08817b079b373bd5
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3807124
Commit-Queue: Yahan Lu <yahan@iscas.ac.cn>
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: ji qiu <qiuji@iscas.ac.cn>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/main@{#82198}
This very large changeset adds support for RISCV32.
Bug: v8:13025
Change-Id: Ieacc857131e6620f0fcfd7daa88a0f8d77056aa9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3736732
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Commit-Queue: Yahan Lu <yahan@iscas.ac.cn>
Reviewed-by: ji qiu <qiuji@iscas.ac.cn>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
Cr-Commit-Position: refs/heads/main@{#82053}
There are two changes in this patch.
1. We previously added `VerifyRegExpSyntax` in regexp-parser.h to support checking regexp syntax for early errors in SpiderMonkey. Now that V8 is also emitting early errors for regexps (bug v8:896), SpiderMonkey can use the same code as V8.
2. Bug v8:11069 used a std::unordered_map as a cache for range arrays. This is currently the only place in irregexp that can call non-placement new, which SpiderMonkey has a static analysis to detect. Converting this to a ZoneUnorderedMap solves the problem for us, and seems consistent with the rest of irregexp.
Bug: v8:13108
Change-Id: Icedafd7d30fd040760cb0676a7bef8d55853bb93
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3785444
Commit-Queue: Jakob Linke <jgruber@chromium.org>
Reviewed-by: Jakob Linke <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#81988}
Extend the effect of --freeze-flags-after-init to also protect updates
of individual flags instead of only the API.
For this, we wrap each flag in a {FlagValue} class which implicitly
converts to the value of the flag. Some cases still require the explicit
{value()} accessor though. That accessor is {constexpr}, in contrast to
the implicit conversion, because otherwise clang emits a lot of warnings
about dead code within "if (FLAG...)" scopes.
R=cbruni@chromium.org
Bug: v8:12887
Change-Id: I87d3457e49ceb317d34d6a21cf09c520d4171eb5
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3683321
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Reviewed-by: Jakob Kummerow <jkummerow@chromium.org>
Reviewed-by: Patrick Thier <pthier@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Maya Lekova <mslekova@chromium.org>
Cr-Commit-Position: refs/heads/main@{#80938}
Now that we require C++17 support, we can just use the standard
static_assert without message, instead of our STATIC_ASSERT macro.
R=leszeks@chromium.org
Bug: v8:12425
Change-Id: I1d4e39c310b533bcd3a4af33d027827e6c083afe
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3647353
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Hannes Payer <hpayer@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#80524}
The no_condition / kNoCondition not only has the flaw that it's a
special case which represents an illegal / nonexisting condition, and
thus needs special handling in all methods which get a condition as
input (this check is often missing), it is also weird in that every
negative condition value must be considered a "no condition".
It turns out that this "no condition" is rarely used, and can easily be
avoided by duplicating methods, or storing a {base::Optional<Condition>}
instead (not needed anywhere yet).
This is a follow-up to https://crrev.com/c/3629553.
R=tebbi@chromium.org, pthier@chromium.org
Bug: v8:12425
Change-Id: Id2270b1660fcb0aff0a8460961b57068ed1c3c73
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3632102
Reviewed-by: Tobias Tebbi <tebbi@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Reviewed-by: Patrick Thier <pthier@chromium.org>
Cr-Commit-Position: refs/heads/main@{#80397}
And port commit 5ee6b7a701
Change-Id: Ia43d1d888154ebffcd56d436e6dfa8970eae6583
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3600174
Reviewed-by: ji qiu <qiuji@iscas.ac.cn>
Commit-Queue: Yahan Lu <yahan@iscas.ac.cn>
Auto-Submit: Yahan Lu <yahan@iscas.ac.cn>
Cr-Commit-Position: refs/heads/main@{#80132}
Callee saved registers do not include the LR anymore, so we can
now remove the last place where we pass a non-default template
argument to PushCPURegList/PopCPURegList (in the code generator).
This makes the template argument redundant, so we can remove the
template altogether.
Change-Id: I07f0c0a10840817df8a5afc1dc74330e290ce5bf
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3571816
Reviewed-by: Jakob Linke <jgruber@chromium.org>
Commit-Queue: Georgia Kouveli <georgia.kouveli@arm.com>
Cr-Commit-Position: refs/heads/main@{#79842}
Port 8a0d1b6fe5
Original Commit Message:
Modernise the RegList interface to be a proper class, rather than a
typedef to an integer, and add proper methods onto it rather than ad-hoc
bit manipulation.
In particular, this makes RegList typesafe, adding a DoubleRegList for
DoubleRegisters.
The Arm64 CPURegList isn't updated to use (or extend) the new RegList
interface, because of its weird type-erasing semantics (it can store
Registers and VRegisters). Maybe in the future we'll want to get rid of
CPURegList entirely and use RegList/DoubleRegList directly.
R=leszeks@chromium.org, joransiu@ca.ibm.com, junyan@redhat.com, midawson@redhat.com
BUG=
LOG=N
Change-Id: I997156fe4f4f2ccc40b2631d5cb752efdc8a5ad2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3525084
Reviewed-by: Junliang Yan <junyan@redhat.com>
Commit-Queue: Milad Farazmand <mfarazma@redhat.com>
Cr-Commit-Position: refs/heads/main@{#79484}
Modernise the RegList interface to be a proper class, rather than a
typedef to an integer, and add proper methods onto it rather than ad-hoc
bit manipulation.
In particular, this makes RegList typesafe, adding a DoubleRegList for
DoubleRegisters.
The Arm64 CPURegList isn't updated to use (or extend) the new RegList
interface, because of its weird type-erasing semantics (it can store
Registers and VRegisters). Maybe in the future we'll want to get rid of
CPURegList entirely and use RegList/DoubleRegList directly.
Change-Id: I3cb2a4d386cb92a4dcd2edbdd3fba9ef71f354d6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3516747
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Camillo Bruni <cbruni@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#79460}
With String contents being accessible off-main-thread or from multiple
main threads, add a SLOW_DCHECK that the hash of the string contents
inside a String::FlatContent doesn't change during its lifetime.
Bug: v8:12007
Change-Id: Iaf6bb785e44c97c13ac2fe9c5c20099bf1e0d2fd
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3451355
Reviewed-by: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Shu-yu Guo <syg@chromium.org>
Cr-Commit-Position: refs/heads/main@{#79080}
The regexp parser historically has tried to gracefully detect and bail
out from excess zone allocations, where 'excess' was determined to be
an arbitrary limit of 256MB.
This leads to issues now that the regexp parser may run from within
the JS parser - the JS parser doesn't observe this arbitrary limit and
happily keeps allocating until the underlying allocator actually runs
out of memory; this way, the JS parser can handle very large JS files,
and it's now counterproductive if the regexp parser (which reuses the
JS parser zone) bails out on excess allocations.
This CL simply removes the excess_allocation mechanism.
Bug: chromium:1264014
Change-Id: I8d93a1e52aa65bb0ea6c2aab3b68b479ce79a1f6
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3401580
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78991}
Use the FatalProcessOutOfMemory function such that tooling recognizes
these crashes as OOM's.
Drive-by: Skip one more test that leads to such stack overflows.
Fixed: v8:12555, chromium:1288456
Bug: v8:12472
Change-Id: Ib9203a4aa0487744f7cea9a212aeeffda579ae23
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3401861
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Commit-Queue: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78692}
Apply case-insensitive comparisons not only for the initial character,
but for the entire prefix. This avoids degenerate behavior for patterns
like /aaaa|AAAA|AAAA/i (i.e. generate a single 4-char prefix instead of
four 1-char prefixes).
Bug: v8:12472
Change-Id: Ib2b49fe73ca846a1b7ec90056cc64bdf5cf33026
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3398114
Reviewed-by: Patrick Thier <pthier@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78668}
Recursive ToNode node generation may overflow the stack for large
graphs. As a quick fix, insert periodic stack overflow checks in
selected ToNode methods.
As a more permanent fix, in the future we could abort gracefully
(instead of crashing on a CHECK), and/or refactor into iterative node
generation.
Bug: v8:12472
Change-Id: Ie5fbe838c5f6a5192d7d9b44bfe6f6c76a8d26e7
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3398112
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78667}
This commit allows using unaligned load/store, which is more efficient
for 2 bytes,4 bytes and 8 bytes memory access.
Use RISCV_HAS_NO_UNALIGNED to control whether enable the fast path or not.
Change-Id: I1d321e6e5fa5bc31541c8dbfe582881d80743483
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3329803
Reviewed-by: ji qiu <qiuji@iscas.ac.cn>
Commit-Queue: Yahan Lu <yahan@iscas.ac.cn>
Cr-Commit-Position: refs/heads/main@{#78427}
... in order to avoid Code <-> CodeT conversions in builtins.
This CL changes the meaning of RelocInfo::CODE_TARGET which now expects
CodeT objects as a code target.
In order to reduce code churn this CL makes BUILTIN_CODE and friends
return CodeT instead of Code. In the follow-up CLs BUILTIN_CODET and
friends will be removed.
Bug: v8:11880
Change-Id: Ib8f60973e55c60fc62ba84707471da388f8201b4
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3338483
Reviewed-by: Patrick Thier <pthier@chromium.org>
Reviewed-by: Nico Hartmann <nicohartmann@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Commit-Queue: Igor Sheludko <ishell@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78393}
When emitting code, character ranges must only specify ranges which
the actual subject string (one- or two-byte) may contain.
This was not always the case, specifically for ranges with
`from <= kMaxUint8` and `to > kMaxUint8`.
The reason this is so tricky: 1. not all parts of the pipeline know
whether we are compiling for one- or two-byte subjects; 2. for
case-insensitive regexps, an out-of-bounds CharacterRange may have an
in-bounds case equivalent (e.g. /[Ÿ]/i also matches 'ÿ' == \u{ff}),
which only gets added somewhere in the middle of the pipeline.
Our current solution is to clamp immediately before code emission. We
also keep the existing handling/dchecks of the 0x10ffff marker value
which may occur in the two-byte subject case.
Bug: v8:11069
Change-Id: Ic7b34a13a900ea2aa3df032daac9236bf5682a42
Fixed: chromium:1275096
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3306569
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#78186}
Port commit 3e3a027da1
Beside, some registers are changed to callee-saved, and the previous
related save and restore operations are removed.
Bug: v8:11382
Change-Id: Ic3161f8173771c1b7c190c77cbaf2534f52ec422
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3281673
Reviewed-by: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Commit-Queue: Zhao Jiazhong <zhaojiazhong-hf@loongson.cn>
Auto-Submit: Liu yu <liuyu@loongson.cn>
Cr-Commit-Position: refs/heads/main@{#77902}
This CL adds an Allocator to SmallVector to control how dynamic
storage is managed. The default value uses the plain old C++
std::allocator<T>, i.e. acts like malloc/free.
For use with zone memory, one can pass a ZoneAllocator as follows:
// Allocates in zone memory.
base::SmallVector<int, kInitialSize, ZoneAllocator<int>>
xs(ZoneAllocator<int>(zone));
Note: this is a follow-up to crrev.com/c/3240823.
Drive-by: hide the internal `reset` function. It doesn't free the
dynamic backing store; that's a surprise and should not be exposed to
external use.
Change-Id: I1f92f184924541e2269493fb52c30f2fdec032be
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3257711
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Igor Sheludko <ishell@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77755}
Capture group names were extended in
https://github.com/tc39/ecma262/pull/1869/fileshttps://github.com/tc39/ecma262/pull/1932/files
RegExpIdentifierName now explicitly enables unicode (+U) for
unicode escape sequences; likewise, surrogate pairs are now allowed
unconditionally.
The implementation simply switches on unicode temporarily while
parsing a capture group name.
Good news everyone, /(?<𝒜>.)/ is now a legal pattern.
Bug: v8:10384
Change-Id: Ida805998eb91ed717b2e05d81d52c1ed61104e3f
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3233234
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77722}
.. as a custom data structure with questionable value.
Also: a few drive-by refactors.
Change-Id: I74957b70c4357795dc46ef5520d58b6a78be31b2
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3240823
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77674}
Unfortunately, CharacterRanges may use 0x10ffff as a marker value
signifying 'highest possible code unit' irrespective of whether the
regexp instance has the unicode flag or not. This value makes it
through RegExpCharacterClass::ToNode unmodified (since no surrogate
desugaring takes place without /u). Correctly mask out the 0xffff
value for purposes of building our uint16_t range array.
Note: It'd be better to never introduce 0x10ffff in the first place,
but given the irregexp pipeline's lack of hackability I hesitate to
change this - we are sure to rely on it implicitly in other spots.
Drive-by: Refactors.
Fixed: chromium:1264508
Bug: v8:11069
Change-Id: Ib3c5780e91f682f1a6d15f26eb4cf03636d93c25
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3256549
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77673}
Character class handling in the irregexp pipeline is quite complex;
codepoints outside the BMP (basic multilingual plane) are only
translated into surrogate pairs when needed, e.g. when the subject
string is two-byte. If not needed, the codepoints simply stay part of
the list of CharacterRanges.
In EmitCharClass, we determine the valid subset of ranges through
ranges_length; until this CL, we forgot to pass that information on to
MakeRangeArray. Do that now by truncating the list of CharacterRanges.
Fixed: chromium:1262423
Change-Id: I5bb5b839e9935890ca2d10908ad66d72c3217178
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3240782
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77514}
Port 8bbb44e537
Port 7c08633bf6
Change-Id: Iebc3e223a0a7bc5f31ef0f21d8589e60ccdc0833
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3233695
Auto-Submit: Yahan Lu <yahan@iscas.ac.cn>
Commit-Queue: ji qiu <qiuji@iscas.ac.cn>
Reviewed-by: ji qiu <qiuji@iscas.ac.cn>
Cr-Commit-Position: refs/heads/main@{#77485}
Port 8bbb44e537
Original Commit Message:
Large character classes may easily be created when unicode
properties (e.g.: /\p{L}/u and /\P{L}/u) are used - these are
expanded internally into character classes that consist of hundreds
of character ranges. Previously to this CL, we'd emit branching code
for each of these ranges, leading to very large regexp code objects.
This CL adds a new codegen mode for large character classes (where
'large' currently means > 16 ranges). Instead of emitting branching
code inline, the ranges are written into a ByteArray and we call into
the C function IsCharacterInRangeArray for the actual branching logic.
The ByteArray is smaller than emitted code and is deduplicated if the
same character class is matched repeatedly in the same pattern.
Note this mode is *not* implemented for the interpreter, since we
currently don't have a constant pool for irregexp bytecode, and thus
cannot reference ByteArrays.
R=jgruber@chromium.org, joransiu@ca.ibm.com, junyan@redhat.com, midawson@redhat.com
BUG=
LOG=N
Change-Id: I2ded01fa2767e56e72be81b949eefb5fb85b7013
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3231981
Reviewed-by: Junliang Yan <junyan@redhat.com>
Commit-Queue: Milad Fa <mfarazma@redhat.com>
Cr-Commit-Position: refs/heads/main@{#77473}
Large character classes may easily be created when unicode
properties (e.g.: /\p{L}/u and /\P{L}/u) are used - these are
expanded internally into character classes that consist of hundreds
of character ranges. Previously to this CL, we'd emit branching code
for each of these ranges, leading to very large regexp code objects.
This CL adds a new codegen mode for large character classes (where
'large' currently means > 16 ranges). Instead of emitting branching
code inline, the ranges are written into a ByteArray and we call into
the C function IsCharacterInRangeArray for the actual branching logic.
The ByteArray is smaller than emitted code and is deduplicated if the
same character class is matched repeatedly in the same pattern.
Note this mode is *not* implemented for the interpreter, since we
currently don't have a constant pool for irregexp bytecode, and thus
cannot reference ByteArrays.
Bug: v8:11069
Change-Id: I2d728e42d85114b796c637f791848731a104cd54
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3229377
Reviewed-by: Patrick Thier <pthier@chromium.org>
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77463}
- Anonymous namespaces instead of static functions.
- Comments.
- Reserve enough space in the range ZoneList.
Change-Id: Ie79fda770974796cd590a155dc5fd504472e5bc9
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3220341
Auto-Submit: Jakob Gruber <jgruber@chromium.org>
Reviewed-by: Patrick Thier <pthier@chromium.org>
Commit-Queue: Jakob Gruber <jgruber@chromium.org>
Cr-Commit-Position: refs/heads/main@{#77391}