Moves the unicode predicate cache tables out of the unicode cache,
and turns them into generic predicates in char-predicates.h which
use static constexpr tables.
This drops the per-isolate cost of unicode caches, and removes the
need for accessing the unicode cache from most files. It does remove
the mutability of the cache, which means that there may be regressions
when parsing non-ASCII identifiers. Most likely the benefits to ASCII
identifiers/keywords will outweigh any non-ASCII costs.
Change-Id: I9a7a8b7c9b22d3e9ede824ab4e27f133ce20a399
Reviewed-on: https://chromium-review.googlesource.com/c/1335564
Reviewed-by: Yang Guo <yangguo@chromium.org>
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Reviewed-by: Michael Starzinger <mstarzinger@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#57506}
Change the keyword/identifier scan to a single loop that branchlessly
collects information on whether this is a possible keyword, identifier
terminator or slow path (i.e. escapes) by looking up the value in a
flags table (as long as the character is ascii).
Also rewrites that loop as an AdvanceUntil, and sprinkles in some
V8_LIKELY magic which is 'likely' to improve things.
Change-Id: If06b0fff23630e7593b515308e5ffeca2d65daa8
Reviewed-on: https://chromium-review.googlesource.com/c/1328943
Reviewed-by: Leszek Swirski <leszeks@chromium.org>
Reviewed-by: Toon Verwaest <verwaest@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#57391}
For such a simple predicate, calling a(n inline) function that checks against
the values is faster (*) than maintaining the cache.
(*) When scanning a file that contains only comments, we're basically calling
IsLineTerminator in a loop. Parsing such files is now 7-18% faster in local
experiments.
BUG=v8:6092
Change-Id: I6a8f2aba9669a76152292f4e6c7853638d15aae3
Reviewed-on: https://chromium-review.googlesource.com/645633
Commit-Queue: Marja Hölttä <marja@chromium.org>
Reviewed-by: Adam Klein <adamk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47810}
Use ICU to check ID_Start, ID_Continue and WhiteSpace even for BMP
when V8_INTL_SUPPORT is on (which is default).
Change LineTerminator::Is() to check 4 code points from
ES#sec-line-terminators instead of using tables and Lookup function.
Remove Lowercase::Is(). It's not used anywhere.
Update webkit/{ToNumber,parseFloat}.js to have the correct expectation
for U+180E and the corresponding expected files. This is a follow-up to
an earlier change ( https://codereview.chromium.org/2720953003 ).
CQ_INCLUDE_TRYBOTS=master.tryserver.v8:v8_win_dbg,v8_mac_dbg;master.tryserver.chromium.android:android_arm64_dbg_recipe
CQ_INCLUDE_TRYBOTS=master.tryserver.v8:v8_linux_noi18n_rel_ng
BUG=v8:5370,v8:5155
TEST=unittests --gtest_filter=CharP*
TEST=webkit: ToNumber, parseFloat
TEST=test262: built-ins/Number/S9.3*, built-ins/parse{Int,Float}/S15*
TEST=test262: language/white-space/mong*
TEST=test262: built-ins/String/prototype/trim/u180e
TEST=mjsunit: whitespaces
Review-Url: https://codereview.chromium.org/2331303002
Cr-Commit-Position: refs/heads/master@{#45957}
This enables linter checking for "readability/namespace" violations
during presubmit and instead marks the few known exceptions that we
allow explicitly.
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/1371083003
Cr-Commit-Position: refs/heads/master@{#31019}
This tries to remove includes of "-inl.h" headers from normal ".h"
headers, thereby reducing the chance of any cyclic dependencies and
decreasing the average size of our compilation units.
Note that this change still leaves 7 violations of that rule in the
code. However there now is the "tools/check-inline-includes.sh" tool
detecting such violations.
R=bmeurer@chromium.org
Review URL: https://codereview.chromium.org/1283033003
Cr-Commit-Position: refs/heads/master@{#30125}
ES5.1 section 6 ("Source Text"):
"Throughout the rest of this document, the phrase “code unit” and the
word “character” will be used to refer to a 16-bit unsigned value
used to represent a single 16-bit unit of text."
This changed in ES6 draft section 10.1 ("Source Text"):
"The ECMAScript code is expressed using Unicode, version 5.1 or later.
ECMAScript source text is a sequence of code points. All Unicode code
point values from U+0000 to U+10FFFF, including surrogate code points,
may occur in source text where permitted by the ECMAScript grammars."
This patch is to reflect this spec change.
BUG=v8:3617
LOG=Y
R=jochen@chromium.org
Review URL: https://codereview.chromium.org/640193002
git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24510 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This issue was raised by Brett Wilson while reviewing my changelist for readability. Craig Silverstein (one of C++ SG maintainers) confirmed that we should declare one namespace per line. Our way of namespaces closing seems not violating style guides (there is no clear agreement on it), so I left it intact.
Review URL: http://codereview.chromium.org/115756
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2038 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- Splitting of character classes into word and non-word parts.
- A bunch of refactorings.
- Made dispatch table construction lazy.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@880 ce2b1a6d-e550-0410-aec6-3dcde31c8c00