icu-case-mapping was shipped a few months ago. By dropping
the flag, unibrow's case conversion code won't be included
by default because V8_INTL_SUPPORT is on by default.
BUG=v8:4477, v8:4476
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*,
mjsunit/string-case, intl/general/case*
Cq-Include-Trybots: master.tryserver.v8:v8_linux_noi18n_rel_ng
Change-Id: I78be9cc64b4588bc5af79ecbbadf93af6e84a1df
Reviewed-on: https://chromium-review.googlesource.com/534541
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
Reviewed-by: Daniel Ehrenberg <littledan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#46304}
1.
DCHECK in runtime-i18n.cc for case mapping was wrong to
assume that the longest primary language tag is 3 characters.
BCP 47 actually allows up to 8 characters.
2. GetFlatContent() was called to a string without flattening it first.
BUG=680314,680464
TEST=intl/general/case-mapping (see also the bugs)
Review-Url: https://codereview.chromium.org/2629763003
Cr-Commit-Position: refs/heads/master@{#42343}
Use FastAsciiConvert (as used by Unibrow) for i18n-aware
case conversion with --icu_case_mapping.
Move FastAsciiConvert to src/string-case.cc so that it can be used
by both runtime-{string,i18n}.
Add more tests.
BUG=v8:4477,v8:4476
TEST=intl/general/case*
Review-Url: https://codereview.chromium.org/2533983006
Cr-Commit-Position: refs/heads/master@{#41821}
Due to a typo in runtime-i18n.js, 'ç'(U+00E7) was not uppercased while
'÷'(U+00F7) was incorrectly uppercased to '×'(U+00D7).
Add a comprehensive test for Latin-1 supplemental block (U+00A0 ~ U+00FF).
(they're special-cased for speed-up and needs to have a test for the range.).
TEST=intl/general/case-mapping
BUG=v8:5681
Review-Url: https://codereview.chromium.org/2533033003
Cr-Commit-Position: refs/heads/master@{#41331}
ICU now supports uppercasing in Greek via its regular uppercasing API.
So, there's no need to use a slow transliteration API for uppercasing
in Greek.
This CL includes rolling ICU to ICU 58.1.
Besides, drop intl402/Intl/getCanonicalLocales/weird-cases from
test262.status because it passes now with ICU 58.1.
BUG=chromium:637001,v8:5012
Review-Url: https://codereview.chromium.org/2491333003
Cr-Commit-Position: refs/heads/master@{#41009}
Throw 'Range Error: invalid string length' when the result of
case mapping is longer than the max string length (kMaxLength in
objects.h = 1 << 28 - 16).
This is for case mapping with ICU.
BUG=v8:5271
TEST=intl/general/case-mapping.js with --icu_case_mapping
Review-Url: https://codereview.chromium.org/2236593002
Cr-Commit-Position: refs/heads/master@{#38565}
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.
* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
--icu_case_mapping flag is turned on. To control the override by the flag,
they're overriden in icu-case-mapping.js
Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.
Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.
See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.
This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.
[1] See why transliteration API is needed for uppercasing in Greek.
http://bugs.icu-project.org/trac/ticket/10582
R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
intl/general/case*
Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}