Commit Graph

12 Commits

Author SHA1 Message Date
jshin
ac9e628539 Fix two DCHECK failures in ICU case mapping code
1.
DCHECK in runtime-i18n.cc for case mapping was wrong to
assume that the longest primary language tag is 3 characters.
BCP 47 actually allows up to 8 characters.

2. GetFlatContent() was called to a string without flattening it first.

BUG=680314,680464
TEST=intl/general/case-mapping (see also the bugs)

Review-Url: https://codereview.chromium.org/2629763003
Cr-Commit-Position: refs/heads/master@{#42343}
2017-01-13 23:12:43 +00:00
littledan
b0a09d7809 [intl] Add new semantics + compat fallback to Intl constructor
ECMA 402 v2 made Intl constructors more strict in terms of how they would
initialize objects, refusing to initialize objects which have already
been constructed. However, when Chrome tried to ship these semantics,
we ran into web compatibility issues.

This patch tries to square the circle and implement the simpler v2 object
semantics while including a compatibility workaround to allow objects to
sort of be initialized later, storing the real underlying Intl object
in a symbol-named property.

The new semantics are described in this PR against the ECMA 402 spec:
https://github.com/tc39/ecma402/pull/84

BUG=v8:4360, v8:4870
LOG=Y

Review-Url: https://codereview.chromium.org/2582993002
Cr-Commit-Position: refs/heads/master@{#41943}
2016-12-23 14:32:16 +00:00
jshin
af38272dd9 Optimize case conversion with icu_case_mapping
Use FastAsciiConvert (as used by Unibrow) for i18n-aware
case conversion with --icu_case_mapping.

Move FastAsciiConvert to src/string-case.cc so that it can be used
by both runtime-{string,i18n}.

Add more tests.

BUG=v8:4477,v8:4476
TEST=intl/general/case*

Review-Url: https://codereview.chromium.org/2533983006
Cr-Commit-Position: refs/heads/master@{#41821}
2016-12-19 18:43:55 +00:00
jshin
2f5da9a551 Fix the uppercasing of U+00E7(ç) and U+00F7(÷)
Due to a typo in runtime-i18n.js, 'ç'(U+00E7) was not uppercased while
'÷'(U+00F7) was incorrectly uppercased to '×'(U+00D7).

Add a comprehensive test for Latin-1 supplemental block (U+00A0 ~ U+00FF).
(they're special-cased for speed-up and needs to have a test for the range.).

TEST=intl/general/case-mapping
BUG=v8:5681

Review-Url: https://codereview.chromium.org/2533033003
Cr-Commit-Position: refs/heads/master@{#41331}
2016-11-28 22:55:49 +00:00
jshin
4f224b3995 Use a regular ICU API for el-Upper
ICU now supports uppercasing in Greek via its regular uppercasing API.
So, there's no need to use a slow transliteration API for uppercasing
in Greek.

This CL includes rolling ICU to ICU 58.1.

Besides, drop intl402/Intl/getCanonicalLocales/weird-cases from
test262.status because it passes now with ICU 58.1.

BUG=chromium:637001,v8:5012

Review-Url: https://codereview.chromium.org/2491333003
Cr-Commit-Position: refs/heads/master@{#41009}
2016-11-15 18:30:17 +00:00
jshin
520f38fce7 Expose getCanonicalLocales() for Intl object.
Also add a test for the return object of getCanonicalLocaleList().

See https://github.com/tc39/test262/issues/745 for more details.

BUG=v8:5012
TEST=test262/intl402/Intl/getCanonicalLocales/*
TEST=intl/general/getCanonicalLocales

Review-Url: https://codereview.chromium.org/2239523002
Cr-Commit-Position: refs/heads/master@{#38733}
2016-08-18 23:27:23 +00:00
machenbach
08f7c10e38 Revert of Throw when case mapping result > max string length (patchset #3 id:40001 of https://codereview.chromium.org/2236593002/ )
Reason for revert:
The test is very flaky and made it on many configurations into the top 10 of the slowest tests:

https://build.chromium.org/p/client.v8.ports/builders/V8%20Arm/builds/845
https://build.chromium.org/p/client.v8/builders/V8%20Win32%20-%20nosnap%20-%20shared/builds/15418
https://build.chromium.org/p/client.v8/builders/V8%20Linux/builds/12369/steps/Check/logs/durations

Original issue's description:
> Throw when case mapping result > max string length
>
> Throw 'Range Error: invalid string length' when the result of
> case mapping is longer than the max string length (kMaxLength in
> objects.h = 1 << 28 - 16).
>
> This is for case mapping with ICU.
>
> BUG=v8:5271
> TEST=intl/general/case-mapping.js with --icu_case_mapping
>
> Committed: https://crrev.com/c7a2046670468b900b9dbbb4ce45beb5e0e717fd
> Cr-Commit-Position: refs/heads/master@{#38565}

TBR=littledan@chromium.org,jshin@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5271

Review-Url: https://codereview.chromium.org/2236393002
Cr-Commit-Position: refs/heads/master@{#38582}
2016-08-11 13:39:46 +00:00
jshin
c7a2046670 Throw when case mapping result > max string length
Throw 'Range Error: invalid string length' when the result of
case mapping is longer than the max string length (kMaxLength in
objects.h = 1 << 28 - 16).

This is for case mapping with ICU.

BUG=v8:5271
TEST=intl/general/case-mapping.js with --icu_case_mapping

Review-Url: https://codereview.chromium.org/2236593002
Cr-Commit-Position: refs/heads/master@{#38565}
2016-08-10 21:46:05 +00:00
jshin
b348d47bb9 Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.

* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
  --icu_case_mapping flag is turned on. To control the override by the flag,
  they're overriden in icu-case-mapping.js

Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).

Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.

With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.

Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.

See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.

This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.

[1] See why transliteration API is needed for uppercasing in Greek.
    http://bugs.icu-project.org/trac/ticket/10582

R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
     intl/general/case*

Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:03:04 +00:00
yangguo@chromium.org
0dd69ec439 Allow identifier code points from supplementary multilingual planes.
ES5.1 section 6 ("Source Text"):
"Throughout the rest of this document, the phrase “code unit” and the
word “character” will be used to refer to a 16-bit unsigned value
used to represent a single 16-bit unit of text."

This changed in ES6 draft section 10.1 ("Source Text"):
"The ECMAScript code is expressed using Unicode, version 5.1 or later.
ECMAScript source text is a sequence of code points. All Unicode code
point values from U+0000 to U+10FFFF, including surrogate code points,
may occur in source text where permitted by the ECMAScript grammars."

This patch is to reflect this spec change.

BUG=v8:3617
LOG=Y
R=jochen@chromium.org

Review URL: https://codereview.chromium.org/640193002

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24510 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-10-10 07:13:46 +00:00
jochen@chromium.org
8bee9f0c3a Remove test that v8Intl symbol exists, as we don't define it anymore.
R=jkummerow@chromium.org

Review URL: https://codereview.chromium.org/21511002

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16013 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-08-01 19:20:42 +00:00
jochen@chromium.org
c61c74d24f Import intl test suite from v8-i18n project
BUG=v8:2745
R=jkummerow@chromium.org

Review URL: https://codereview.chromium.org/18687003

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15584 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-07-10 10:49:04 +00:00