9 Commits

Author SHA1 Message Date
jshin
2f5da9a551 Fix the uppercasing of U+00E7(ç) and U+00F7(÷)
Due to a typo in runtime-i18n.js, 'ç'(U+00E7) was not uppercased while
'÷'(U+00F7) was incorrectly uppercased to '×'(U+00D7).

Add a comprehensive test for the Latin-1 Supplement block (U+00A0 to
U+00FF). These code points are special-cased for speed, so the range
needs a test of its own.
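
A range check of this kind can be sketched as follows. This is only an
illustration, not the test added by this CL: `expectedUpper` and
`findUppercaseMismatches` are hypothetical names, and the expected
mappings are written out by hand from Unicode's case-mapping data.

```javascript
// Hypothetical helper: expected uppercase form of a code point in
// U+00A0..U+00FF, written out from Unicode's case-mapping data.
function expectedUpper(cp) {
  if (cp === 0xB5) return '\u039C';  // MICRO SIGN -> GREEK CAPITAL LETTER MU
  if (cp === 0xDF) return 'SS';      // SHARP S expands to two letters
  if (cp === 0xF7) return '\u00F7';  // DIVISION SIGN has no uppercase form
  if (cp === 0xFF) return '\u0178';  // y with diaeresis maps outside Latin-1
  if (cp >= 0xE0 && cp <= 0xFE) return String.fromCharCode(cp - 0x20);
  return String.fromCharCode(cp);    // everything else is unchanged
}

// Compare the engine's toUpperCase() against the table above and
// return the code points (as hex strings) where they disagree.
function findUppercaseMismatches() {
  const mismatches = [];
  for (let cp = 0xA0; cp <= 0xFF; cp++) {
    const actual = String.fromCharCode(cp).toUpperCase();
    if (actual !== expectedUpper(cp)) mismatches.push(cp.toString(16));
  }
  return mismatches;
}
```

With the bug described above, such a check would be expected to report
`e7` and `f7`; on a fixed build the list should be empty.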

TEST=intl/general/case-mapping
BUG=v8:5681

Review-Url: https://codereview.chromium.org/2533033003
Cr-Commit-Position: refs/heads/master@{#41331}
2016-11-28 22:55:49 +00:00
jshin
4f224b3995 Use a regular ICU API for el-Upper
ICU now supports uppercasing in Greek via its regular uppercasing API.
So, there's no need to use a slow transliteration API for uppercasing
in Greek.

This CL includes rolling ICU to ICU 58.1.

Also drop intl402/Intl/getCanonicalLocales/weird-cases from
test262.status because it now passes with ICU 58.1.

BUG=chromium:637001,v8:5012

Review-Url: https://codereview.chromium.org/2491333003
Cr-Commit-Position: refs/heads/master@{#41009}
2016-11-15 18:30:17 +00:00
jshin
520f38fce7 Expose getCanonicalLocales() for Intl object.
Also add a test for the return object of getCanonicalLocaleList().

See https://github.com/tc39/test262/issues/745 for more details.
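
As a rough illustration of what the exposed function does (standard
ECMA-402 behaviour, not code from this CL), the sketch below
canonicalizes a few tags. `isStructurallyValid` is a hypothetical
helper introduced only for this example.

```javascript
// Canonicalize casing and drop duplicates; the result is a plain Array.
const canonical = Intl.getCanonicalLocales(['EN-us', 'en-US', 'zh-hant-tw']);

// Hypothetical helper: report whether a tag is accepted at all.
// Structurally invalid tags make getCanonicalLocales throw a RangeError.
function isStructurallyValid(tag) {
  try {
    Intl.getCanonicalLocales(tag);
    return true;
  } catch (e) {
    if (e instanceof RangeError) return false;
    throw e;  // anything else is unexpected, so let it propagate
  }
}
```

Here `canonical` holds `en-US` and `zh-Hant-TW`, and a tag such as
`en_US` (underscore instead of hyphen) is rejected.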

BUG=v8:5012
TEST=test262/intl402/Intl/getCanonicalLocales/*
TEST=intl/general/getCanonicalLocales

Review-Url: https://codereview.chromium.org/2239523002
Cr-Commit-Position: refs/heads/master@{#38733}
2016-08-18 23:27:23 +00:00
machenbach
08f7c10e38 Revert of Throw when case mapping result > max string length (patchset #3 id:40001 of https://codereview.chromium.org/2236593002/ )
Reason for revert:
The test is very flaky and, on many configurations, ranks among the 10 slowest tests:

https://build.chromium.org/p/client.v8.ports/builders/V8%20Arm/builds/845
https://build.chromium.org/p/client.v8/builders/V8%20Win32%20-%20nosnap%20-%20shared/builds/15418
https://build.chromium.org/p/client.v8/builders/V8%20Linux/builds/12369/steps/Check/logs/durations

Original issue's description:
> Throw when case mapping result > max string length
>
> Throw 'Range Error: invalid string length' when the result of
> case mapping is longer than the max string length (kMaxLength in
> objects.h = (1 << 28) - 16).
>
> This is for case mapping with ICU.
>
> BUG=v8:5271
> TEST=intl/general/case-mapping.js with --icu_case_mapping
>
> Committed: https://crrev.com/c7a2046670468b900b9dbbb4ce45beb5e0e717fd
> Cr-Commit-Position: refs/heads/master@{#38565}

TBR=littledan@chromium.org,jshin@chromium.org
# Skipping CQ checks because original CL landed less than 1 day ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=v8:5271

Review-Url: https://codereview.chromium.org/2236393002
Cr-Commit-Position: refs/heads/master@{#38582}
2016-08-11 13:39:46 +00:00
jshin
c7a2046670 Throw when case mapping result > max string length
Throw 'Range Error: invalid string length' when the result of
case mapping is longer than the max string length (kMaxLength in
objects.h = (1 << 28) - 16).

This is for case mapping with ICU.
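
The limit is 1 << 28 minus 16. Subtraction binds tighter than the shift
operator in both C++ and JavaScript, so the parentheses matter. The
sketch below shows the arithmetic; `checkResultLength` is a
hypothetical guard written for illustration, not the V8 code.

```javascript
// Intended limit: (1 << 28) - 16 = 268435440.
const intendedMaxLength = (1 << 28) - 16;

// Without parentheses the expression parses as 1 << (28 - 16) = 4096.
const unparenthesized = 1 << 28 - 16;

// Hypothetical guard: reject a case-mapping result that would exceed
// the limit (for example, a long run of sharp s uppercased to "SS").
function checkResultLength(length, maxLength = intendedMaxLength) {
  if (length > maxLength) {
    throw new RangeError('Invalid string length');
  }
  return length;
}
```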

BUG=v8:5271
TEST=intl/general/case-mapping.js with --icu_case_mapping

Review-Url: https://codereview.chromium.org/2236593002
Cr-Commit-Position: refs/heads/master@{#38565}
2016-08-10 21:46:05 +00:00
jshin
b348d47bb9 Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.

* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when the
  --icu_case_mapping flag is turned on. To control the override by the
  flag, they're overridden in icu-case-mapping.js

Previously, toLocale{U,L}Case just called to{U,L}Case, so they did not
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
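
A few observable differences, shown as a sketch that assumes an engine
with ICU-backed case mapping (the variable names are illustrative only):

```javascript
// Turkish distinguishes dotted and dotless i.
const turkishUpper = 'i'.toLocaleUpperCase('tr');  // U+0130, dotted capital I
const turkishLower = 'I'.toLocaleLowerCase('tr');  // U+0131, dotless small i

// The locale-insensitive functions keep the usual ASCII mapping.
const defaultUpper = 'i'.toUpperCase();            // 'I'

// Greek uppercasing drops the tonos accent: the lowercase word "alpha"
// with an accented first letter becomes four unaccented capitals.
const greekUpper = '\u03AC\u03BB\u03C6\u03B1'.toLocaleUpperCase('el');
```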

Before ICU APIs for the most general case are called, a fast path for
Latin-1 is tried. It's taken from Blink and adapted as necessary. This
fast path is always tried for to{U,L}Case. For toLocale{U,L}Case, it's
only taken when a locale (explicitly specified or default) is not in
{az, el, lt, tr}.
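
The selection rule can be sketched roughly as below. This is an
assumption-laden illustration in JavaScript, not the C++ in
runtime-i18n.cc: `canUseLatin1FastPath` and the set name are
hypothetical, and treating "Latin-1 input" as "every code unit is at
most 0xFF" is an assumption of the sketch.

```javascript
// Languages whose case mapping is locale-sensitive, per the list above.
const LOCALE_SENSITIVE_LANGUAGES = new Set(['az', 'el', 'lt', 'tr']);

// Hypothetical predicate: may toLocale{Upper,Lower}Case try the
// Latin-1 fast path for this input and locale?
function canUseLatin1FastPath(input, locale) {
  // The canonical form of a tag starts with the lowercase language subtag.
  const language = Intl.getCanonicalLocales(locale)[0].split('-')[0];
  if (LOCALE_SENSITIVE_LANGUAGES.has(language)) return false;

  // Assumption: the fast path only applies when every code unit is Latin-1.
  for (let i = 0; i < input.length; i++) {
    if (input.charCodeAt(i) > 0xFF) return false;
  }
  return true;
}
```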

With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.

Handling of pure ASCII strings (aligned at word boundary) is not as fast
as Unibrow's implementation that uses word-by-word case conversion. On
the other hand, Latin-1 input handling is faster than Unibrow. General
Unicode input handling is slower but more accurate.

See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.

This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.

[1] For why the transliteration API is needed for uppercasing in Greek, see
    http://bugs.icu-project.org/trac/ticket/10582

R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
     intl/general/case*

Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:03:04 +00:00
yangguo@chromium.org
0dd69ec439 Allow identifier code points from supplementary multilingual planes.
ES5.1 section 6 ("Source Text"):
"Throughout the rest of this document, the phrase “code unit” and the
word “character” will be used to refer to a 16-bit unsigned value
used to represent a single 16-bit unit of text."

This changed in ES6 draft section 10.1 ("Source Text"):
"The ECMAScript code is expressed using Unicode, version 5.1 or later.
ECMAScript source text is a sequence of code points. All Unicode code
point values from U+0000 to U+10FFFF, including surrogate code points,
may occur in source text where permitted by the ECMAScript grammars."

This patch is to reflect this spec change.
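
The practical effect can be sketched as follows, assuming an engine that
implements the ES6 rule. U+20BB7 is used here only as an example of a
supplementary-plane ideograph that is valid at the start of an
identifier; the variable names are illustrative.

```javascript
// One code point outside the BMP, stored as two UTF-16 code units.
const ideograph = '\u{20BB7}';
const codeUnits = ideograph.length;        // surrogate pair: 2 code units
const codePoints = [...ideograph].length;  // iteration is by code point: 1

// Use the character as an identifier in dynamically compiled source.
const run = new Function(`var ${ideograph} = 42; return ${ideograph};`);
const value = run();
```

Under the ES5.1 wording, the scanner would see two separate 16-bit
units, neither of which is an identifier character on its own.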

BUG=v8:3617
LOG=Y
R=jochen@chromium.org

Review URL: https://codereview.chromium.org/640193002

git-svn-id: https://v8.googlecode.com/svn/branches/bleeding_edge@24510 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2014-10-10 07:13:46 +00:00
jochen@chromium.org
8bee9f0c3a Remove test that v8Intl symbol exists, as we don't define it anymore.
R=jkummerow@chromium.org

Review URL: https://codereview.chromium.org/21511002

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@16013 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-08-01 19:20:42 +00:00
jochen@chromium.org
c61c74d24f Import intl test suite from v8-i18n project
BUG=v8:2745
R=jkummerow@chromium.org

Review URL: https://codereview.chromium.org/18687003

git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@15584 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
2013-07-10 10:49:04 +00:00