2013-07-10 10:49:04 +00:00
|
|
|
# Copyright 2013 the V8 project authors. All rights reserved.
|
|
|
|
# Redistribution and use in source and binary forms, with or without
|
|
|
|
# modification, are permitted provided that the following conditions are
|
|
|
|
# met:
|
|
|
|
#
|
|
|
|
# * Redistributions of source code must retain the above copyright
|
|
|
|
# notice, this list of conditions and the following disclaimer.
|
|
|
|
# * Redistributions in binary form must reproduce the above
|
|
|
|
# copyright notice, this list of conditions and the following
|
|
|
|
# disclaimer in the documentation and/or other materials provided
|
|
|
|
# with the distribution.
|
|
|
|
# * Neither the name of Google Inc. nor the names of its
|
|
|
|
# contributors may be used to endorse or promote products derived
|
|
|
|
# from this software without specific prior written permission.
|
|
|
|
#
|
|
|
|
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
|
|
|
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
|
|
|
|
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
|
|
|
|
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
|
|
|
|
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|
|
|
|
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
|
|
|
|
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
|
|
|
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
|
|
|
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
|
|
|
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
|
|
|
|
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|
|
|
|
|
|
|
import os
|
Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.
* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
--icu_case_mapping flag is turned on. To control the override by the flag,
they're overriden in icu-case-mapping.js
Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.
Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.
See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.
This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.
[1] See why transliteration API is needed for uppercasing in Greek.
http://bugs.icu-project.org/trac/ticket/10582
R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
intl/general/case*
Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:01:41 +00:00
|
|
|
import re
|
2013-07-10 10:49:04 +00:00
|
|
|
|
|
|
|
from testrunner.local import testsuite
|
|
|
|
from testrunner.objects import testcase
|
|
|
|
|
Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.
* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
--icu_case_mapping flag is turned on. To control the override by the flag,
they're overriden in icu-case-mapping.js
Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.
Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.
See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.
This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.
[1] See why transliteration API is needed for uppercasing in Greek.
http://bugs.icu-project.org/trac/ticket/10582
R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
intl/general/case*
Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:01:41 +00:00
|
|
|
FLAGS_PATTERN = re.compile(r"//\s+Flags:(.*)")
|
2013-07-10 10:49:04 +00:00
|
|
|
|
|
|
|
class IntlTestSuite(testsuite.TestSuite):
|
|
|
|
|
|
|
|
def __init__(self, name, root):
|
|
|
|
super(IntlTestSuite, self).__init__(name, root)
|
|
|
|
|
|
|
|
def ListTests(self, context):
|
|
|
|
tests = []
|
|
|
|
for dirname, dirs, files in os.walk(self.root):
|
|
|
|
for dotted in [x for x in dirs if x.startswith('.')]:
|
|
|
|
dirs.remove(dotted)
|
|
|
|
dirs.sort()
|
|
|
|
files.sort()
|
|
|
|
for filename in files:
|
|
|
|
if (filename.endswith(".js") and filename != "assert.js" and
|
2016-03-29 11:55:33 +00:00
|
|
|
filename != "utils.js" and filename != "regexp-assert.js" and
|
|
|
|
filename != "regexp-prepare.js"):
|
2015-09-17 13:00:57 +00:00
|
|
|
fullpath = os.path.join(dirname, filename)
|
|
|
|
relpath = fullpath[len(self.root) + 1 : -3]
|
|
|
|
testname = relpath.replace(os.path.sep, "/")
|
2013-07-10 10:49:04 +00:00
|
|
|
test = testcase.TestCase(self, testname)
|
|
|
|
tests.append(test)
|
|
|
|
return tests
|
|
|
|
|
|
|
|
def GetFlagsForTestCase(self, testcase, context):
|
Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.
* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
--icu_case_mapping flag is turned on. To control the override by the flag,
they're overriden in icu-case-mapping.js
Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.
Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.
See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.
This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.
[1] See why transliteration API is needed for uppercasing in Greek.
http://bugs.icu-project.org/trac/ticket/10582
R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
intl/general/case*
Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:01:41 +00:00
|
|
|
source = self.GetSourceForTest(testcase)
|
2013-08-01 19:43:06 +00:00
|
|
|
flags = ["--allow-natives-syntax"] + context.mode_flags
|
Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.
* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
--icu_case_mapping flag is turned on. To control the override by the flag,
they're overriden in icu-case-mapping.js
Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.
Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.
See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.
This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.
[1] See why transliteration API is needed for uppercasing in Greek.
http://bugs.icu-project.org/trac/ticket/10582
R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
intl/general/case*
Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:01:41 +00:00
|
|
|
flags_match = re.findall(FLAGS_PATTERN, source)
|
|
|
|
for match in flags_match:
|
|
|
|
flags += match.strip().split()
|
2013-07-10 10:49:04 +00:00
|
|
|
|
|
|
|
files = []
|
|
|
|
files.append(os.path.join(self.root, "assert.js"))
|
|
|
|
files.append(os.path.join(self.root, "utils.js"))
|
2016-03-29 11:55:33 +00:00
|
|
|
files.append(os.path.join(self.root, "regexp-prepare.js"))
|
2013-07-10 10:49:04 +00:00
|
|
|
files.append(os.path.join(self.root, testcase.path + self.suffix()))
|
2016-03-29 11:55:33 +00:00
|
|
|
files.append(os.path.join(self.root, "regexp-assert.js"))
|
2013-07-10 10:49:04 +00:00
|
|
|
|
|
|
|
flags += files
|
|
|
|
if context.isolates:
|
|
|
|
flags.append("--isolate")
|
|
|
|
flags += files
|
|
|
|
|
|
|
|
return testcase.flags + flags
|
|
|
|
|
Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.
* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
--icu_case_mapping flag is turned on. To control the override by the flag,
they're overriden in icu-case-mapping.js
Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).
Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.
With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.
Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.
See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.
This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.
[1] See why transliteration API is needed for uppercasing in Greek.
http://bugs.icu-project.org/trac/ticket/10582
R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
intl/general/case*
Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:01:41 +00:00
|
|
|
def GetSourceForTest(self, testcase):
|
|
|
|
filename = os.path.join(self.root, testcase.path + self.suffix())
|
|
|
|
with open(filename) as f:
|
|
|
|
return f.read()
|
2013-07-10 10:49:04 +00:00
|
|
|
|
|
|
|
def GetSuite(name, root):
|
|
|
|
return IntlTestSuite(name, root)
|