v8/test/intl/testcfg.py
jshin b348d47bb9 Use ICU case conversion/transliterator for case conversion
When I18N is enabled, use ICU's case conversion API and transliteration
API [1] to implement String.prototype.to{Upper,Lower}Case and
String.prototype.toLocale{Upper,Lower}Case.

* ICU-based case conversion was implemented in runtime-i18n.cc/i18n.js
* The above 4 functions are overridden with those in i18n.js when
  --icu_case_mapping flag is turned on. To control the override by the flag,
  they're overriden in icu-case-mapping.js

Previously, toLocale{U,L}Case just called to{U,L}Case so that they didn't
support locale-sensitive case conversion for Turkic languages (az, tr),
Greek (el) and Lithuanian (lt).

Before ICU APIs for the most general case are called, a fast-path for Latin-1
is tried. It's taken from Blink and adopted as necessary. This fast path
is always tried for to{U,L}Case. For toLocale{U,L}Case, it's only taken
when a locale (explicitly specified or default) is not in {az, el, lt, tr}.

With these changes, a build with --icu_case_mapping=true passes a bunch
of tests in test262/intl402/Strings/* and intl/* that failed before.

Handling of pure ASCII strings (aligned at word boundary) are not as fast
as Unibrow's implementation that uses word-by-word case conversion. OTOH,
Latin-1 input handling is faster than Unibrow. General Unicode input
handling is slower but more accurate.

See https://docs.google.com/spreadsheets/d/1KJCJxKc1FxFXjwmYqABS0_2cNdPetvnd8gY8_HGSbrg/edit?usp=sharing for the benchmark.

This CL started with http://crrev.com/1544023002#ps200001 by littledan@,
but has changed significantly since.

[1] See why transliteration API is needed for uppercasing in Greek.
    http://bugs.icu-project.org/trac/ticket/10582

R=yangguo
BUG=v8:4476,v8:4477
LOG=Y
TEST=test262/{built-ins,intl402}/Strings/*, webkit/fast/js/*, mjsunit/string-case,
     intl/general/case*

Review-Url: https://codereview.chromium.org/1812673005
Cr-Commit-Position: refs/heads/master@{#36187}
2016-05-11 19:03:04 +00:00

87 lines
3.4 KiB
Python

# Copyright 2013 the V8 project authors. All rights reserved.
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following
# disclaimer in the documentation and/or other materials provided
# with the distribution.
# * Neither the name of Google Inc. nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import os
import re
from testrunner.local import testsuite
from testrunner.objects import testcase
FLAGS_PATTERN = re.compile(r"//\s+Flags:(.*)")
class IntlTestSuite(testsuite.TestSuite):
def __init__(self, name, root):
super(IntlTestSuite, self).__init__(name, root)
def ListTests(self, context):
tests = []
for dirname, dirs, files in os.walk(self.root):
for dotted in [x for x in dirs if x.startswith('.')]:
dirs.remove(dotted)
dirs.sort()
files.sort()
for filename in files:
if (filename.endswith(".js") and filename != "assert.js" and
filename != "utils.js" and filename != "regexp-assert.js" and
filename != "regexp-prepare.js"):
fullpath = os.path.join(dirname, filename)
relpath = fullpath[len(self.root) + 1 : -3]
testname = relpath.replace(os.path.sep, "/")
test = testcase.TestCase(self, testname)
tests.append(test)
return tests
def GetFlagsForTestCase(self, testcase, context):
source = self.GetSourceForTest(testcase)
flags = ["--allow-natives-syntax"] + context.mode_flags
flags_match = re.findall(FLAGS_PATTERN, source)
for match in flags_match:
flags += match.strip().split()
files = []
files.append(os.path.join(self.root, "assert.js"))
files.append(os.path.join(self.root, "utils.js"))
files.append(os.path.join(self.root, "regexp-prepare.js"))
files.append(os.path.join(self.root, testcase.path + self.suffix()))
files.append(os.path.join(self.root, "regexp-assert.js"))
flags += files
if context.isolates:
flags.append("--isolate")
flags += files
return testcase.flags + flags
def GetSourceForTest(self, testcase):
filename = os.path.join(self.root, testcase.path + self.suffix())
with open(filename) as f:
return f.read()
def GetSuite(name, root):
return IntlTestSuite(name, root)