* Mongolian and Phags-pa characters have been given a Joining_Type
classification for contextual shaping. As a part of these additions,
one Phags-pa character has the Joining_Type value of L (Left Joining),
which no character had been assigned before.
* The unassigned code points in the Currency Symbols block have been
given the Bidi_Class property value ET and the Line_Break property
value PR, to help implementations support new currency symbols,
when they are encoded.
* Hebrew letters and basic punctuation marks have been assigned
the newly introduced Word_Break property values Hebrew_Letter,
Single_Quote, and Double_Quote.
* The Bidi_Class property has been extended with four new values
for directional isolates.
For more details, see http://www.unicode.org/versions/Unicode6.3.0/
Change-Id: Iad62d02edc58a8497898dcd6d6c70d5aece317ea
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
The issue is already fixed in 5.0 but let's be nice and ensure the issue
won't be reintroduced later.
Task-number: QTBUG-30931
Change-Id: Ia6944acaf6e7217f8d0f1fa75d0e9977db11d892
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
...and remove the outdated QUnicodeTables::Script enum.
QFontEngineData now has one extra slot that never used
(engines[QChar::Script_Inherited]). engines[QChar::Script_Unknown],
if accessed, would be set with a Box engine instance, and could be used
as a minor optimization some time later.
In order to preserve the existing behavior, we map all scripts up to Latin to Common.
Change-Id: Ide4182a0f8447b4bf25713ecc3fe8097b8fed040
Reviewed-by: Pierre Rossi <pierre.rossi@gmail.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
...where the values are not aliased to Common script.
The old QUnicodeTables::Script enum was retained for compatibility reasons
until Qt internals are updated to use QChar::script().
Using QChar::Script instead of QUnicodeTables::Script would improve both
the text analysis (itemization, boundary finding) and the text shaping quality.
This also a required step for switching to Hurfbuzz-NG.
/* This adds 6668 more .rodata bytes */
Change-Id: I5aa3d12c550528d0052542436990f8d0779ea8e5
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@digia.com>
Version 6.2 of the Unicode Standard is a special release
dedicated to the early publication of the newly encoded Turkish lira sign.
In addition, there are some significant changes to the Unicode algorithms
for text segmentation and line breaking to improve breaking for emoji symbols.
For more details, see http://www.unicode.org/versions/Unicode6.2.0/
Change-Id: I21cfd4f307e41b41a19d36cce87f7a44c2661bc2
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Change copyrights and license headers from Nokia to Digia
Change-Id: If1cc974286d29fd01ec6c19dd4719a67f4c3f00e
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Sergio Ahumada <sergio.ahumada@digia.com>
Qt 5.0 beta requires changing the default to the 5.0 API, disabling
the deprecated code. However, tests should test (and often do) the
compatibility API too, so turn it back on.
Task-number: QTBUG-25053
Change-Id: I8129c3ef3cb58541c95a32d083850d9e7f768927
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
Instead of expanding the scripts table with script values for the code points
>= 0x10000, it has been merged with the properties table in order to
increase perfomance of the script itemization code (not affected yet).
(Stats: the properties table grew up in 97428-89800 = 7628 bytes;
the old scripts table was of size 7680 bytes)
The outdated ScriptsInitial.txt and ScriptsCorrections.txt file has been removed
(they were just empty, the "corrigendum" script corrections should be applied
to Scripts.txt directly, *no customization allowed*!).
More script testcases has been added - at least one per supported script.
Task-number: QTBUG-6530
Change-Id: I40a9e76f681e2dd552fd4c61af0808d043962e79
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Update NormalizationTest.txt data file with one from UCD 6.1;
Add few more QChar::unicodeVersion() testcases;
Add some line break class mapping testcases;
Add some exceptional case mapping testcases;
Add script class mapping test;
Change-Id: I164394984abb2b893c8db62fb77e7bd87aa0850b
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Chars that have a case conversion that converts
them into several characters can't be handled
by QChar::toUpper() etc and should get ignored. The code
didn't do that correctly.
Change-Id: I281d122e90bf49187b6449088d2fccef2ef75e86
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
+ QChar::LastValidCodePoint enum value that supercede the UNICODE_LAST_CODEPOINT macro
replace uses of hardcoded values with the new API; remove leftovers
Change-Id: I1395c9840b85fcb6b08e241b131794a98773c952
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
inline all non-static members to a static ones (declared with QT_FASTCALL),
ushort converts automatically to uint and the conversion cost is minimal.
Task-Number: QTBUG-13052
Change-Id: I189a6f205736766adcd3de2d61cee71f30cc64f3
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
> http://www.unicode.org/versions/Unicode5.2.0/
D. Character Additions:
There are three new characters in the newly-encoded Kaithi script that will
require changes in implementations which make hard-coded assumptions about
composition during normalization. Most new characters added to the standard
with decompositions cannot be generated by the operations toNFC() or toNFKC),
but these three can. Implementers should check their code carefully
to ensure that it handles these three characters correctly.
U+1109A KAITHI LETTER DDDHA
U+1109C KAITHI LETTER RHA
U+110AB KAITHI LETTER VA
UCD 6.1 adds two more of them:
U+1112E CHAKMA VOWEL SIGN O
U+1112F CHAKMA VOWEL SIGN AU
Change-Id: I781a26848078d8b83a182b0fd4e681be2a6d9a27
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
results are now equals to results of ICU's u_isprint() for the entire set
of the Unicode code points
Change-Id: I763f4b37cccd285eb01543d486f25bd7ea011241
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
http://unicode.org/versions/corrigendum6.html:
> in Unicode 5.0, the list of characters with the Bidi_Mirrored property
> was made consistent for brackets and quotation marks, in preparation for
> new constraints on bidi mirroring. However, after publication of
> Unicode 5.0.0 it was discovered that this change adversely affected
> several quotation mark characters in deployed data.
Task-number: QTBUG-25169
Change-Id: Id49caf401af2d5a1e6dbcc32b2f350aa20b7f901
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
As in the past, to avoid rewriting various autotests that contain
line-number information, an extra blank line has been inserted at the
end of the license text to ensure that this commit does not change the
total number of lines in the license header.
Change-Id: I311e001373776812699d6efc045b5f742890c689
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
These comments were mostly empty or inaccurate. Appropriate naming of
tests and appropriate placement of tests within the directory tree
provide more reliable indicators of what is being tested.
Change-Id: Ib6bf373d9e79917e4ab1417ee5c1264a2c2d7027
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
In .pro files, removed wince/symbian-specific DEPLOYMENT cases and
replaced them with TESTDATA where appropriate.
In .cpp files, removed SRCDIR and relative paths to testdata and
replaced them with the QFINDTESTDATA macro where appropriate.
Modified test helper apps/libs to install themselves under the test
they relate to.
This change allows corelib tests to be correctly installed, along with
their testdata, via `make install'.
Change-Id: I5e202e2f3b577af7e39072d5c9fe13e0ca125304
Reviewed-by: Jason McDonald <jason.mcdonald@nokia.com>
QUnicodeTables::ligature() was removed from the API in 2006. The
commit that disabled the test also changed the code to call
QChar::ligature(), which has never existed.
Change-Id: I056c17c178a527b076538fb007404ff0b735ba02
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
qttest_p4.prf was added as a convenience for Qt's own autotests in Qt4.
It enables various crufty undocumented magic, of dubious value.
Stop using it, and explicitly enable the things from it which we want.
Change-Id: I7c1ffe9c8c294dbdc988e1582e580b1ed3f4593e
Reviewed-by: Jason McDonald <jason.mcdonald@nokia.com>
according to the Unicode specs, code point U+0085 should be treated
like a white space character (an exceptional Cc one)
Change-Id: Ib17ae0c4d3cdafe667cafa38b645138ef24c238c
Merge-request: 32
Reviewed-by: Jiang Jiang <jiang.jiang@nokia.com>
Reviewed-on: http://codereview.qt-project.org/6158
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Symbian is not a supported platform for Qt5, so this code is no longer
required.
Change-Id: I1172e6a42d518490e63e9599bf10579df08259aa
Reviewed-on: http://codereview.qt-project.org/5657
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
Make it inline; add fast checks for typical spaces;
add fallback function that uses the fastcall calling
convention.
On ia32, this change makes isSpace ~340x faster for
ascii spaces, ~170x faster for non-space ascii
characters, and ~1.3x faster for non-ascii characters.
Note that this change is NOT binary compatible.
Also add an autotest with expected results from
before the optimization, to ensure that the behavior
is the same.
Change-Id: I9438d0ad3c9ba2e80560c4bee7eed05115265798
Reviewed-on: http://codereview.qt-project.org/4905
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Make it inline; add fast checks for ascii letters
and digits; add fallback function that uses the
fastcall calling convention.
On ia32, this change makes isLetterOrNumber ~120x
faster for ascii letters and digits, ~150x faster
for non-letter/digit ascii characters, and ~1.3x
faster for non-ascii characters.
Note that this change is NOT binary compatible.
Also add an autotest with expected results from
before the optimization, to ensure that the
behavior is the same.
Change-Id: Ia4e13692f4dd79f6aa0b96da29449e0487971b0e
Reviewed-on: http://codereview.qt-project.org/4904
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Make it inline; add fast checks for ascii letters;
add fallback function that uses the fastcall calling
convention.
On ia32, this change makes isLetter ~370x faster for
ascii letters, ~250x faster for non-letter ascii
characters, and ~1.5x faster for non-ascii characters.
Note that this change is NOT binary compatible.
Also add an autotest with expected results from
before the optimization, to ensure that the
behavior is the same.
Change-Id: I06f8d3d43114537cee5567e670898cef6494c20a
Reviewed-on: http://codereview.qt-project.org/4903
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Make it inline; add fast checks for ascii digits;
add fallback function that uses the fastcall calling
convention.
On ia32, this change makes isDigit ~370x faster for
ascii digit characters, ~250x faster for non-digit
ascii characters, and ~1.5x faster for non-ascii
characters.
Note that this change is NOT binary compatible.
Also add an autotest with expected results from
before the optimization, to ensure that the
behavior is the same.
Change-Id: I718fadecda3f591d6f4c22374d8e476f4724fd83
Reviewed-on: http://codereview.qt-project.org/4902
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>