* Georgian lari currency symbol
* A large collection of CJK unified ideographs
* Emoji symbols and symbol modifiers
* Letters to support the Ik language in Uganda, Kulango in
the Côte d’Ivoire, and other languages of Africa
* A set of lowercase Cherokee syllables, forming case pairs
with the existing Cherokee characters
* The Ahom script for support of the Tai Ahom language in India
* Arabic letters to support Arwi—the Tamil language written in the Arabic script
For more details, see http://www.unicode.org/versions/Unicode8.0.0/
[ChangeLog][QtCore] Unicode data updated to v.8.0
Change-Id: If255f95c9c45655b721369a116299da3cabbba0a
Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
* Two newly adopted currency symbols:
the Azerbaijan manat and the Russia ruble
* Pictographic symbols (including many emoji), geometric symbols,
arrows, and ornaments originating from the Wingdings and Webdings sets
* Twenty-three new lesser-used and historic scripts
extending support for written languages of North America, China, India,
other Asian countries, and Africa
* Letters used in Teuthonista and other transcriptional systems,
and a new notational set, Duployan
For more details, see http://www.unicode.org/versions/Unicode7.0.0/
The Properties struct's .*Diff members were narrowed down
to signed 15 bits and the unicodeVersion has been expanded to 8 bits.
[ChangeLog][QtCore] Unicode data updated to v.7.0
Change-Id: I93ab6f79fa3b05f61abc7279f1d046834c1c1a0b
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Since Qt 5.0, static overloads of QChar has a uint parameter only,
so there is no more ambiguity between uint<->ushort and thus some tests
does not make sense anymore; avoid explicit cast to uint for the others.
Change-Id: Ibc7a2ac4de63d3f023a8dbb5e53211ef8521579d
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Qt copyrights are now in The Qt Company, so we could update the source
code headers accordingly. In the same go we should also fix the links to
point to qt.io.
Outdated header.LGPL removed (use header.LGPL21 instead)
Old header.LGPL3 renamed to header.LGPL3-COMM to match actual licensing
combination. New header.LGPL-COMM taken in the use file which were
using old header.LGPL3 (src/plugins/platforms/android/extract.cpp)
Added new header.LGPL3 containing Commercial + LGPLv3 + GPLv2 license
combination
Change-Id: I6f49b819a8a20cc4f88b794a8f6726d975e8ffbe
Reviewed-by: Matti Paaso <matti.paaso@theqtcompany.com>
Make sure the relational operators are in a constexpr'able form
by removing the use of the const/non-const-overloaded unicode()
function, which in the relational operators is resolved to the
non-const version, which isn't constexpr'able.
Replaced with direct member access for op== and op< (required
making them friends) and reformulating the other operators in
terms of these two.
Since I managed to introduce a bug while doing this change,
add a simple test for QChar operators, too.
Change-Id: I69f3da849e71abc2a17152f797694950914adebc
Reviewed-by: Alex Trotsenko <alex1973tr@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
NormalizationTest.txt does not exist in the project root, but under 'data'
directory. TESTDATA is converted to INSTALLS rules in testcase.prf.
INSTALLS rules generated in testcase.prf does not set 'no_check_exist'
CONFIG variable. Thus qmake will not install NormalizationTest.txt since
it cannot find it from defined location.
Even TESTDATA has been incorrectly defined, NormalizationTest.txt
has been found in majority of the platforms thanks to QFINDTESTDATA
flexibility. However it causes problems on sand-boxed platforms such
as WinRT.
Fixed by defining the relative path to NormalizationTest.txt in TESTDATA
so that qmake can find the file when processing INSTALLS variable.
Change-Id: Id9a28db2a00b17d2c0136e6ff32f421b21137898
Reviewed-by: Andrew Knight <andrew.knight@digia.com>
Reviewed-by: Liang Qi <liang.qi@digia.com>
This aimed to disctinct joining types "L", "T", and "U" from just "U".
Unicode 6.3.0 has introduced a character with joining type "L" and
Unicode 7.0 will add a few more characters of joining type "L", so
we'll have to deal with it anyways.
[ChangeLog][QtCore][QChar] Added JoiningType enum and joiningType()
method that deprecates the old QChar::Joining enum and joining() method.
Change-Id: I4be3a3f745d944e689feb9b62d4ca86d1cf371b0
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Mongolian and Phags-pa characters have been given a Joining_Type
classification for contextual shaping. As a part of these additions,
one Phags-pa character has the Joining_Type value of L (Left Joining),
which no character had been assigned before.
* The unassigned code points in the Currency Symbols block have been
given the Bidi_Class property value ET and the Line_Break property
value PR, to help implementations support new currency symbols,
when they are encoded.
* Hebrew letters and basic punctuation marks have been assigned
the newly introduced Word_Break property values Hebrew_Letter,
Single_Quote, and Double_Quote.
* The Bidi_Class property has been extended with four new values
for directional isolates.
For more details, see http://www.unicode.org/versions/Unicode6.3.0/
Change-Id: Iad62d02edc58a8497898dcd6d6c70d5aece317ea
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
The issue is already fixed in 5.0 but let's be nice and ensure the issue
won't be reintroduced later.
Task-number: QTBUG-30931
Change-Id: Ia6944acaf6e7217f8d0f1fa75d0e9977db11d892
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
...and remove the outdated QUnicodeTables::Script enum.
QFontEngineData now has one extra slot that never used
(engines[QChar::Script_Inherited]). engines[QChar::Script_Unknown],
if accessed, would be set with a Box engine instance, and could be used
as a minor optimization some time later.
In order to preserve the existing behavior, we map all scripts up to Latin to Common.
Change-Id: Ide4182a0f8447b4bf25713ecc3fe8097b8fed040
Reviewed-by: Pierre Rossi <pierre.rossi@gmail.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
...where the values are not aliased to Common script.
The old QUnicodeTables::Script enum was retained for compatibility reasons
until Qt internals are updated to use QChar::script().
Using QChar::Script instead of QUnicodeTables::Script would improve both
the text analysis (itemization, boundary finding) and the text shaping quality.
This also a required step for switching to Hurfbuzz-NG.
/* This adds 6668 more .rodata bytes */
Change-Id: I5aa3d12c550528d0052542436990f8d0779ea8e5
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@digia.com>
Version 6.2 of the Unicode Standard is a special release
dedicated to the early publication of the newly encoded Turkish lira sign.
In addition, there are some significant changes to the Unicode algorithms
for text segmentation and line breaking to improve breaking for emoji symbols.
For more details, see http://www.unicode.org/versions/Unicode6.2.0/
Change-Id: I21cfd4f307e41b41a19d36cce87f7a44c2661bc2
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Change copyrights and license headers from Nokia to Digia
Change-Id: If1cc974286d29fd01ec6c19dd4719a67f4c3f00e
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Sergio Ahumada <sergio.ahumada@digia.com>
Qt 5.0 beta requires changing the default to the 5.0 API, disabling
the deprecated code. However, tests should test (and often do) the
compatibility API too, so turn it back on.
Task-number: QTBUG-25053
Change-Id: I8129c3ef3cb58541c95a32d083850d9e7f768927
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
Instead of expanding the scripts table with script values for the code points
>= 0x10000, it has been merged with the properties table in order to
increase perfomance of the script itemization code (not affected yet).
(Stats: the properties table grew up in 97428-89800 = 7628 bytes;
the old scripts table was of size 7680 bytes)
The outdated ScriptsInitial.txt and ScriptsCorrections.txt file has been removed
(they were just empty, the "corrigendum" script corrections should be applied
to Scripts.txt directly, *no customization allowed*!).
More script testcases has been added - at least one per supported script.
Task-number: QTBUG-6530
Change-Id: I40a9e76f681e2dd552fd4c61af0808d043962e79
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Update NormalizationTest.txt data file with one from UCD 6.1;
Add few more QChar::unicodeVersion() testcases;
Add some line break class mapping testcases;
Add some exceptional case mapping testcases;
Add script class mapping test;
Change-Id: I164394984abb2b893c8db62fb77e7bd87aa0850b
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Chars that have a case conversion that converts
them into several characters can't be handled
by QChar::toUpper() etc and should get ignored. The code
didn't do that correctly.
Change-Id: I281d122e90bf49187b6449088d2fccef2ef75e86
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
+ QChar::LastValidCodePoint enum value that supercede the UNICODE_LAST_CODEPOINT macro
replace uses of hardcoded values with the new API; remove leftovers
Change-Id: I1395c9840b85fcb6b08e241b131794a98773c952
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
inline all non-static members to a static ones (declared with QT_FASTCALL),
ushort converts automatically to uint and the conversion cost is minimal.
Task-Number: QTBUG-13052
Change-Id: I189a6f205736766adcd3de2d61cee71f30cc64f3
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
> http://www.unicode.org/versions/Unicode5.2.0/
D. Character Additions:
There are three new characters in the newly-encoded Kaithi script that will
require changes in implementations which make hard-coded assumptions about
composition during normalization. Most new characters added to the standard
with decompositions cannot be generated by the operations toNFC() or toNFKC),
but these three can. Implementers should check their code carefully
to ensure that it handles these three characters correctly.
U+1109A KAITHI LETTER DDDHA
U+1109C KAITHI LETTER RHA
U+110AB KAITHI LETTER VA
UCD 6.1 adds two more of them:
U+1112E CHAKMA VOWEL SIGN O
U+1112F CHAKMA VOWEL SIGN AU
Change-Id: I781a26848078d8b83a182b0fd4e681be2a6d9a27
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
results are now equals to results of ICU's u_isprint() for the entire set
of the Unicode code points
Change-Id: I763f4b37cccd285eb01543d486f25bd7ea011241
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
http://unicode.org/versions/corrigendum6.html:
> in Unicode 5.0, the list of characters with the Bidi_Mirrored property
> was made consistent for brackets and quotation marks, in preparation for
> new constraints on bidi mirroring. However, after publication of
> Unicode 5.0.0 it was discovered that this change adversely affected
> several quotation mark characters in deployed data.
Task-number: QTBUG-25169
Change-Id: Id49caf401af2d5a1e6dbcc32b2f350aa20b7f901
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
As in the past, to avoid rewriting various autotests that contain
line-number information, an extra blank line has been inserted at the
end of the license text to ensure that this commit does not change the
total number of lines in the license header.
Change-Id: I311e001373776812699d6efc045b5f742890c689
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
These comments were mostly empty or inaccurate. Appropriate naming of
tests and appropriate placement of tests within the directory tree
provide more reliable indicators of what is being tested.
Change-Id: Ib6bf373d9e79917e4ab1417ee5c1264a2c2d7027
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
In .pro files, removed wince/symbian-specific DEPLOYMENT cases and
replaced them with TESTDATA where appropriate.
In .cpp files, removed SRCDIR and relative paths to testdata and
replaced them with the QFINDTESTDATA macro where appropriate.
Modified test helper apps/libs to install themselves under the test
they relate to.
This change allows corelib tests to be correctly installed, along with
their testdata, via `make install'.
Change-Id: I5e202e2f3b577af7e39072d5c9fe13e0ca125304
Reviewed-by: Jason McDonald <jason.mcdonald@nokia.com>
QUnicodeTables::ligature() was removed from the API in 2006. The
commit that disabled the test also changed the code to call
QChar::ligature(), which has never existed.
Change-Id: I056c17c178a527b076538fb007404ff0b735ba02
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
qttest_p4.prf was added as a convenience for Qt's own autotests in Qt4.
It enables various crufty undocumented magic, of dubious value.
Stop using it, and explicitly enable the things from it which we want.
Change-Id: I7c1ffe9c8c294dbdc988e1582e580b1ed3f4593e
Reviewed-by: Jason McDonald <jason.mcdonald@nokia.com>
according to the Unicode specs, code point U+0085 should be treated
like a white space character (an exceptional Cc one)
Change-Id: Ib17ae0c4d3cdafe667cafa38b645138ef24c238c
Merge-request: 32
Reviewed-by: Jiang Jiang <jiang.jiang@nokia.com>
Reviewed-on: http://codereview.qt-project.org/6158
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Symbian is not a supported platform for Qt5, so this code is no longer
required.
Change-Id: I1172e6a42d518490e63e9599bf10579df08259aa
Reviewed-on: http://codereview.qt-project.org/5657
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
Make it inline; add fast checks for typical spaces;
add fallback function that uses the fastcall calling
convention.
On ia32, this change makes isSpace ~340x faster for
ascii spaces, ~170x faster for non-space ascii
characters, and ~1.3x faster for non-ascii characters.
Note that this change is NOT binary compatible.
Also add an autotest with expected results from
before the optimization, to ensure that the behavior
is the same.
Change-Id: I9438d0ad3c9ba2e80560c4bee7eed05115265798
Reviewed-on: http://codereview.qt-project.org/4905
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Make it inline; add fast checks for ascii letters
and digits; add fallback function that uses the
fastcall calling convention.
On ia32, this change makes isLetterOrNumber ~120x
faster for ascii letters and digits, ~150x faster
for non-letter/digit ascii characters, and ~1.3x
faster for non-ascii characters.
Note that this change is NOT binary compatible.
Also add an autotest with expected results from
before the optimization, to ensure that the
behavior is the same.
Change-Id: Ia4e13692f4dd79f6aa0b96da29449e0487971b0e
Reviewed-on: http://codereview.qt-project.org/4904
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>