qt5base-lts

Author	SHA1	Message	Date
Ievgenii Meshcheriakov	2afe1a3c19	unicode: Generate tables for IDNA/UTS #46 Update the Unicode data processing tool to generate properties and mapping tables needed to implement UTS #46 (https://unicode.org/reports/tr46/). The implementation extends the standard to allow usage of underscores in URLs. This is done for compatibility with DNS-SD and SMB protocols. The data file needed to generate the new properties was taken from https://www.unicode.org/Public/idna/13.0.0/IdnaMappingTable.txt Task-number: QTBUG-85323 Change-Id: I2c303bf8a08aefb18a7491fb9b55385563bfa219 Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>	2021-08-26 16:55:05 +02:00
Giuseppe D'Angelo	a794c5e287	Unicode: fix the extended grapheme cluster algorithm UAX #29 in Unicode 11 changed the EGC algorithm to its current form. Although Qt has upgraded the Unicode tables all the way up to Unicode 13, the algorithm has never been adapted; in other words, it has been working by chance for years. Luckily, MOST of the cases were dealt with correctly, but emoji handling actually manages to break it. This commit: * Adds parsing of emoji-data.txt into the unicode table generator. That is necessary to extract the Extended_Pictographic property, which is used by the EGC algorithm. * Regenerates the tables. * Removes some obsoleted grapheme cluster break properties, and adds the ones added in the meanwhile. * Rewrites the EGC algorithm according to Unicode 13. This is done by simplifying a lot the lookup table. Some rules (GB11, GB12, GB13) can't be done by the table alone so some hand-rolled code is necessary in that case. * Thanks to these fixes, the complete upstream GraphemeBreakTest now passes. Remove the "edited" version that ignored some rows (because they were failing). Change-Id: Iaa07cb2e6d0ab9deac28397f46d9af189d2edf8b Pick-to: 6.1 6.0 5.15 Fixes: QTBUG-92822 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com> Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>	2021-04-16 20:31:39 +02:00
Edward Welbourne	54f8be6cc0	Update UCD to Revision 26 Include WordBreakTest.html, since a test uses sample strings from it, albeit without actually reading the file. Had to comment out more of the new tests, as at Revision 24, pending an update to harfbuzz and the text boundary detection code. Task-number: QTBUG-79631 Task-number: QTBUG-79418 Task-number: QTBUG-82747 Change-Id: I0082294b09d67ffdc6a9b5c15acf77ad3b86f65f Reviewed-by: Lars Knoll <lars.knoll@qt.io>	2020-03-14 11:26:59 +01:00
Edward Welbourne	c3eb521a0f	Update UCD data to Unicode 12.1.0's Revision 24 Had to teach the update program to accept category Lm as for Joining_Transparent, for the sake of a new ArabicShaping.txt entry. Added three new Unicode versions, several new scripts and a new word-break class. Updated UCD's test data for tst_QTextBoundaryFinder. This left 57 tests failing; I have commented out the data rows for those tests, pending someone with more knowledge addressing this. Task-number: QTBUG-79631 Task-number: QTBUG-79418 Change-Id: Ic33d3b3551195d47a84d98e84020f57a68f0b201 Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>	2019-10-30 17:38:02 +01:00
Lars Knoll	41b4e154d6	Update Text segmentation and line break data to Unicode 10.0 Also adjusted the text segmentation and line break algorithms so that they can handle the new data, and pass the test suite. Change-Id: Ib727fd80003e34e96458d7a681996de3fa3691e7 Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>	2018-01-03 07:47:26 +00:00
Lars Knoll	8bfabb34de	Update most Unicode data to version 10.0 The text segmentation data is not being updated in this change, as it requires additional code changes. Updating those will come in a follow-up commit. Change-Id: I5d6b6bc96044e8dd0c25cf6f79756e7f68bf6e7c Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com> Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>	2018-01-03 07:46:31 +00:00
Konstantin Ritt	a98b541f26	Update Unicode data files to v8.0 Change-Id: I0aa368cb07353924031a9af4f0bdc33692eb1053 Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>	2015-11-05 08:24:58 +00:00
Konstantin Ritt	ecdd5648bd	Update UCD source files to v7.0 Change-Id: I47277963c926128ad0c4ac5141835e767bb440a7 Reviewed-by: Lars Knoll <lars.knoll@digia.com>	2015-03-27 16:39:53 +00:00
Konstantin Ritt	a6046be428	Update UCD source files up to Unicode 6.3.0 Change-Id: I9ab58a659af1e758b172a24aa95bce1fea89c33d Reviewed-by: Lars Knoll <lars.knoll@digia.com>	2014-01-14 15:38:43 +01:00
Konstantin Ritt	2672c4fa91	Update the Unicode Data and Algorithms up to Unicode 6.2 Version 6.2 of the Unicode Standard is a special release dedicated to the early publication of the newly encoded Turkish lira sign. In addition, there are some significant changes to the Unicode algorithms for text segmentation and line breaking to improve breaking for emoji symbols. For more details, see http://www.unicode.org/versions/Unicode6.2.0/ Change-Id: I21cfd4f307e41b41a19d36cce87f7a44c2661bc2 Reviewed-by: Thiago Macieira <thiago.macieira@intel.com> Reviewed-by: Lars Knoll <lars.knoll@digia.com>	2012-10-09 03:04:41 +02:00
Konstantin Ritt	d8c04d60db	Make QUnicodeTables::script() support SMP code points Instead of expanding the scripts table with script values for the code points >= 0x10000, it has been merged with the properties table in order to increase perfomance of the script itemization code (not affected yet). (Stats: the properties table grew up in 97428-89800 = 7628 bytes; the old scripts table was of size 7680 bytes) The outdated ScriptsInitial.txt and ScriptsCorrections.txt file has been removed (they were just empty, the "corrigendum" script corrections should be applied to Scripts.txt directly, no customization allowed!). More script testcases has been added - at least one per supported script. Task-number: QTBUG-6530 Change-Id: I40a9e76f681e2dd552fd4c61af0808d043962e79 Reviewed-by: Lars Knoll <lars.knoll@nokia.com>	2012-06-14 05:22:13 +02:00
Konstantin Ritt	c9100bcce7	Update the Unicode data files up to v6.1.0 Change-Id: I20b94634b1f4ebff10757c2348cfdbbd906e8797 Reviewed-by: Lars Knoll <lars.knoll@nokia.com>	2012-06-10 15:57:54 +02:00
Konstantin Ritt	73b24486ed	UCD-5.0: apply Corrigendum #6 http://unicode.org/versions/corrigendum6.html: > in Unicode 5.0, the list of characters with the Bidi_Mirrored property > was made consistent for brackets and quotation marks, in preparation for > new constraints on bidi mirroring. However, after publication of > Unicode 5.0.0 it was discovered that this change adversely affected > several quotation mark characters in deployed data. Task-number: QTBUG-25169 Change-Id: Id49caf401af2d5a1e6dbcc32b2f350aa20b7f901 Reviewed-by: Lars Knoll <lars.knoll@nokia.com>	2012-04-15 03:53:07 +02:00
Qt by Nokia	38be0d1383	Initial import from the monolithic Qt. This is the beginning of revision history for this module. If you want to look at revision history older than this, please refer to the Qt Git wiki for how to use Git history grafting. At the time of writing, this wiki is located here: http://qt.gitorious.org/qt/pages/GitIntroductionWithQt If you have already performed the grafting and you don't see any history beyond this commit, try running "git log" with the "--follow" argument. Branched from the monolithic repo, Qt master branch, at commit 896db169ea224deb96c59ce8af800d019de63f12	2011-04-27 12:05:43 +02:00

14 Commits