qt5base-lts/util/unicode
Marc Mutz effbf147a4 QUnicodeTables: use array for case folding tables
Instead of four pairs of :1 :15 bit fields, use an array of four :1,
:15 structs.  This allows replacing the case folding traits classes
with a simple enum that indexes into said array.

I don't know what the WASM #ifdef'ed code is supposed to effect (a :0
bit-field is only useful to separate adjacent bit-fields into separate
memory locations for multi-threading), but I thought it safer to leave
it in, and that means the array must be a 64-bit block of its own, so
I had to move two fields around.
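
For readers unfamiliar with the :0 bit-field mentioned above, here is a minimal sketch of its effect. The struct names Packed and Split and the helper separatorCost are hypothetical and do not appear in the Qt sources; the sizes depend on the ABI, so the comments describe what mainstream compilers typically do rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Both bit-fields typically share one allocation unit (one unsigned int).
struct Packed {
    unsigned a : 4;
    unsigned b : 4;
};

// The unnamed zero-width bit-field closes the current allocation unit, so b
// starts in a new one and a and b become distinct memory locations.
struct Split {
    unsigned a : 4;
    unsigned   : 0;
    unsigned b : 4;
};

// Hypothetical helper: extra bytes the separator costs on this ABI.
inline std::size_t separatorCost()
{
    assert(sizeof(Split) >= sizeof(Packed));
    return sizeof(Split) - sizeof(Packed);
}
```

On common ABIs separatorCost() is non-zero, which is why a separator placed next to the array forces the array into a block of its own.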

Saves ~4.5KiB in text size on optimized GCC 10 LTO Linux AMD64 builds.

Change-Id: Ib52cd7706342d5227b50b57545d073829c45da9a
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2019-09-04 16:35:37 +00:00
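
The layout change described in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the code from the commit: the names Properties, CaseEntry, Case, applyCase and sampleLatinCapitalA are assumptions made for the example, special-case mappings are not modelled, and the size check reflects common ABIs only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum replacing the four case folding traits classes:
// each value is simply an index into the array below.
enum Case { LowerCase, UpperCase, TitleCase, CaseFold, NumCases };

// Hypothetical, cut-down properties record.
struct Properties {
    struct CaseEntry {
        unsigned short special : 1;   // mapping needs a special-case table
        signed short   diff    : 15;  // signed offset to the mapped code point
    };
    CaseEntry cases[NumCases];        // 4 x 16 bits, one 64-bit block
};

static_assert(sizeof(Properties) == 8, "expected one 64-bit block on this ABI");

// One function serves all four conversions; the enum selects the slot.
inline char32_t applyCase(const Properties &p, Case which, char32_t ch)
{
    assert(which < NumCases);
    const Properties::CaseEntry &entry = p.cases[which];
    if (entry.special)
        return ch;  // special mappings are out of scope for this sketch
    return static_cast<char32_t>(static_cast<std::int32_t>(ch) + entry.diff);
}

// Sample record for U+0041 'A': lower-casing and case folding add 32.
inline Properties sampleLatinCapitalA()
{
    Properties p = {};
    p.cases[LowerCase].diff = 32;
    p.cases[CaseFold].diff = 32;
    return p;
}
```

With this shape, a call such as applyCase(sampleLatinCapitalA(), LowerCase, U'A') reads the LowerCase slot, where previously a dedicated traits class would have named the matching pair of bit fields.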
codecs/big5 Remove usages of deprecated APIs from QtAlgorithms 2019-06-29 21:58:36 +02:00
data Update Text segmentation and line break data to Unicode 10.0 2018-01-03 07:47:26 +00:00
x11
.gitattributes
main.cpp QUnicodeTables: use array for case folding tables 2019-09-04 16:35:37 +00:00
README Move text-related code out of corelib/tools/ to corelib/text/ 2019-07-10 17:05:30 +02:00
unicode.pro
writingSystems.sh Updated license headers 2016-01-21 18:55:18 +00:00

The unicode tool in this directory generates the Unicode data tables
in src/corelib/text.

To update:
* Find the data (UAX #44, UCD; not the XML version) at
  ftp://www.unicode.org/Public/zipped/$Version/
* Unpack the zip file and replace each file in data/ with its new
  version; the *BreakProperty.txt files are in auxiliary/. (These last
  are only in the zip, not in the web-space's unpacked versions.)
* If needed, add an entry to enum QChar::UnicodeVersion for the new
  Unicode version
* In that case, also update main.cpp's initAgeMap and DATA_VERSION_S*
  to match
* Build this project. Its binary, unicode, ignores command-line
  options and assumes it is being run from this directory. When run,
  it produces lots of output. Hopefully that doesn't matter.
* Assertions may trigger: if so, study the code and work out what is
  more complicated about this update; talk to the people named in the
  git logs, or push a WIP to gerrit to solicit advice. Some bit-field
  may need to be expanded, for example. In some cases QChar may need
  additions to some of its enums.
* Build with the modified code, fix any compilation issues.
* Running the tool may have updated
  qtbase/src/corelib/text/qunicodetables.cpp; if so, the update
  matters: commit the changes to data/ at the same time and update
  tools/qt_attribution.json to match. There, use the UCD Revision
  number, rather than the Unicode standard number, as the Version,
  even though qunicodetables.cpp uses the latter.
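
As a rough sketch of the two version-related steps above (adding an enum entry, then updating initAgeMap and DATA_VERSION_S* to match), the coupling looks something like this. Everything here is a hypothetical stand-in: the enum is a cut-down imitation of QChar::UnicodeVersion, and ageMap, versionForAge and the macro values are illustrative rather than copied from main.cpp.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, cut-down stand-in for QChar::UnicodeVersion.
enum UnicodeVersion {
    Unicode_Unassigned,
    Unicode_9_0,
    Unicode_10_0   // new entry added for the new Unicode version
};

struct AgeEntry {
    const char *age;          // version string as it appears in the data files
    UnicodeVersion version;   // matching enum value
};

// Illustrative analogue of the table that initAgeMap fills in:
// every enum entry needs a matching row here.
static const AgeEntry ageMap[] = {
    { "9.0",  Unicode_9_0 },
    { "10.0", Unicode_10_0 },
};

// Illustrative analogues of the DATA_VERSION_S* macros; values are examples.
#define DATA_VERSION_S   "10.0"
#define DATA_VERSION_STR "QChar::Unicode_10_0"

// Look up the enum value for a version string; unknown strings map to
// Unicode_Unassigned, which is the kind of mismatch an assertion would flag.
inline UnicodeVersion versionForAge(const char *age)
{
    assert(age != nullptr);
    for (const AgeEntry &entry : ageMap) {
        if (std::strcmp(entry.age, age) == 0)
            return entry.version;
    }
    return Unicode_Unassigned;
}
```

The point of the sketch is that the three places must agree: if the data version string has no row in the map, the lookup falls through to the unassigned value.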

The script writingSystems.sh generates a list of writing systems,
ostensibly as the basis for updating the QFontDatabase::WritingSystem
enum; however, its Release 20 output contains many more writing
systems than are present in that enum, suggesting it has not been run
in a very long time. Further research needed.