Commit Graph

12 Commits

Author SHA1 Message Date
Edward Welbourne
c8e00e4f73 Use char16_t in favor of ushort for locale data tables
Change-Id: I890dd2b52c1b786db1081744c8ca343baba93de4
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-02-17 14:55:42 +01:00
Edward Welbourne
ed2b110b6a Allow surrogate pairs for various "single character" locale data
Extract the character in its proper unicode form and encode it in a
new single_character_data table of locale data. Record each entry as
the range within that table that encodes it. Also added an assertion
in the generator script to check that the digits CLDR gives us are a
contiguous sequence in increasing order, as has been assumed by the
C++ code for some time. Lots of number-formatting code now has to take
account of how wide the digits are.

This leaves nowhere for updateSystemPrivate() to record values read
from sys_locale->query(), so we must always consult that function when
accessing these members of the systemData() object. Various internal
users of these single-character fields need the system-or-CLDR value
rather than the raw CLDR value, so move QLocalePrivate's methods to
supply them down to QLocaleData and ensure they check for system
values, where appropriate first.

This allows us to finally support the Chakma language and script, for
whose number system UTF-16 needs surrogate pairs.

Costs 10.8 kB in added data, much of it due to adding two new locales
that need surrogates to represent digits.

[ChangeLog][QtCore][QLocale] Various QLocale methods that returned
single QChar values now return QString values to accommodate those
locales which need a surrogate pair to represent the (single
character) return value.

Fixes: QTBUG-69324
Fixes: QTBUG-81053
Change-Id: I481722d6f5ee266164f09031679a851dfa6e7839
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-02-17 14:55:24 +01:00
Edward Welbourne
c08a31634f Separate offsets from sizes in QLocale's data
This enables us to make the sizes quint8 and benefit from the
resulting packing, making the locale data smaller. The sizes for long
month-name lists (which concatenate twelve names with semicolon as
separator) can overflow an 8-bit member, so use quint16 where needed.

Re-ordered the data in QLocaleData and QCalendarLocale. Now all
long-short(-narrow) families arise in that order; and any standalone
is grouped with the one of the same length. (This cost 20 bytes in the
date-format table, which optimises out more duplication if short is
before long, but the saving in the (smaller) time-format table more
than make up for it; and 20 bytes isn't worth the confusion that being
inconsistent in ordering might cause.)

At the same time, drop trailing semicolons from list entries (which
join various names with semicolon) as they're not needed: we know
where the end of the list is, because we know the size of the string
that results from concatenation. The code that parses such lists can
even correctly handle empty entries at the end.

Saves 26 kB of data in the compiled binaries.

Task-number: QTBUG-81053
Change-Id: If6ccc96a6910828817aa605d10fd814f567ae1e8
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-01-30 17:58:53 +01:00
Edward Welbourne
4e84a8b29f Deduplicate locale data tables
Some entries in tables were sub-strings (e.g. prefixes) of others.
Since we store start-index and length (with no need for terminators),
any entry that appears as a sub-string of an earlier entry can be
recorded without making a separate copy of its content, just by
recording where it appeared as a sub-string of an earlier entry.

(Sadly this doesn't apply to month- or day-names and their
short-forms: for those, we store ';'-joined lists.  Thus, although
each short-form is a prefix of its long-form, the short-form is stored
in a list with other short-forms; and this is not a prefix of the list
of matching long-forms.)

The savings are modest (780 bytes at present), but cost us nothing
except when running the python script that generates the data files
(it takes a little longer now), which usually only happens at a CLDR
update.

Change-Id: I05bdaa9283365707bac0190ae983b31f074dd6ed
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-01-30 17:58:15 +01:00
Edward Welbourne
47d94dab0f Minor tidy-up in qlocalexml2cpp.py
Split a long line.
Use pythonic chained comparison to save some repetition.
Comment on a field not currently in actual use.
Say "zeros" rather than "0s" in one comment to match another.
Added a .h suffix to the main locale data tempfile to match the naming
of the tempfiles used for calendar data.

Simplify generation of the blank line between Language and Script; and
include a matching blank between Script and Country.
This adds one blank line to qlocale.h

Removed a stray space that misaligned locale data lines.
This produces a space-only change in the generated *_data_p.h files.

Change-Id: I974a9e8923c3dfd2178855d2cf1d6a5074e130b3
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-01-30 17:56:35 +01:00
Edward Welbourne
6852ba815d Correct some references to corelib/tools/ to say corelib/text/
The Unicode data tables moved with QString and friends.
So did the locale data generated from CLDR.

This amends commit a9aa206b7b.

Change-Id: If12f0420b559dcb78993adc00e9f39751bca684a
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-10-25 11:44:27 +02:00
Soroush Rabiei
7026645712 Add support for the Islamic Civil calendar
This has its own locale data, extracted from CLDR. This data may
potentially be shared with other variants on the Islamic calendar, so
is handled by a separate base-class, QHijriCalendar, on which such
variants may base their implementations.

[ChangeLog][QtCore][QCalendar] Added support for the Islamic Civil
calendar, controlled by feature islamiccivilcalendar, with locale data
that can be shared with other implementations, controlled by feature
hijricalendar.

Fixes: QTBUG-56675
Change-Id: Idf32d3da7034baa8ec5e66ef847e59a8a2f31cbd
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-08-22 10:10:02 +00:00
Soroush Rabiei
e71bf9d5c7 Add support for the Jalali (Solar Hijri or Persian) calendar
This has its own locale data, extracted from CLDR.

[ChangeLog][QtCore][QCalendar] Added support for the Jalali (Persian
or Solar Hijri) calendar, controlled by feature jalalicalendar.

Fixes: QTBUG-58404
Change-Id: Id5c56a10db05a4fd612aafc01615273db81ec743
Reviewed-by: Paul Wicking <paul.wicking@qt.io>
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-08-21 22:18:48 +02:00
Soroush Rabiei
aa8393c94f Add support for calendars beside Gregorian
Add QCalendarBackend as a base class for calendar implementations and
QCalendar as a facade via which to access it.

QDate's implicit implementation of the Gregorian calendar becomes
QGregorianCalendar and QDate methods now support choice of calendar.

Convert QLocale's CLDR data for month names to a locale-data component
of each supported calendar and relevant QLocale methods now support
choice of calendar. Adapt Python scripts for locale data generation to
extract month name data from CLDR (keeping on version v35.1) into the
new calendar-locale files. The locale data for the Gregorian calendar
is held in a Roman calendar base, for sharing with other calendars.

Add tests for basic uses of the new API.

[ChangeLog][QtCore][QCalendar] Added QCalendar to support diverse
calendars, supported by implementing QCalendarBackend.

[ChangeLog][QtCore][QDate] Allow choice of calendar in various
operations, with Gregorian remaining the default.

Done-with: Lars Knoll <lars.knoll@qt.io>
Done-with: Edward Welbourne <edward.welbourne@qt.io>
Fixes: QTBUG-17110
Fixes: QTBUG-950
Change-Id: I9d6278f394269a183aee8156e990cec4d5198ab8
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-08-20 13:41:21 +02:00
Soroush Rabiei
c595878aa3 Extract a large format string as a module constant value
The template for the "This is a generated file" notice made a clumsy
intrusion in the code in which it appeared, so split it out as a
constant of the module and access it by name where it's used.

Change-Id: Ic4dfb8e873078c54410b191654d6c21d082c9016
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2019-08-08 18:04:05 +02:00
Edward Welbourne
a9aa206b7b Move text-related code out of corelib/tools/ to corelib/text/
This includes byte array, string, char, unicode, locale, collation and
regular expressions.

Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-07-10 17:05:30 +02:00
Edward Welbourne
248b6756da Rename util/locale_database/ to include the e that was missing
It was misnamed local_database, quite missing the point of its name.

Change-Id: I73a4fdf24f53daac12304de1f443636d89afacb2
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
2019-05-20 20:42:10 +02:00