Commit Graph

24 Commits

Author SHA1 Message Date
Ievgenii Meshcheriakov
2d0c2c9f1c locale_database: Use pathlib to manipulate paths in Python code
pathlib's API is more modern and easier to use than os.path. It
also makes it possible to distinguish between paths and other strings
in type annotations.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ie6d9b4e35596f7f6befa4c9635f4a65ea3b20025
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-19 22:05:54 +02:00
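
As a rough illustration of the pathlib change described above, the sketch below builds the same file path in both styles. The function names and the file layout used here are hypothetical, chosen only for the example; they are not taken from the scripts themselves.

```python
import os.path
from pathlib import Path


def locale_file_old(root: str, name: str) -> str:
    # os.path style: the path is a plain str, indistinguishable from other text.
    return os.path.join(root, 'common', 'main', name + '.xml')


def locale_file_new(root: Path, name: str) -> Path:
    # pathlib style: the annotation marks this as a path; '/' joins components.
    return root / 'common' / 'main' / f'{name}.xml'
```

Both functions name the same file; only the second tells a type checker (and a reader) that its argument and result are paths.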
Ievgenii Meshcheriakov
5ef5dce53b locale_database: Use argparse module to parse command line arguments
argparse is the standard way to parse command line arguments in Python.
It provides help and usage information for free and is easier to extend
than a custom argument parser.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I1e4c9cd914449e083d01932bc871ef10d26f0bc2
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-16 19:04:20 +02:00
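
A minimal sketch of what an argparse-based command line might look like, assuming one required positional argument and one optional output file. The argument names (cldr_path, out_file) and the description text are assumptions made for this example, not the scripts' actual options.

```python
import argparse


def parse_args(argv):
    # Hypothetical argument names; argparse supplies --help and usage text for free.
    parser = argparse.ArgumentParser(
        description='Sketch of a command line for a CLDR conversion script.')
    parser.add_argument('cldr_path',
                        help='root of the unpacked CLDR data tree')
    parser.add_argument('out_file', nargs='?', default=None,
                        help='file to write (default: standard output)')
    return parser.parse_args(argv)
```

Calling parse_args(['--help']) prints the generated usage message and exits, which is the "help and usage information for free" the message refers to.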
Ievgenii Meshcheriakov
41458fafa0 locale_database: Use f-strings in Python code
Replace most uses of str.format() and string arithmetic by f-strings.
This results in more compact code and the code is easier to read
when using an appropriate editor.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I3409f745b5d0324985cbd5690f5eda8d09b869ca
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-16 19:04:20 +02:00
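
For comparison, the three styles mentioned above produce the same string; the variable names and message text are invented for this example.

```python
language, territory = 'en', 'US'

# String arithmetic: quotes and '+' signs obscure the shape of the result.
old_concat = 'Locale ' + language + '_' + territory + ' loaded'

# str.format(): placeholders are separated from the values that fill them.
old_format = 'Locale {}_{} loaded'.format(language, territory)

# f-string: each value appears exactly where it lands in the output.
new_fstring = f'Locale {language}_{territory} loaded'
```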
Ievgenii Meshcheriakov
b02d17c5c0 Convert CLDR scripts to Python 3
The conversion is mostly done using the 2to3 script, with manual
cleanup afterwards.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I4d33b04e7269c55a83ff2deb876a23a78a89f39d
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-15 17:06:53 +02:00
Edward Welbourne
1a49d7d1e0 Report unused enum members after CLDR data scan
We should at least know when members of QLocale's enums aren't adding
any value, and it may make sense to deprecate the unused ones.

Change-Id: Icf202f81d2a35904c13ccdc202d41985bcb3f2e6
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2021-06-07 17:14:14 +02:00
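
One way such a report could be produced is to record which enum members the scan actually touched and subtract them from the full table. This is only a sketch under that assumption; the function name, the table layout and the sample entries are hypothetical.

```python
def unused_members(enum_table, used_ids):
    # enum_table maps a numeric id to a member name (hypothetical layout);
    # used_ids is the set of ids encountered while scanning the data.
    return sorted(name for ident, name in enum_table.items()
                  if ident not in used_ids)


# Made-up sample data, not real QLocale enum members.
sample_table = {1: 'Alpha', 2: 'Beta', 3: 'Gamma'}
report = unused_members(sample_table, {1, 3})
```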
Edward Welbourne
e51831260a Nomenclature change: s/countr/territor/g in locale scripts
Change the nomenclature used in the scripts and the QLocaleXML data
format to use "territory" and "territories" in place of "country" and
"countries". Does not change the generated source files.

Change-Id: I4b208d8d01ad2bfc70d289fa6551f7e0355df5ef
Reviewed-by: JiDe Zhang <zhangjide@uniontech.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2021-05-26 18:00:01 +02:00
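
The substitution in the subject line works on the shared stem, so one pattern covers both the singular and the plural. A small Python equivalent of that sed-style expression, applied to an invented sample string:

```python
import re

sample = 'country codes for all countries'
# Replacing the stem "countr" turns "country" into "territory"
# and "countries" into "territories" in a single pass.
renamed = re.sub('countr', 'territor', sample)
```

Note that the pattern is case-sensitive, so a capitalised "Country" would need its own substitution.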
Edward Welbourne
181424d9b5 QLocaleXmlWriter.enumData(): move enumdata import to method from caller
The only reason cldr.py imported enumdata was so as to pass what it
imported to writer.enumData(); that method might as well do the import
itself.

Change-Id: Ie77dcd29058f926b8cca4deef35837f30505859f
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2021-05-26 18:00:01 +02:00
Edward Welbourne
05e67fbcab Update to CLDR v38.1, adding Yukon Standard Time
No change to QLocale's data, one addition to the Windows time-zone
data. What was formerly "US Mountain Standard Time / Canada" is now
Yukon Standard Time.

Fixes: QTBUG-89784
Pick-to: 6.0 5.15
Change-Id: I4c9a23620e74ea379be8a4c5ba0896d35fe9b594
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
2021-01-27 15:00:57 +01:00
Dimitrios Apostolou
1e546595e9 Remove unused imports
As found by LGTM.com.

Change-Id: I1704f10f9bab1b11ab22824aca0cfcdcb47fef2f
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2020-07-10 02:36:54 +02:00
Edward Welbourne
727afdf344 Fix parameter order in cldr2qlocalexml.py's usage()
Callers and definition were out of sync.

Change-Id: Icda26887cb64c61c7e373766f25559b0d450d112
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-06 14:29:32 +02:00
Edward Welbourne
cabd8f860b Ensure we use UTF-8 for the emitted QLocaleXML data file
Python helpfully uses a sensible locale when stdout is a tty but uses
the system (not the filesystem) default encoding, which may be ascii
and unable to encode some of the data we need to save. So brute force
kludge it to ensure emit.encoding is UTF-8 when writing the output
we'll read as UTF-8 anyway.

(This matches dev's commit 0ef79d94f6 for the reworked version of the
script.)

Task-number: QTBUG-79902
Change-Id: I60ddc896a308c06e01fa87e8e18e112faa17d601
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:44:06 +01:00
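
The commit above describes a workaround for the Python version in use at the time. In Python 3 the same goal, output that is UTF-8 regardless of the system default, is usually reached by naming the encoding when opening the file. A minimal sketch, with a hypothetical helper name and sample content:

```python
import tempfile
from pathlib import Path


def write_utf8(path, text):
    # Name the encoding explicitly instead of relying on the platform default;
    # newline='\n' keeps line endings identical on every platform.
    with open(path, 'w', encoding='utf-8', newline='\n') as out:
        out.write(text)


# Demonstration with made-up content containing a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'sample.xml'
    write_utf8(target, '<name>Français</name>\n')
    raw = target.read_bytes()
```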
Edward Welbourne
be3dfd7a71 Rework cldr2qlocalexml.py's reading of CLDR data
Move the code out to a CldrReader class in cldr.py, expand CldrAccess
with the facilities that class needs, expand ldml.py to include support
for more features, finally making xpathlite.py redundant. This initial commit
aims, though, to be bug-for-bug compatible with xpathlite in its
reading of the CLDR data.

It turns out we've been using draftier data than we were aware of
(which might not be a bad thing). The xpathlite code appeared to check
for draft attributes, but these only appear on leaf nodes and most
data were fetched by finding a parent and then scanning its children
without the draft check; only am/pm data was actually being excluded
based on draft values.  (We allowed contributed, for am/pm, in
addition to approved, which is all the xpathlite code allows
otherwise.) There are also some less equivocal bugs; I'll deal with
these in later commits.

Simplified number-system data look-ups; the old get_number_in_system()
was taking care of old LDML versions' placement of the number system
attribute; this is no longer needed. (It was also being used for a
currency value to which it was not appropriate, which is now handled
separately; this is one of the bugs mentioned above.) Ditched a
fall-back to nativeZeroDigit, which no longer exists in CLDR.

Change the command-line to take the root of the CLDR data tree, rather
than its common/main/ sub-directory. Support naming the file to which
to write output, as a second command-line argument, instead of always
writing to stdout (which remains the default) and leaving whoever runs
the script to redirect stdout.

Support (internally for now, while adding TODOs to give main() more
command-line options) separating the stderr output into its more and
less interesting parts; for now, continue producing both, but suppress
the least interesting entirely.

Task-number: QTBUG-81344
Change-Id: Ie611b47403a9452b51feaeeaaa0fbc8f7e84dc71
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:43:18 +01:00
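
The draft-attribute issue described above can be illustrated with a small sketch: if the attribute sits on leaf elements, a scan of a parent's children has to check each child, or draft data slips through. The XML fragment, its values and the function names are invented for the example; the sketch assumes that a missing draft attribute means "approved".

```python
import xml.etree.ElementTree as ET

# Made-up fragment: the draft attribute appears on the leaf elements only.
SAMPLE = """<dayPeriods>
  <dayPeriod type="am">AM</dayPeriod>
  <dayPeriod type="pm" draft="unconfirmed">PM?</dayPeriod>
  <dayPeriod type="noon" draft="contributed">noon</dayPeriod>
</dayPeriods>"""

ACCEPTED = {'approved', 'contributed'}


def children_unchecked(parent):
    # Scans the children without looking at draft status: everything gets in.
    return {child.get('type'): child.text for child in parent}


def children_checked(parent, accepted=ACCEPTED):
    # Checks each leaf; an absent attribute is treated as 'approved' (assumption).
    return {child.get('type'): child.text for child in parent
            if child.get('draft', 'approved') in accepted}


root = ET.fromstring(SAMPLE)
```

Here children_unchecked(root) includes the unconfirmed "pm" entry, while children_checked(root) drops it.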
Edward Welbourne
c3dea1ffca Move some shared code to a localetools module
The time-zone script was importing two functions from the locale data
generation script. Move them to a separate module, to which I'll
shortly add some more shared utilities. Cleaned up some imports in the
process.

Combined qlocalexml2cpp's and xpathlite's error classes into a new
Error class in the new module and made it a bit more like a proper
python error class.

Task-number: QTBUG-81344
Change-Id: Idbe0139ba9aaa2f823b8f7216dee1d2539c18b75
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:42:40 +01:00
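
A shared error class of the kind described above might look roughly like this. It is a sketch of one reasonable shape for a "proper" Python error class, not a copy of the module's code.

```python
class Error(Exception):
    # Derives from Exception so callers can catch it in the usual way,
    # and keeps the message available both as an attribute and via str().
    def __init__(self, msg, *args):
        super().__init__(msg, *args)
        self.message = msg

    def __str__(self):
        return self.message
```

Scripts sharing the module can then use a single `except Error as what:` clause instead of catching two unrelated classes.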
Edward Welbourne
a20697a394 Rework cldr2qlocalexml.py in terms of a QLocaleXmlWriter class
Delegate the output of XML to a helper class provided by qlocalexml.py
and restructure the driver script so that it can be imported without
running anything. It now has a minimal __name__ == '__main__' block
that calls a main() function. This, for the moment, requires a global
via which it shares the CLDR directory with various other functions;
that shall go away in a later commit.

Task-number: QTBUG-81344
Change-Id: Ica2d3ec09f2d38ba42fd930258cc765283f29a71
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:42:28 +01:00
Edward Welbourne
84382bde5c Rename the localexml module to qlocalexml
It implements interaction with the QLocaleXML file format type, so
rename it to match.

Task-number: QTBUG-81344
Change-Id: I46302d4ac1038cdfc5929e73b554b6d793814c56
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-03-03 07:38:06 +01:00
Edward Welbourne
54413653d5 Rename the endonym members of the Locale type
All other members had camelCase names, but the endonyms had
prefix_endonym names, requiring munging where they were emitted to
XML. So just do that munging upstream in the attribute name of the
Locale objects. Makes no change to the data output by the scripts, not
even to the intermediate QLocaleXML file.

Task-number: QTBUG-81344
Change-Id: I01c15a822216281dc669b3e7ebda096d18b04f9b
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-03-03 07:38:06 +01:00
Edward Welbourne
71fa90a37c Enable system locale to skip digit-grouping if configured to do so
On macOS it's possible to configure the system locale to not do digit
grouping (separating "thousands", in most western locales); it then
returns an empty string when asked for the grouping character, which
QLocale's system-configuration then ignored, falling back on using the
base UI locale's grouping separator. This could lead to the same
separator being used for decimal and grouping, which should never
happen, least of all when configured to not group at all.

In order to notice when this happens, query() must take care to return
an empty QString (as a QVariant, which is then non-null) when it *has*
a value for the locale property, and that value is empty, as opposed
to a null QVariant when it doesn't find a configured value. The caller
can then distinguish the two cases.

Furthermore, the group and decimal separators need to be distinct, so
we need to take care to avoid cases where the system overrides one
with what the CLDR has given for the other and doesn't override that
other.

Only presently implemented for macOS and MS-Win, since the (other)
Unix implementation of the system locale returns single QChar values
for the numeric tokens - see QTBUG-69324, QTBUG-81053.

Fixes: QTBUG-80459
Change-Id: Ic3fbb0fb86e974604a60781378b09abc13bab15d
Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
2020-02-03 15:34:02 +01:00
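
The distinction between "configured as empty" and "not configured" can be shown with a Python analogy, where None stands in for a null QVariant and '' for an empty string. This is not Qt code; the function name and parameters are hypothetical.

```python
from typing import Optional


def group_separator(system_value: Optional[str], cldr_value: str) -> str:
    # None: the system has no configured value, so fall back to the CLDR one.
    if system_value is None:
        return cldr_value
    # Any string, including the empty one, is a deliberate setting;
    # '' means "do not group digits at all".
    return system_value
```

Testing with `if not system_value:` instead would conflate the two cases, which is the behaviour the commit sets out to fix.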
Soroush Rabiei
7026645712 Add support for the Islamic Civil calendar
This has its own locale data, extracted from CLDR. This data may
potentially be shared with other variants on the Islamic calendar, so
is handled by a separate base-class, QHijriCalendar, on which such
variants may base their implementations.

[ChangeLog][QtCore][QCalendar] Added support for the Islamic Civil
calendar, controlled by feature islamiccivilcalendar, with locale data
that can be shared with other implementations, controlled by feature
hijricalendar.

Fixes: QTBUG-56675
Change-Id: Idf32d3da7034baa8ec5e66ef847e59a8a2f31cbd
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-08-22 10:10:02 +00:00
Soroush Rabiei
e71bf9d5c7 Add support for the Jalali (Solar Hijri or Persian) calendar
This has its own locale data, extracted from CLDR.

[ChangeLog][QtCore][QCalendar] Added support for the Jalali (Persian
or Solar Hijri) calendar, controlled by feature jalalicalendar.

Fixes: QTBUG-58404
Change-Id: Id5c56a10db05a4fd612aafc01615273db81ec743
Reviewed-by: Paul Wicking <paul.wicking@qt.io>
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-08-21 22:18:48 +02:00
Soroush Rabiei
aa8393c94f Add support for calendars beside Gregorian
Add QCalendarBackend as a base class for calendar implementations and
QCalendar as a facade via which to access it.

QDate's implicit implementation of the Gregorian calendar becomes
QGregorianCalendar and QDate methods now support choice of calendar.

Convert QLocale's CLDR data for month names to a locale-data component
of each supported calendar and relevant QLocale methods now support
choice of calendar. Adapt Python scripts for locale data generation to
extract month name data from CLDR (keeping on version v35.1) into the
new calendar-locale files. The locale data for the Gregorian calendar
is held in a Roman calendar base, for sharing with other calendars.

Add tests for basic uses of the new API.

[ChangeLog][QtCore][QCalendar] Added QCalendar to support diverse
calendars, supported by implementing QCalendarBackend.

[ChangeLog][QtCore][QDate] Allow choice of calendar in various
operations, with Gregorian remaining the default.

Done-with: Lars Knoll <lars.knoll@qt.io>
Done-with: Edward Welbourne <edward.welbourne@qt.io>
Fixes: QTBUG-17110
Fixes: QTBUG-950
Change-Id: I9d6278f394269a183aee8156e990cec4d5198ab8
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-08-20 13:41:21 +02:00
Edward Welbourne
a9aa206b7b Move text-related code out of corelib/tools/ to corelib/text/
This includes byte array, string, char, unicode, locale, collation and
regular expressions.

Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-07-10 17:05:30 +02:00
Edward Welbourne
13242673cf Tidy up in cldr2qtimezone.py and document the need to run it
It wasn't mentioned in cldr2qlocalexml.py's instructions, so I didn't
know to run it.  The data it used in an illustration was out of date.
Two tests could be combined with no loss.

Change-Id: I26e619e6210ea5b1258326fc4bc2b6aee9d6a999
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2019-07-01 17:48:32 +02:00
Edward Welbourne
b7d8169f02 Suggest name, when available, for unknown codes
When parsing the CLDR data, we only handle language, script and
territory (which we call country) codes if they are known to our
enumdata.py tables.  When reporting the rest as unknown, in the
content of an actual locale definition (not the likely subtag data),
check whether en.xml can resolve the code for us; if it can, report
the full name it provides, as a hint to whoever's running the script
that an update to enumdata.py may be in order.

Change-Id: I9ca1d6922a91d45bc436f4b622e5557261897d7f
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
2019-05-20 20:42:11 +02:00
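
The hint described above amounts to one extra look-up when a code is not in the known tables. A minimal sketch, assuming the English names are available as a dictionary keyed by kind and code; the function name, message wording and sample code are invented for the example.

```python
def describe_unknown(kind, code, english_names):
    # english_names maps (kind, code) to a display name, standing in for
    # whatever en.xml can resolve (hypothetical structure).
    name = english_names.get((kind, code))
    if name:
        return f'Unknown {kind} code "{code}" (English name: {name})'
    return f'Unknown {kind} code "{code}"'


# Made-up sample entry, not taken from real data.
sample_names = {('language', 'zz'): 'Example Language'}
```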
Edward Welbourne
248b6756da Rename util/locale_database/ to include the e that was missing
It was misnamed local_database, quite missing the point of its name.

Change-Id: I73a4fdf24f53daac12304de1f443636d89afacb2
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
2019-05-20 20:42:10 +02:00