Canonicalize space in lists of IANA time-zones

In the Windows zone-ID code, we tokenize() a text extracted from CLDR
data. However, a leading or trailing space (or a repeated internal
space) would then give an empty "IANA ID" for us to match, causing the
empty ID to be mapped to the Windows ID for the entry with the
superfluous space. This was uncovered by an entry with a trailing
space in CLDR v43's data.

Canonicalize spacing in the IANA ID lists extracted from CLDR so as to
ensure this doesn't happen. (We could pass Qt::SkipEmptyParts to the
tokenize() call, but fixing the issue when generating the data is
cheaper and more robust than fixing it at run-time every time it's
consulted.)

Task-number: QTBUG-111550
Change-Id: Ib3883419558d6574141e9ab0bc929ade2d73e020
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
This commit is contained in:
Edward Welbourne 2023-08-01 14:47:22 +02:00
parent ee593cdde6
commit 69a0cec4d0

View File

@ -448,7 +448,7 @@ enumdata.py (keeping the old name as an alias):
wid, code = attrs['other'], attrs['territory']
data = dict(windowsId = wid,
territoryCode = code,
ianaList = attrs['type'])
ianaList = ' '.join(attrs['type'].split()))
try:
key = lookup[wid]