Canonicalize space in lists of IANA time-zones

In the Windows zone-ID code, we tokenize() a text extracted from CLDR data. However, a leading or trailing space (or a repeated internal space) would then give an empty "IANA ID" for us to match, causing the empty ID to be mapped to the Windows ID for the entry with the superfluous space. This was uncovered by an entry with a trailing space in CLDR v43's data.

Canonicalize spacing in the IANA ID lists extracted from CLDR so as to ensure this doesn't happen. (We could pass Qt::SkipEmptyParts to the tokenize() call, but fixing the issue when generating the data is cheaper and more robust than fixing it at run-time every time it's consulted.)

Task-number: QTBUG-111550
Change-Id: Ib3883419558d6574141e9ab0bc929ade2d73e020
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-01 14:47:22 +02:00
commit 69a0cec4d0
parent ee593cdde6
1 changed file with 1 addition and 1 deletion
--- a/util/locale_database/cldr.py
+++ b/util/locale_database/cldr.py
@@ -448,7 +448,7 @@ enumdata.py (keeping the old name as an alias):
            wid, code = attrs['other'], attrs['territory']
            data = dict(windowsId = wid,
                        territoryCode = code,
-                        ianaList = attrs['type'])
+                        ianaList = ' '.join(attrs['type'].split()))

            try:
                key = lookup[wid]
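
The effect of the one-line change can be illustrated with a minimal, standalone sketch. The helper name `canonicalize_spaces` and the sample string are hypothetical, chosen only for illustration; `str.split(' ')` is used here as a stand-in for a tokenizer that keeps empty parts, not as a reproduction of the Qt-side code.

```python
def canonicalize_spaces(text):
    """Collapse each run of whitespace to one space and trim both ends.

    Same expression as in the patch: split() with no argument discards
    empty fields, and ' '.join() re-assembles with single spaces.
    """
    return ' '.join(text.split())

# Hypothetical entry: doubled internal space plus a trailing space.
raw = 'Europe/Berlin  Europe/Busingen '

# Splitting on a literal space keeps empty parts, so stray spaces
# produce empty "IDs" in the token list.
naive_tokens = raw.split(' ')

# After canonicalization, the same split yields only real IDs.
clean = canonicalize_spaces(raw)
clean_tokens = clean.split(' ')

print(naive_tokens)
print(clean_tokens)
```

With the input above, the naive split contains empty strings, while the canonicalized text splits into exactly the two zone IDs, which is why doing the cleanup once at data-generation time removes the need to skip empty parts on every run-time lookup.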