Commit Graph

1052 Commits

Author SHA1 Message Date
Mike FABIAN
25efda03df Enable transliteration rules with two input characters in scn_IT [BZ #32280]
Should work now because https://sourceware.org/bugzilla/show_bug.cgi?id=31859 has been fixed.
2024-10-16 17:15:39 +02:00
Mike FABIAN
a7b5eb821d Update to Unicode 16.0.0 [BZ #32168]
Unicode 16.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 16.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Changes in CHARMAP and WIDTH:

    Total added characters in newly generated CHARMAP: 5185
    Total removed characters in newly generated WIDTH: 1
    Total added characters in newly generated WIDTH: 170

The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
It changed like this:

UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;

EastAsianWidth.txt 15.1.0: 1171D..1171F   ; N  # Mn     [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
EastAsianWidth.txt 16.0.0: 1171E          ; N  # Mc         AHOM CONSONANT SIGN MEDIAL RA

I.e., it changed from Mn (Mark, Nonspacing) to Mc (Mark, Spacing
Combining). So it should now have width 1 instead of 0. Therefore it
is OK that it was removed from WIDTH: characters not in WIDTH get
width 1 by default.
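
The category change can be checked with a short Python sketch. This is only an illustration, not the generator script: the helper names `general_category` and `simplified_width` are hypothetical, and the rule "Mn and Me get width 0, everything else falls back to the default of 1" is a simplifying assumption that ignores East Asian wide characters and other special cases.

```python
# The two UnicodeData.txt records quoted in this commit message.
OLD = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;"
NEW = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;"


def general_category(record: str) -> str:
    """Return the General_Category (third field) of a UnicodeData.txt record."""
    return record.split(";")[2]


def simplified_width(record: str) -> int:
    """Hypothetical, simplified width rule (an assumption for illustration).

    Nonspacing (Mn) and enclosing (Me) marks get width 0. Everything else is
    treated as absent from WIDTH and therefore gets the default width 1.
    """
    return 0 if general_category(record) in ("Mn", "Me") else 1


old_width = simplified_width(OLD)
new_width = simplified_width(NEW)
print(general_category(OLD), old_width)  # Mn 0
print(general_category(NEW), new_width)  # Mc 1
```

Under this simplified rule, the 15.1.0 record needs an explicit width 0 entry, while the 16.0.0 record needs no entry at all.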

Nothing suspicious when browsing the list of the 170 added characters.

Changes in ctype:

    alpha: Added 4452 characters in new ctype which were not in old ctype
    combining: Added 51 characters in new ctype which were not in old ctype
    combining_level3: Added 43 characters in new ctype which were not in old ctype
    graph: Added 5185 characters in new ctype which were not in old ctype
    lower: Added 25 characters in new ctype which were not in old ctype
    print: Added 5185 characters in new ctype which were not in old ctype
    punct: Missing 33 characters of old ctype in new ctype
    punct: Added 766 characters in new ctype which were not in old ctype
    tolower: Added 27 characters in new ctype which were not in old ctype
    totitle: Added 27 characters in new ctype which were not in old ctype
    toupper: Added 27 characters in new ctype which were not in old ctype
    upper: Added 27 characters in new ctype which were not in old ctype

Nothing suspicious in the additions.

About the 33 characters removed from `punct`:

U+0363 - U+036F are identical in UnicodeData.txt. Difference in DerivedCoreProperties.txt:

DerivedCoreProperties.txt 15.1.0: not there.
DerivedCoreProperties.txt 16.0.0: 0363..036F    ; Alphabetic # Mn  [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X

So that’s the reason why they are added to `alpha` and removed from `punct`.

Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there is a difference in DerivedCoreProperties.txt:

DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4    ; Alphabetic # Mn  [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4    ; Alphabetic # Mn  [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS

So they became `Alphabetic` and were thus added to `alpha` and removed from `punct`.
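
As a cross-check of the count, the following Python sketch parses the three DerivedCoreProperties.txt lines quoted above and computes which code points are newly `Alphabetic`. The helper name `parse_range` is hypothetical, and the sketch only looks at these three lines, not at the full data files.

```python
def parse_range(line: str) -> range:
    """Parse the leading 'XXXX..YYYY' (or single 'XXXX') code point field."""
    field = line.split(";")[0].strip()
    first, _, last = field.partition("..")
    return range(int(first, 16), int(last or first, 16) + 1)


# Lines quoted in the commit message.
OLD_LINES = [
    "1DE7..1DF4    ; Alphabetic # Mn  [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS",
]
NEW_LINES = [
    "0363..036F    ; Alphabetic # Mn  [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X",
    "1DD3..1DF4    ; Alphabetic # Mn  [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS",
]

old_alpha = {cp for line in OLD_LINES for cp in parse_range(line)}
new_alpha = {cp for line in NEW_LINES for cp in parse_range(line)}

# Code points that are Alphabetic in 16.0.0 but were not in 15.1.0.
newly_alpha = sorted(new_alpha - old_alpha)

print(len(newly_alpha))                                   # 33
print(f"U+{newly_alpha[0]:04X}", f"U+{newly_alpha[-1]:04X}")  # U+0363 U+1DE6
```

The 13 code points U+0363..U+036F plus the 20 code points U+1DD3..U+1DE6 add up to the 33 characters reported as missing from `punct`.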

Resolves: BZ #32168

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-09-27 14:43:38 +02:00
Mike FABIAN
3ea79f5085 Define ISO 639-3 "ltg" (Latgalian) and add ltg_LV locale
Resolves: BZ # 31411

References:
https://iso639-3.sil.org/code/ltg
https://en.wikipedia.org/wiki/Latgalian_language
https://github.com/unicode-org/cldr/blob/main/common/main/ltg.xml
2024-06-17 10:53:16 +02:00
Mike FABIAN
10733d6a72 localedata: Lowercase day and abday in cs_CZ
Resolves: BZ # 25119

Also to sync with CLDR
2024-06-11 10:33:54 +02:00
David Paleino
eb37015879 localedata: add new locale scn_IT
Signed-off-by: David Paleino <dapal@debian.org>
2024-06-07 15:45:18 +02:00
Mike FABIAN
28bf4783d9 localedata: cv_RU: update translation
Resolves: BZ # 21271
2024-05-23 14:39:35 +02:00
Mike FABIAN
88dca8d5f8 localedata: fix weekdays in mdf_RU locale
From Кирилл Изместьев <izmestevks@basealt.ru>,
see: https://sourceware.org/bugzilla/show_bug.cgi?id=31530#c6
and the following comments.
2024-05-08 14:27:40 +02:00
Mike FABIAN
79fe4a0fa0 localedata: add mdf_RU locale
Resolves: BZ # 31530
2024-05-08 14:27:40 +02:00
Mike FABIAN
07fd072caf localedata: ssy_ER: Fix syntax error 2024-02-08 08:13:37 +01:00
Dragan Stanojević (Nevidljivi)
559010e471 localedata: hr_HR: change currency to EUR/€
Resolves: BZ # 29845
2024-02-08 08:13:37 +01:00
Mike FABIAN
30a61b1dd9 Change lv_LV collation to agree with the recent change in CLDR
Resolves: https://sourceware.org/bugzilla/show_bug.cgi?id=23774

See this change in CLDR committed on 2024-01-29:
635e2d3d05
2024-02-08 08:13:37 +01:00
Mike FABIAN
5176a830e7 localedata: Use consistent values for grouping and mon_grouping
Resolves: BZ # 31205

Adapt test cases in test-grouping_iterator.c
2024-01-25 11:41:02 +01:00
Mike FABIAN
8393f4f72b localedata: renamed: aa_ER@saaho -> ssy_ER
Resolves: BZ # 19956
2024-01-18 11:44:38 +01:00
Mike FABIAN
8e474d5e40 localedata: add crh_RU, Crimean Tatar language in the Cyrillic script as used in Russia.
Resolves: BZ # 24386
2024-01-18 09:18:57 +01:00
Mike FABIAN
ce787f36e6 localedata: tr_TR, ku_TR: Sync with CLDR: “Turkey” -> “Türkiye”
Resolves: BZ # 31257
2024-01-18 08:30:34 +01:00
Mike FABIAN
70e26de105 localedata: miq_NI: Shorten month names in abmon
Resolves: BZ # 23172
2024-01-18 07:56:24 +01:00
Mike FABIAN
ce77e6919f localedata: add gbm_IN locale
Resolves: BZ # 19479
2024-01-17 17:50:33 +01:00
Mike FABIAN
9d2703c109 localedata: anp_IN: Fix abbreviated month names
Resolves: BZ # 31239

The correct abbreviated month names were apparently given in the comment above `abmon`.
But the value of `abmon` seems to have been copied from the value of `mon`, and this
mistake was hard to see because code point notation <Uxxxx> was used. After converting
to UTF-8, the copy-and-paste mistake became obvious.
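
To illustrate why the UTF-8 form is easier to review, here is a minimal Python sketch that decodes `<Uxxxx>` code point notation into readable text. The function name `decode_codepoint_notation` is hypothetical, the sample strings are generic and not taken from the anp_IN locale, and this is not the tool used for the actual conversion.

```python
import re

# Matches <Uxxxx> with 4 to 8 hexadecimal digits.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode_codepoint_notation(text: str) -> str:
    """Replace every <Uxxxx> token with the character it denotes."""
    return CODEPOINT.sub(lambda m: chr(int(m.group(1), 16)), text)


# Generic samples for illustration only (not real locale data).
print(decode_codepoint_notation("<U0041><U0062><U0063>"))  # Abc
print(decode_codepoint_notation("<U20AC>"))                # €
```

Two values that differ only in a few `<Uxxxx>` tokens look nearly identical in the escaped form, while the decoded strings can be compared at a glance.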
2024-01-15 23:12:48 +01:00
Mike FABIAN
fe6c8bab3a localedata: Remove redundant comments 2024-01-13 00:54:40 +01:00
Mike FABIAN
c0c259c3bd localedata: revert all the remaining locale sources to UTF-8 2024-01-11 15:04:25 +01:00
Mike FABIAN
e71c27b7ec localedata: am_ET ber_DZ en_GB en_PH en_US fil_PH kab_DZ om_ET om_KE ti_ET tl_PH: convert to UTF-8 2024-01-11 13:36:08 +01:00
Mike FABIAN
cb8e8b2e21 localedata: resolve cyclic dependencies
Resolves: BZ # 24006
2024-01-11 13:36:08 +01:00
Mike FABIAN
449aa2698c localedata: kv_RU: convert to UTF-8 2024-01-11 13:36:08 +01:00
Mike FABIAN
dff5023a87 localedata: add new locale kv_RU
Resolves: BZ # 30605
2024-01-11 13:36:08 +01:00
Mike FABIAN
46e713be57 localedata: su_ID: make lang_name agree with CLDR 2024-01-09 17:11:58 +01:00
Mike FABIAN
4cf0bd8431 localedata: add new locale su_ID
Resolves: BZ # 27312
2024-01-09 17:11:58 +01:00
Mike FABIAN
03f2265a37 localedata: add new locale zgh_MA
Resolves: BZ # 12908

https://iso639-3.sil.org/code/zgh
2024-01-09 17:11:58 +01:00
Mike FABIAN
ed97da8c7a localedata: tok: add yY and nN to yesexpr and noexpr
See: https://sourceware.org/bugzilla/show_bug.cgi?id=31221#c2
2024-01-09 12:08:14 +01:00
Mike FABIAN
2ddf2f8db1 localedata: tok: convert to UTF-8 2024-01-09 12:08:14 +01:00
Janet Blackquill
d3a2aecc1c localedata: add data for tok (Toki Pona)
Resolves: BZ # 31221

glibc can recognise its code, but does not have its data.
This patch remedies that.

Signed-off-by: Janet Blackquill <uhhadd@gmail.com>
2024-01-09 12:07:48 +01:00
Mike FABIAN
e171ad7d59 localedata: dz_BT, bo_CN: convert to UTF-8 2024-01-08 17:02:09 +01:00
Valery Ushakov
4c2b356be5 localedata: dz_BT, bo_CN: Fix spelling of "phur bu" in both Tibetan and Dzongkha
Resolves: BZ # 31086
2024-01-08 16:44:28 +01:00
Valery Ushakov
6b8419ba5f localedata: bo_CN: Fix spelling errors in Tibetan data
Resolves: BZ # 31086
2024-01-08 16:39:31 +01:00
Valery Ushakov
c4f648ed4d localedata: bo_CN: Fix incomplete edit in Tibetan yesexpr
Resolves: BZ # 31086
2024-01-08 16:08:07 +01:00
Valery Ushakov
460f26e51b localedata: dz_BT: Fix spelling errors in Dzongkha data
Resolves: BZ # 31086
2024-01-08 16:04:59 +01:00
Mike FABIAN
6f87f46bf4 localedata: convert the remaining *_RU locales to UTF-8 2024-01-08 10:06:42 +01:00
Mike FABIAN
e9f5dc7e4a localedata: ru_RU, ru_UA: convert to UTF-8 2024-01-04 16:32:44 +01:00
Mike FABIAN
d61a2bd782 localedata: es_??: convert to UTF-8 2024-01-04 16:03:08 +01:00
Mike FABIAN
734abeda98 localedata: miq_NI: convert to UTF-8 2024-01-04 16:03:08 +01:00
Mike FABIAN
b31a01909c localedata: fy_DE: make this "Western Frisian" to agree with the language code "fy"
Resolves: BZ # 14522
2024-01-03 20:55:44 +01:00
Mike FABIAN
3c173c1f63 localedata: fy_DE, fy_NL: convert to UTF-8 2024-01-03 20:07:21 +01:00
Mike FABIAN
bec492c1da localedata: ast_ES: convert to UTF-8 2024-01-03 17:44:52 +01:00
Mike FABIAN
521e96c13f localedata: ast_ES: Remove wrong copyright text
Resolves: BZ # 27601
2024-01-03 17:43:55 +01:00
Mike FABIAN
5448a127e4 localedata: de_{AT,BE,CH,IT,LU}: convert to UTF-8 2024-01-03 13:54:34 +01:00
Mike FABIAN
a8f7f742be localedata: lv_LV, it_IT, it_CH: convert to UTF-8 2024-01-03 13:54:34 +01:00
Mike FABIAN
61171bb2b9 localedata: it_IT, lv_LV: currency symbol should follow the amount
Resolves: BZ # 28558
2024-01-03 13:54:34 +01:00
Mike FABIAN
fe316dad7c localedata: ms_MY should not use 12-hour format
Resolves: BZ # 29504
2024-01-03 11:07:27 +01:00
Mike FABIAN
b5b558ab4b localedata: es_ES: convert to UTF-8 2024-01-02 21:30:42 +01:00
Mike FABIAN
e3e98b0327 localedata: es_ES: Add am_pm strings
Resolves: BZ # 24013

Use <U202F> instead of a plain space because CLDR also uses that.
2024-01-02 21:30:42 +01:00
Mike FABIAN
67f371e882 localedata: convert uz_UZ and uz_UZ@cyrillic to UTF-8 2024-01-02 16:36:43 +01:00