glibc/localedata/charmaps
Mike FABIAN a7b5eb821d Update to Unicode 16.0.0 [BZ #32168]
Unicode 16.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 16.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Changes in CHARMAP and WIDTH:

    Total added characters in newly generated CHARMAP: 5185
    Total removed characters in newly generated WIDTH: 1
    Total added characters in newly generated WIDTH: 170

The character removed from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
Its entries changed as follows:

UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;

EastAsianWidth.txt 15.1.0: 1171D..1171F   ; N  # Mn     [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
EastAsianWidth.txt 16.0.0: 1171E          ; N  # Mc         AHOM CONSONANT SIGN MEDIAL RA

I.e. it changed from Mn (Mark, Nonspacing) to Mc (Mark, Spacing
Combining). So it should now have width 1 instead of 0. It is therefore
OK that it was removed from WIDTH: characters not in WIDTH get
width 1 by default.
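
The category change can be checked directly against the two UnicodeData.txt lines quoted above. The following is a minimal sketch, not the generator script itself: the helper names are hypothetical, and the width rule is deliberately reduced to the single case discussed here (nonspacing marks get width 0, everything else falls back to the default of 1).

```python
# Hypothetical helpers for illustration; not taken from the glibc generator scripts.

def general_category(unicode_data_line):
    """Return (code point, general category) from one UnicodeData.txt line.

    UnicodeData.txt is semicolon-separated; field 0 is the code point in hex
    and field 2 is the general category.
    """
    fields = unicode_data_line.split(";")
    return int(fields[0], 16), fields[2]


def simplified_width(category):
    """Simplified rule (assumption): Mn gets width 0, anything else width 1."""
    return 0 if category == "Mn" else 1


old_line = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;"
new_line = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;"

cp_old, cat_old = general_category(old_line)
cp_new, cat_new = general_category(new_line)
width_old = simplified_width(cat_old)
width_new = simplified_width(cat_new)

print(f"U+{cp_old:04X}: {cat_old} (width {width_old}) -> {cat_new} (width {width_new})")
```

With width 1 being the default, the character no longer needs an explicit WIDTH entry.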

Nothing suspicious when browsing the list of the 170 added characters.

Changes in ctype:

    alpha: Added 4452 characters in new ctype which were not in old ctype
    combining: Added 51 characters in new ctype which were not in old ctype
    combining_level3: Added 43 characters in new ctype which were not in old ctype
    graph: Added 5185 characters in new ctype which were not in old ctype
    lower: Added 25 characters in new ctype which were not in old ctype
    print: Added 5185 characters in new ctype which were not in old ctype
    punct: Missing 33 characters of old ctype in new ctype
    punct: Added 766 characters in new ctype which were not in old ctype
    tolower: Added 27 characters in new ctype which were not in old ctype
    totitle: Added 27 characters in new ctype which were not in old ctype
    toupper: Added 27 characters in new ctype which were not in old ctype
    upper: Added 27 characters in new ctype which were not in old ctype

Nothing suspicious in the additions.
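
The "Added" and "Missing" counts above are set differences between the old and the new contents of each character class. A minimal sketch with made-up toy data follows; the function name and the class contents are hypothetical, and this is not the comparison script that produced the numbers.

```python
def compare_class(old, new):
    """Return (missing, added): code points only in old, and only in new."""
    return old - new, new - old


# Toy data for illustration only, not the real ctype contents.
old_punct = {0x0021, 0x0363, 0x0364}
new_punct = {0x0021, 0x1FB00}

missing, added = compare_class(old_punct, new_punct)
print(f"punct: Missing {len(missing)} characters of old ctype in new ctype")
print(f"punct: Added {len(added)} characters in new ctype which were not in old ctype")
```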

About the 33 characters removed from `punct`:

The UnicodeData.txt entries for U+0363 - U+036F are identical in 15.1.0 and 16.0.0. The difference is in DerivedCoreProperties.txt:

DerivedCoreProperties.txt 15.1.0: not there.
DerivedCoreProperties.txt 16.0.0: 0363..036F    ; Alphabetic # Mn  [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X

They became `Alphabetic`, which is why they were added to `alpha` and removed from `punct`.

The same applies to U+1DD3 - U+1DE6: their UnicodeData.txt entries are identical, but DerivedCoreProperties.txt differs:

DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4    ; Alphabetic # Mn  [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4    ; Alphabetic # Mn  [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS

So they became `Alphabetic` and were thus added to `alpha` and removed from `punct`.
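
As a cross-check, the two ranges that became `Alphabetic` can be counted from the DerivedCoreProperties.txt lines quoted above. This is a small sketch under the assumption that only the code point field before the first semicolon matters; `parse_range` is a hypothetical helper, not part of the generator scripts.

```python
def parse_range(line):
    """Parse 'XXXX..YYYY ; Property # comment' into a set of code points."""
    code_field = line.split(";")[0].strip()
    first, _, last = code_field.partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    return set(range(start, end + 1))


old_1dxx = parse_range(
    "1DE7..1DF4    ; Alphabetic # Mn  [14] COMBINING LATIN SMALL LETTER ALPHA"
    "..COMBINING LATIN SMALL LETTER U WITH DIAERESIS")
new_1dxx = parse_range(
    "1DD3..1DF4    ; Alphabetic # Mn  [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE"
    "..COMBINING LATIN SMALL LETTER U WITH DIAERESIS")
new_03xx = parse_range(
    "0363..036F    ; Alphabetic # Mn  [13] COMBINING LATIN SMALL LETTER A"
    "..COMBINING LATIN SMALL LETTER X")

# Code points that are Alphabetic in 16.0.0 but were not in 15.1.0.
newly_alphabetic = (new_1dxx - old_1dxx) | new_03xx
print(len(new_03xx), len(new_1dxx - old_1dxx), len(newly_alphabetic))
```

The two ranges contribute 13 and 20 code points, which together account for the 33 characters missing from `punct`.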

Resolves: BZ #32168

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2024-09-27 14:43:38 +02:00
..
ANSI_X3.4-1968 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
ANSI_X3.110-1983 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
ARMSCII-8
ASMO_449
BIG5 Remove trailing whitespace from localedata. 2013-06-07 14:56:03 +00:00
BIG5-HKSCS Update BIG5-HKSCS charmap to HKSCS-2008 2013-06-11 17:02:59 +02:00
BRF
BS_4730 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
BS_VIEWDATA locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP737
CP770 Add support for CP770, CP771, CP772, CP773, and CP774 2011-05-09 23:15:39 -04:00
CP771 Add support for CP770, CP771, CP772, CP773, and CP774 2011-05-09 23:15:39 -04:00
CP772 Add support for CP770, CP771, CP772, CP773, and CP774 2011-05-09 23:15:39 -04:00
CP773 Add support for CP770, CP771, CP772, CP773, and CP774 2011-05-09 23:15:39 -04:00
CP774 Add support for CP770, CP771, CP772, CP773, and CP774 2011-05-09 23:15:39 -04:00
CP775
CP949
CP1125
CP1250 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP1251 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP1252 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP1253 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP1254 Bug 21399: Fix CP1254 comment for U+00EC 2017-04-19 08:10:35 -04:00
CP1255 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP1256 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CP1257
CP1258
CP10007 localedata: change M$ to Microsoft 2016-08-10 00:49:14 +08:00
CSA_Z243.4-1985-1
CSA_Z243.4-1985-2
CSA_Z243.4-1985-GR
CSN_369103 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
CWI
DEC-MCS
DIN_66003
DS_2089
EBCDIC-AT-DE
EBCDIC-AT-DE-A
EBCDIC-CA-FR
EBCDIC-DK-NO locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
EBCDIC-DK-NO-A
EBCDIC-ES
EBCDIC-ES-A
EBCDIC-ES-S
EBCDIC-FI-SE
EBCDIC-FI-SE-A
EBCDIC-FR
EBCDIC-IS-FRISS
EBCDIC-IT
EBCDIC-PT
EBCDIC-UK
EBCDIC-US
ECMA-CYRILLIC
ES
ES2
EUC-JISX0213
EUC-JP
EUC-JP-MS
EUC-KR
EUC-TW
GB2312
GB18030 add GB18030-2022 charmap and test the entire GB18030 charmap [BZ #30243] 2023-08-29 19:02:30 +02:00
GB_1988-80 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
GBK localedata: GBK: add mapping for 0x80->Euro sign [BZ #20864] 2016-11-26 17:20:22 -05:00
GEORGIAN-ACADEMY
GEORGIAN-PS
GOST_19768-74
GREEK7 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
GREEK7-OLD
GREEK-CCITT locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
HP-GREEK8
HP-ROMAN8
HP-ROMAN9
HP-THAI8
HP-TURKISH8
IBM037
IBM038
IBM256 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM273 localedata: Make IBM273 compatible with ISO-8859-1 [BZ #23290] 2018-06-14 22:34:10 +02:00
IBM274
IBM275
IBM277 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM278 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM280 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM281
IBM284 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM285
IBM290
IBM297 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM420
IBM423
IBM424 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] 2021-05-18 07:21:45 +02:00
IBM437
IBM500
IBM850
IBM851
IBM852
IBM855
IBM856
IBM857
IBM858 Add new codepage charmaps/IBM858 [BZ #21084] 2017-09-14 15:50:57 +02:00
IBM860
IBM861
IBM862
IBM863
IBM864
IBM865
IBM866
IBM866NAV Remove trailing whitespace from localedata. 2013-06-07 14:56:03 +00:00
IBM868
IBM869
IBM870
IBM871
IBM874
IBM875 charmaps: IBM875: fix mapping of iota/upsilon variants [BZ #18453] 2016-05-07 19:55:55 -04:00
IBM880
IBM891
IBM903
IBM904
IBM905
IBM918
IBM922
IBM1004
IBM1026
IBM1047
IBM1124
IBM1129
IBM1132
IBM1133
IBM1160
IBM1161
IBM1162
IBM1163
IBM1164
IEC_P27-1
INIS
INIS-8 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
INIS-CYRILLIC
INVARIANT
ISIRI-3342 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
ISO_646.BASIC
ISO_646.IRV
ISO_2033-1983 locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
ISO_5427
ISO_5427-EXT locale: Remove obsolete repertoire map references 2015-07-21 03:54:02 -04:00
ISO_5428
ISO_6937
ISO_6937-2-25
ISO_6937-2-ADD
ISO_8859-1,GL Remove trailing whitespace from localedata. 2013-06-07 14:56:03 +00:00
ISO_8859-SUPP
ISO_10367-BOX
ISO_10646
ISO_11548-1
ISO-8859-1
ISO-8859-2
ISO-8859-3
ISO-8859-4
ISO-8859-5
ISO-8859-6
ISO-8859-7
ISO-8859-8
ISO-8859-9
ISO-8859-9E
ISO-8859-10
ISO-8859-11
ISO-8859-13
ISO-8859-14
ISO-8859-15
ISO-8859-16
ISO-IR-90
ISO-IR-197
ISO-IR-209
IT
JIS_C6220-1969-JP
JIS_C6220-1969-RO
JIS_C6229-1984-A
JIS_C6229-1984-B
JIS_C6229-1984-B-ADD
JIS_C6229-1984-HAND
JIS_C6229-1984-HAND-ADD
JIS_C6229-1984-KANA
JIS_X0201
JOHAB
JUS_I.B1.002
JUS_I.B1.003-MAC
JUS_I.B1.003-SERB
KOI8-R
KOI8-RU
KOI8-T
KOI8-U
KOI-8
KSC5636
LATIN-GREEK
LATIN-GREEK-1
MAC-CENTRALEUROPE
MAC-CYRILLIC
MAC-IS
MAC-SAMI
MAC-UK
MACINTOSH
MIK
MSZ_7795.3
NATS-DANO
NATS-DANO-ADD
NATS-SEFI
NATS-SEFI-ADD
NC_NC00-10
NEXTSTEP
NF_Z_62-010
NF_Z_62-010_1973
NS_4551-1
NS_4551-2
PT
PT2
PT154
RK1048
SAMI
SAMI-WS2
SEN_850200_B
SEN_850200_C
SHIFT_JIS
SHIFT_JISX0213
T.61-7BIT
T.61-8BIT
T.101-G2
TCVN5712-1
TIS-620
TSCII
UTF-8 Update to Unicode 16.0.0 [BZ #32168] 2024-09-27 14:43:38 +02:00
VIDEOTEX-SUPPL
VISCII
WINDOWS-31J