Update to Unicode 14.0.0 [BZ #28390]

Unicode 14.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 14.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Total added characters in newly generated CHARMAP: 838
Total removed characters in newly generated WIDTH: 1
    (Characters not in WIDTH get width 1 by default, i.e. these have width 1 now.)
    removed: <U1734> 0 : eaw=N category=Mc bidi=L   name=HANUNOO SIGN PAMUDPOD
    That seems intentional, the character had category Mn (Mark, nonspacing) before
    and now has Mc (Mark, spacing combining)
Total changed characters in newly generated WIDTH: 0
Total added characters in newly generated WIDTH: 175
This commit is contained in:
Mike FABIAN 2021-09-27 15:27:36 +02:00
parent eae81d7057
commit b517256015
15 changed files with 5103 additions and 2436 deletions

4
NEWS
View File

@ -9,6 +9,10 @@ Version 2.35
Major new features:
* Unicode 14.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 14.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
* Bump r_version in the debugger interface to 2 and add a new field,
r_next, support multiple namespaces.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -9,7 +9,7 @@ comment_char %
% otherwise be governed by that license.
% Transliterations of encircled characters.
% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2020-06-25 for Unicode 13.0.0.
% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2021-09-27 for Unicode 14.0.0.
LC_CTYPE

View File

@ -9,7 +9,7 @@ comment_char %
% otherwise be governed by that license.
% Transliterations of CJK compatibility characters.
% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2020-06-25 for Unicode 13.0.0.
% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2021-09-27 for Unicode 14.0.0.
LC_CTYPE

View File

@ -10,7 +10,7 @@ comment_char %
% Transliterations that remove all combining characters (accents,
% pronounciation marks, etc.).
% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2020-06-25 for Unicode 13.0.0.
% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2021-09-27 for Unicode 14.0.0.
LC_CTYPE
@ -446,6 +446,40 @@ translit_start
<U06EC> ""
% ARABIC SMALL LOW MEEM
<U06ED> ""
% ARABIC SMALL HIGH WORD AL-JUZ
<U0898> ""
% ARABIC SMALL LOW WORD ISHMAAM
<U0899> ""
% ARABIC SMALL LOW WORD IMAALA
<U089A> ""
% ARABIC SMALL LOW WORD TASHEEL
<U089B> ""
% ARABIC MADDA WAAJIB
<U089C> ""
% ARABIC SUPERSCRIPT ALEF MOKHASSAS
<U089D> ""
% ARABIC DOUBLED MADDA
<U089E> ""
% ARABIC HALF MADDA OVER MADDA
<U089F> ""
% ARABIC SMALL HIGH FARSI YEH
<U08CA> ""
% ARABIC SMALL HIGH YEH BARREE WITH TWO DOTS BELOW
<U08CB> ""
% ARABIC SMALL HIGH WORD SAH
<U08CC> ""
% ARABIC SMALL HIGH ZAH
<U08CD> ""
% ARABIC LARGE ROUND DOT ABOVE
<U08CE> ""
% ARABIC LARGE ROUND DOT BELOW
<U08CF> ""
% ARABIC SUKUN BELOW
<U08D0> ""
% ARABIC LARGE CIRCLE BELOW
<U08D1> ""
% ARABIC LARGE ROUND DOT INSIDE CIRCLE BELOW
<U08D2> ""
% ARABIC SMALL LOW WAW
<U08D3> ""
% ARABIC SMALL HIGH WORD AR-RUB
@ -568,6 +602,34 @@ translit_start
<U1ABF> ""
% COMBINING LATIN SMALL LETTER TURNED W BELOW
<U1AC0> ""
% COMBINING LEFT PARENTHESIS ABOVE LEFT
<U1AC1> ""
% COMBINING RIGHT PARENTHESIS ABOVE RIGHT
<U1AC2> ""
% COMBINING LEFT PARENTHESIS BELOW LEFT
<U1AC3> ""
% COMBINING RIGHT PARENTHESIS BELOW RIGHT
<U1AC4> ""
% COMBINING SQUARE BRACKETS ABOVE
<U1AC5> ""
% COMBINING NUMBER SIGN ABOVE
<U1AC6> ""
% COMBINING INVERTED DOUBLE ARCH ABOVE
<U1AC7> ""
% COMBINING PLUS SIGN ABOVE
<U1AC8> ""
% COMBINING DOUBLE PLUS SIGN ABOVE
<U1AC9> ""
% COMBINING DOUBLE PLUS SIGN BELOW
<U1ACA> ""
% COMBINING TRIPLE ACUTE ACCENT
<U1ACB> ""
% COMBINING LATIN SMALL LETTER INSULAR G
<U1ACC> ""
% COMBINING LATIN SMALL LETTER INSULAR R
<U1ACD> ""
% COMBINING LATIN SMALL LETTER INSULAR T
<U1ACE> ""
% COMBINING DOTTED GRAVE ACCENT
<U1DC0> ""
% COMBINING DOTTED ACUTE ACCENT
@ -684,6 +746,8 @@ translit_start
<U1DF8> ""
% COMBINING WIDE INVERTED BRIDGE BELOW
<U1DF9> ""
% COMBINING DOT BELOW LEFT
<U1DFA> ""
% COMBINING DELETION MARK
<U1DFB> ""
% COMBINING DOUBLE INVERTED BREVE BELOW
@ -840,6 +904,14 @@ translit_start
<U00010F4F> ""
% SOGDIAN COMBINING STROKE BELOW
<U00010F50> ""
% OLD UYGHUR COMBINING DOT ABOVE
<U00010F82> ""
% OLD UYGHUR COMBINING DOT BELOW
<U00010F83> ""
% OLD UYGHUR COMBINING TWO DOTS ABOVE
<U00010F84> ""
% OLD UYGHUR COMBINING TWO DOTS BELOW
<U00010F85> ""
% COMBINING BINDU BELOW
<U0001133B> ""
% NEWA VOWEL SIGN AA
@ -1244,6 +1316,144 @@ translit_start
<U00016FF0> ""
% VIETNAMESE ALTERNATE READING MARK NHAY
<U00016FF1> ""
% ZNAMENNY COMBINING MARK GORAZDO NIZKO S KRYZHEM ON LEFT
<U0001CF00> ""
% ZNAMENNY COMBINING MARK NIZKO S KRYZHEM ON LEFT
<U0001CF01> ""
% ZNAMENNY COMBINING MARK TSATA ON LEFT
<U0001CF02> ""
% ZNAMENNY COMBINING MARK GORAZDO NIZKO ON LEFT
<U0001CF03> ""
% ZNAMENNY COMBINING MARK NIZKO ON LEFT
<U0001CF04> ""
% ZNAMENNY COMBINING MARK SREDNE ON LEFT
<U0001CF05> ""
% ZNAMENNY COMBINING MARK MALO POVYSHE ON LEFT
<U0001CF06> ""
% ZNAMENNY COMBINING MARK POVYSHE ON LEFT
<U0001CF07> ""
% ZNAMENNY COMBINING MARK VYSOKO ON LEFT
<U0001CF08> ""
% ZNAMENNY COMBINING MARK MALO POVYSHE S KHOKHLOM ON LEFT
<U0001CF09> ""
% ZNAMENNY COMBINING MARK POVYSHE S KHOKHLOM ON LEFT
<U0001CF0A> ""
% ZNAMENNY COMBINING MARK VYSOKO S KHOKHLOM ON LEFT
<U0001CF0B> ""
% ZNAMENNY COMBINING MARK GORAZDO NIZKO S KRYZHEM ON RIGHT
<U0001CF0C> ""
% ZNAMENNY COMBINING MARK NIZKO S KRYZHEM ON RIGHT
<U0001CF0D> ""
% ZNAMENNY COMBINING MARK TSATA ON RIGHT
<U0001CF0E> ""
% ZNAMENNY COMBINING MARK GORAZDO NIZKO ON RIGHT
<U0001CF0F> ""
% ZNAMENNY COMBINING MARK NIZKO ON RIGHT
<U0001CF10> ""
% ZNAMENNY COMBINING MARK SREDNE ON RIGHT
<U0001CF11> ""
% ZNAMENNY COMBINING MARK MALO POVYSHE ON RIGHT
<U0001CF12> ""
% ZNAMENNY COMBINING MARK POVYSHE ON RIGHT
<U0001CF13> ""
% ZNAMENNY COMBINING MARK VYSOKO ON RIGHT
<U0001CF14> ""
% ZNAMENNY COMBINING MARK MALO POVYSHE S KHOKHLOM ON RIGHT
<U0001CF15> ""
% ZNAMENNY COMBINING MARK POVYSHE S KHOKHLOM ON RIGHT
<U0001CF16> ""
% ZNAMENNY COMBINING MARK VYSOKO S KHOKHLOM ON RIGHT
<U0001CF17> ""
% ZNAMENNY COMBINING MARK TSATA S KRYZHEM
<U0001CF18> ""
% ZNAMENNY COMBINING MARK MALO POVYSHE S KRYZHEM
<U0001CF19> ""
% ZNAMENNY COMBINING MARK STRANNO MALO POVYSHE
<U0001CF1A> ""
% ZNAMENNY COMBINING MARK POVYSHE S KRYZHEM
<U0001CF1B> ""
% ZNAMENNY COMBINING MARK POVYSHE STRANNO
<U0001CF1C> ""
% ZNAMENNY COMBINING MARK VYSOKO S KRYZHEM
<U0001CF1D> ""
% ZNAMENNY COMBINING MARK MALO POVYSHE STRANNO
<U0001CF1E> ""
% ZNAMENNY COMBINING MARK GORAZDO VYSOKO
<U0001CF1F> ""
% ZNAMENNY COMBINING MARK ZELO
<U0001CF20> ""
% ZNAMENNY COMBINING MARK ON
<U0001CF21> ""
% ZNAMENNY COMBINING MARK RAVNO
<U0001CF22> ""
% ZNAMENNY COMBINING MARK TIKHAYA
<U0001CF23> ""
% ZNAMENNY COMBINING MARK BORZAYA
<U0001CF24> ""
% ZNAMENNY COMBINING MARK UDARKA
<U0001CF25> ""
% ZNAMENNY COMBINING MARK PODVERTKA
<U0001CF26> ""
% ZNAMENNY COMBINING MARK LOMKA
<U0001CF27> ""
% ZNAMENNY COMBINING MARK KUPNAYA
<U0001CF28> ""
% ZNAMENNY COMBINING MARK KACHKA
<U0001CF29> ""
% ZNAMENNY COMBINING MARK ZEVOK
<U0001CF2A> ""
% ZNAMENNY COMBINING MARK SKOBA
<U0001CF2B> ""
% ZNAMENNY COMBINING MARK RAZSEKA
<U0001CF2C> ""
% ZNAMENNY COMBINING MARK KRYZH ON LEFT
<U0001CF2D> ""
% ZNAMENNY COMBINING TONAL RANGE MARK MRACHNO
<U0001CF30> ""
% ZNAMENNY COMBINING TONAL RANGE MARK SVETLO
<U0001CF31> ""
% ZNAMENNY COMBINING TONAL RANGE MARK TRESVETLO
<U0001CF32> ""
% ZNAMENNY COMBINING MARK ZADERZHKA
<U0001CF33> ""
% ZNAMENNY COMBINING MARK DEMESTVENNY ZADERZHKA
<U0001CF34> ""
% ZNAMENNY COMBINING MARK OTSECHKA
<U0001CF35> ""
% ZNAMENNY COMBINING MARK PODCHASHIE
<U0001CF36> ""
% ZNAMENNY COMBINING MARK PODCHASHIE WITH VERTICAL STROKE
<U0001CF37> ""
% ZNAMENNY COMBINING MARK CHASHKA
<U0001CF38> ""
% ZNAMENNY COMBINING MARK CHASHKA POLNAYA
<U0001CF39> ""
% ZNAMENNY COMBINING MARK OBLACHKO
<U0001CF3A> ""
% ZNAMENNY COMBINING MARK SOROCHYA NOZHKA
<U0001CF3B> ""
% ZNAMENNY COMBINING MARK TOCHKA
<U0001CF3C> ""
% ZNAMENNY COMBINING MARK DVOETOCHIE
<U0001CF3D> ""
% ZNAMENNY COMBINING ATTACHING VERTICAL OMET
<U0001CF3E> ""
% ZNAMENNY COMBINING MARK CURVED OMET
<U0001CF3F> ""
% ZNAMENNY COMBINING MARK KRYZH
<U0001CF40> ""
% ZNAMENNY COMBINING LOWER TONAL RANGE INDICATOR
<U0001CF41> ""
% ZNAMENNY PRIZNAK MODIFIER LEVEL-2
<U0001CF42> ""
% ZNAMENNY PRIZNAK MODIFIER LEVEL-3
<U0001CF43> ""
% ZNAMENNY PRIZNAK MODIFIER DIRECTION FLIP
<U0001CF44> ""
% ZNAMENNY PRIZNAK MODIFIER KRYZH
<U0001CF45> ""
% ZNAMENNY PRIZNAK MODIFIER ROG
<U0001CF46> ""
% COMBINING GREEK MUSICAL TRISEME
<U0001D242> ""
% COMBINING GREEK MUSICAL TETRASEME
@ -1340,6 +1550,8 @@ translit_start
<U0001E135> ""
% NYIAKENG PUACHUE HMONG TONE-D
<U0001E136> ""
% TOTO SIGN RISING TONE
<U0001E2AE> ""
% WANCHO TONE TUP
<U0001E2EC> ""
% WANCHO TONE TUPNI

View File

@ -9,7 +9,7 @@ comment_char %
% otherwise be governed by that license.
% Transliterations of compatibility characters and ligatures.
% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2020-06-25 for Unicode 13.0.0.
% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2021-09-27 for Unicode 14.0.0.
LC_CTYPE
@ -1679,6 +1679,12 @@ translit_start
<UA69D> "<U044C>"
% MODIFIER LETTER US
<UA770> "<UA76F>"
% MODIFIER LETTER CAPITAL C
<UA7F2> "<U0043>"
% MODIFIER LETTER CAPITAL F
<UA7F3> "<U0046>"
% MODIFIER LETTER CAPITAL Q
<UA7F4> "<U0051>"
% MODIFIER LETTER CAPITAL H WITH STROKE
<UA7F8> "<U0126>"
% MODIFIER LETTER SMALL LIGATURE OE
@ -1799,6 +1805,118 @@ translit_start
<UFE4E> "<U005F>"
% WAVY LOW LINE
<UFE4F> "<U005F>"
% MODIFIER LETTER SUPERSCRIPT TRIANGULAR COLON
<U00010781> "<U02D0>"
% MODIFIER LETTER SUPERSCRIPT HALF TRIANGULAR COLON
<U00010782> "<U02D1>"
% MODIFIER LETTER SMALL AE
<U00010783> "<U00E6>"
% MODIFIER LETTER SMALL CAPITAL B
<U00010784> "<U0299>"
% MODIFIER LETTER SMALL B WITH HOOK
<U00010785> "<U0253>"
% MODIFIER LETTER SMALL DZ DIGRAPH
<U00010787> "<U02A3>"
% MODIFIER LETTER SMALL DZ DIGRAPH WITH RETROFLEX HOOK
<U00010788> "<UAB66>"
% MODIFIER LETTER SMALL DZ DIGRAPH WITH CURL
<U00010789> "<U02A5>"
% MODIFIER LETTER SMALL DEZH DIGRAPH
<U0001078A> "<U02A4>"
% MODIFIER LETTER SMALL D WITH TAIL
<U0001078B> "<U0256>"
% MODIFIER LETTER SMALL D WITH HOOK
<U0001078C> "<U0257>"
% MODIFIER LETTER SMALL D WITH HOOK AND TAIL
<U0001078D> "<U1D91>"
% MODIFIER LETTER SMALL REVERSED E
<U0001078E> "<U0258>"
% MODIFIER LETTER SMALL CLOSED REVERSED OPEN E
<U0001078F> "<U025E>"
% MODIFIER LETTER SMALL FENG DIGRAPH
<U00010790> "<U02A9>"
% MODIFIER LETTER SMALL RAMS HORN
<U00010791> "<U0264>"
% MODIFIER LETTER SMALL CAPITAL G
<U00010792> "<U0262>"
% MODIFIER LETTER SMALL G WITH HOOK
<U00010793> "<U0260>"
% MODIFIER LETTER SMALL CAPITAL G WITH HOOK
<U00010794> "<U029B>"
% MODIFIER LETTER SMALL H WITH STROKE
<U00010795> "<U0127>"
% MODIFIER LETTER SMALL CAPITAL H
<U00010796> "<U029C>"
% MODIFIER LETTER SMALL HENG WITH HOOK
<U00010797> "<U0267>"
% MODIFIER LETTER SMALL DOTLESS J WITH STROKE AND HOOK
<U00010798> "<U0284>"
% MODIFIER LETTER SMALL LS DIGRAPH
<U00010799> "<U02AA>"
% MODIFIER LETTER SMALL LZ DIGRAPH
<U0001079A> "<U02AB>"
% MODIFIER LETTER SMALL L WITH BELT
<U0001079B> "<U026C>"
% MODIFIER LETTER SMALL CAPITAL L WITH BELT
<U0001079C> "<U0001DF04>"
% MODIFIER LETTER SMALL L WITH RETROFLEX HOOK AND BELT
<U0001079D> "<UA78E>"
% MODIFIER LETTER SMALL LEZH
<U0001079E> "<U026E>"
% MODIFIER LETTER SMALL LEZH WITH RETROFLEX HOOK
<U0001079F> "<U0001DF05>"
% MODIFIER LETTER SMALL TURNED Y
<U000107A0> "<U028E>"
% MODIFIER LETTER SMALL TURNED Y WITH BELT
<U000107A1> "<U0001DF06>"
% MODIFIER LETTER SMALL O WITH STROKE
<U000107A2> "<U00F8>"
% MODIFIER LETTER SMALL CAPITAL OE
<U000107A3> "<U0276>"
% MODIFIER LETTER SMALL CLOSED OMEGA
<U000107A4> "<U0277>"
% MODIFIER LETTER SMALL Q
<U000107A5> "<U0071>"
% MODIFIER LETTER SMALL TURNED R WITH LONG LEG
<U000107A6> "<U027A>"
% MODIFIER LETTER SMALL TURNED R WITH LONG LEG AND RETROFLEX HOOK
<U000107A7> "<U0001DF08>"
% MODIFIER LETTER SMALL R WITH TAIL
<U000107A8> "<U027D>"
% MODIFIER LETTER SMALL R WITH FISHHOOK
<U000107A9> "<U027E>"
% MODIFIER LETTER SMALL CAPITAL R
<U000107AA> "<U0280>"
% MODIFIER LETTER SMALL TC DIGRAPH WITH CURL
<U000107AB> "<U02A8>"
% MODIFIER LETTER SMALL TS DIGRAPH
<U000107AC> "<U02A6>"
% MODIFIER LETTER SMALL TS DIGRAPH WITH RETROFLEX HOOK
<U000107AD> "<UAB67>"
% MODIFIER LETTER SMALL TESH DIGRAPH
<U000107AE> "<U02A7>"
% MODIFIER LETTER SMALL T WITH RETROFLEX HOOK
<U000107AF> "<U0288>"
% MODIFIER LETTER SMALL V WITH RIGHT HOOK
<U000107B0> "<U2C71>"
% MODIFIER LETTER SMALL CAPITAL Y
<U000107B2> "<U028F>"
% MODIFIER LETTER GLOTTAL STOP WITH STROKE
<U000107B3> "<U02A1>"
% MODIFIER LETTER REVERSED GLOTTAL STOP WITH STROKE
<U000107B4> "<U02A2>"
% MODIFIER LETTER BILABIAL CLICK
<U000107B5> "<U0298>"
% MODIFIER LETTER DENTAL CLICK
<U000107B6> "<U01C0>"
% MODIFIER LETTER LATERAL CLICK
<U000107B7> "<U01C1>"
% MODIFIER LETTER ALVEOLAR CLICK
<U000107B8> "<U01C2>"
% MODIFIER LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
<U000107B9> "<U0001DF0A>"
% MODIFIER LETTER SMALL S WITH CURL
<U000107BA> "<U0001DF1E>"
% DIGIT ZERO FULL STOP
<U0001F100> "<U0030><U002E>"
% DIGIT ZERO COMMA

View File

@ -9,7 +9,7 @@ comment_char %
% otherwise be governed by that license.
% Transliterations of font equivalents.
% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2020-06-25 for Unicode 13.0.0.
% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2021-09-27 for Unicode 14.0.0.
LC_CTYPE

View File

@ -9,7 +9,7 @@ comment_char %
% otherwise be governed by that license.
% Transliterations of fractions.
% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2020-06-25 for Unicode 13.0.0.
% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2021-09-27 for Unicode 14.0.0.
% The replacements have been surrounded with spaces, because fractions are
% often preceded by a decimal number and followed by a unit or a math symbol.

File diff suppressed because it is too large Load Diff

View File

@ -1,11 +1,11 @@
# EastAsianWidth-13.0.0.txt
# Date: 2029-01-21, 18:14:00 GMT [KW, LI]
# © 2020 Unicode®, Inc.
# EastAsianWidth-14.0.0.txt
# Date: 2021-07-06, 09:58:53 GMT [KW, LI]
# © 2021 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see http://www.unicode.org/terms_of_use.html
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see http://www.unicode.org/reports/tr44/
# For documentation, see https://www.unicode.org/reports/tr44/
#
# East_Asian_Width Property
#
@ -37,7 +37,7 @@
# with ranges of code points, the code point count in square brackets.
#
# For more information, see UAX #11: East Asian Width,
# at http://www.unicode.org/reports/tr11/
# at https://www.unicode.org/reports/tr11/
#
# @missing: 0000..10FFFF; N
0000..001F;N # Cc [32] <control-0000>..<control-001F>
@ -273,7 +273,7 @@
0610..061A;N # Mn [11] ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..ARABIC SMALL KASRA
061B;N # Po ARABIC SEMICOLON
061C;N # Cf ARABIC LETTER MARK
061E..061F;N # Po [2] ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC QUESTION MARK
061D..061F;N # Po [3] ARABIC END OF TEXT MARK..ARABIC QUESTION MARK
0620..063F;N # Lo [32] ARABIC LETTER KASHMIRI YEH..ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE
0640;N # Lm ARABIC TATWEEL
0641..064A;N # Lo [10] ARABIC LETTER FEH..ARABIC LETTER YEH
@ -331,9 +331,14 @@
0859..085B;N # Mn [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
085E;N # Po MANDAIC PUNCTUATION
0860..086A;N # Lo [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA
08A0..08B4;N # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
08B6..08C7;N # Lo [18] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER LAM WITH SMALL ARABIC LETTER TAH ABOVE
08D3..08E1;N # Mn [15] ARABIC SMALL LOW WAW..ARABIC SMALL HIGH SIGN SAFHA
0870..0887;N # Lo [24] ARABIC LETTER ALEF WITH ATTACHED FATHA..ARABIC BASELINE ROUND DOT
0888;N # Sk ARABIC RAISED ROUND DOT
0889..088E;N # Lo [6] ARABIC LETTER NOON WITH INVERTED SMALL V..ARABIC VERTICAL TAIL
0890..0891;N # Cf [2] ARABIC POUND MARK ABOVE..ARABIC PIASTRE MARK ABOVE
0898..089F;N # Mn [8] ARABIC SMALL HIGH WORD AL-JUZ..ARABIC HALF MADDA OVER MADDA
08A0..08C8;N # Lo [41] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER GRAF
08C9;N # Lm ARABIC SMALL FARSI YEH
08CA..08E1;N # Mn [24] ARABIC SMALL HIGH FARSI YEH..ARABIC SMALL HIGH SIGN SAFHA
08E2;N # Cf ARABIC DISPUTED END OF AYAH
08E3..08FF;N # Mn [29] ARABIC TURNED DAMMA BELOW..ARABIC MARK SIDEWAYS NOON GHUNNA
0900..0902;N # Mn [3] DEVANAGARI SIGN INVERTED CANDRABINDU..DEVANAGARI SIGN ANUSVARA
@ -490,6 +495,7 @@
0C0E..0C10;N # Lo [3] TELUGU LETTER E..TELUGU LETTER AI
0C12..0C28;N # Lo [23] TELUGU LETTER O..TELUGU LETTER NA
0C2A..0C39;N # Lo [16] TELUGU LETTER PA..TELUGU LETTER HA
0C3C;N # Mn TELUGU SIGN NUKTA
0C3D;N # Lo TELUGU SIGN AVAGRAHA
0C3E..0C40;N # Mn [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II
0C41..0C44;N # Mc [4] TELUGU VOWEL SIGN U..TELUGU VOWEL SIGN VOCALIC RR
@ -497,6 +503,7 @@
0C4A..0C4D;N # Mn [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
0C55..0C56;N # Mn [2] TELUGU LENGTH MARK..TELUGU AI LENGTH MARK
0C58..0C5A;N # Lo [3] TELUGU LETTER TSA..TELUGU LETTER RRRA
0C5D;N # Lo TELUGU LETTER NAKAARA POLLU
0C60..0C61;N # Lo [2] TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC LL
0C62..0C63;N # Mn [2] TELUGU VOWEL SIGN VOCALIC L..TELUGU VOWEL SIGN VOCALIC LL
0C66..0C6F;N # Nd [10] TELUGU DIGIT ZERO..TELUGU DIGIT NINE
@ -522,7 +529,7 @@
0CCA..0CCB;N # Mc [2] KANNADA VOWEL SIGN O..KANNADA VOWEL SIGN OO
0CCC..0CCD;N # Mn [2] KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA
0CD5..0CD6;N # Mc [2] KANNADA LENGTH MARK..KANNADA AI LENGTH MARK
0CDE;N # Lo KANNADA LETTER FA
0CDD..0CDE;N # Lo [2] KANNADA LETTER NAKAARA POLLU..KANNADA LETTER FA
0CE0..0CE1;N # Lo [2] KANNADA LETTER VOCALIC RR..KANNADA LETTER VOCALIC LL
0CE2..0CE3;N # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL
0CE6..0CEF;N # Nd [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE
@ -709,11 +716,13 @@
16EB..16ED;N # Po [3] RUNIC SINGLE PUNCTUATION..RUNIC CROSS PUNCTUATION
16EE..16F0;N # Nl [3] RUNIC ARLAUG SYMBOL..RUNIC BELGTHOR SYMBOL
16F1..16F8;N # Lo [8] RUNIC LETTER K..RUNIC LETTER FRANKS CASKET AESC
1700..170C;N # Lo [13] TAGALOG LETTER A..TAGALOG LETTER YA
170E..1711;N # Lo [4] TAGALOG LETTER LA..TAGALOG LETTER HA
1700..1711;N # Lo [18] TAGALOG LETTER A..TAGALOG LETTER HA
1712..1714;N # Mn [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA
1715;N # Mc TAGALOG SIGN PAMUDPOD
171F;N # Lo TAGALOG LETTER ARCHAIC RA
1720..1731;N # Lo [18] HANUNOO LETTER A..HANUNOO LETTER HA
1732..1734;N # Mn [3] HANUNOO VOWEL SIGN I..HANUNOO SIGN PAMUDPOD
1732..1733;N # Mn [2] HANUNOO VOWEL SIGN I..HANUNOO VOWEL SIGN U
1734;N # Mc HANUNOO SIGN PAMUDPOD
1735..1736;N # Po [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
1740..1751;N # Lo [18] BUHID LETTER A..BUHID LETTER HA
1752..1753;N # Mn [2] BUHID VOWEL SIGN I..BUHID VOWEL SIGN U
@ -741,6 +750,7 @@
1807..180A;N # Po [4] MONGOLIAN SIBE SYLLABLE BOUNDARY MARKER..MONGOLIAN NIRUGU
180B..180D;N # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
180E;N # Cf MONGOLIAN VOWEL SEPARATOR
180F;N # Mn MONGOLIAN FREE VARIATION SELECTOR FOUR
1810..1819;N # Nd [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
1820..1842;N # Lo [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI
1843;N # Lm MONGOLIAN LETTER TODO LONG VOWEL SIGN
@ -796,7 +806,7 @@
1AA8..1AAD;N # Po [6] TAI THAM SIGN KAAN..TAI THAM SIGN CAANG
1AB0..1ABD;N # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE;N # Me COMBINING PARENTHESES OVERLAY
1ABF..1AC0;N # Mn [2] COMBINING LATIN SMALL LETTER W BELOW..COMBINING LATIN SMALL LETTER TURNED W BELOW
1ABF..1ACE;N # Mn [16] COMBINING LATIN SMALL LETTER W BELOW..COMBINING LATIN SMALL LETTER INSULAR T
1B00..1B03;N # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04;N # Mc BALINESE SIGN BISAH
1B05..1B33;N # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
@ -808,12 +818,13 @@
1B3D..1B41;N # Mc [5] BALINESE VOWEL SIGN LA LENGA TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG
1B42;N # Mn BALINESE VOWEL SIGN PEPET
1B43..1B44;N # Mc [2] BALINESE VOWEL SIGN PEPET TEDUNG..BALINESE ADEG ADEG
1B45..1B4B;N # Lo [7] BALINESE LETTER KAF SASAK..BALINESE LETTER ASYURA SASAK
1B45..1B4C;N # Lo [8] BALINESE LETTER KAF SASAK..BALINESE LETTER ARCHAIC JNYA
1B50..1B59;N # Nd [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE
1B5A..1B60;N # Po [7] BALINESE PANTI..BALINESE PAMENENG
1B61..1B6A;N # So [10] BALINESE MUSICAL SYMBOL DONG..BALINESE MUSICAL SYMBOL DANG GEDE
1B6B..1B73;N # Mn [9] BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINESE MUSICAL SYMBOL COMBINING GONG
1B74..1B7C;N # So [9] BALINESE MUSICAL SYMBOL RIGHT-HAND OPEN DUG..BALINESE MUSICAL SYMBOL LEFT-HAND OPEN PING
1B7D..1B7E;N # Po [2] BALINESE PANTI LANTANG..BALINESE PAMADA LANTANG
1B80..1B81;N # Mn [2] SUNDANESE SIGN PANYECEK..SUNDANESE SIGN PANGLAYAR
1B82;N # Mc SUNDANESE SIGN PANGWISAD
1B83..1BA0;N # Lo [30] SUNDANESE LETTER A..SUNDANESE LETTER HA
@ -872,8 +883,7 @@
1D79..1D7F;N # Ll [7] LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTER UPSILON WITH STROKE
1D80..1D9A;N # Ll [27] LATIN SMALL LETTER B WITH PALATAL HOOK..LATIN SMALL LETTER EZH WITH RETROFLEX HOOK
1D9B..1DBF;N # Lm [37] MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA
1DC0..1DF9;N # Mn [58] COMBINING DOTTED GRAVE ACCENT..COMBINING WIDE INVERTED BRIDGE BELOW
1DFB..1DFF;N # Mn [5] COMBINING DELETION MARK..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
1DC0..1DFF;N # Mn [64] COMBINING DOTTED GRAVE ACCENT..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
1E00..1EFF;N # L& [256] LATIN CAPITAL LETTER A WITH RING BELOW..LATIN SMALL LETTER Y WITH LOOP
1F00..1F15;N # L& [22] GREEK SMALL LETTER ALPHA WITH PSILI..GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA
1F18..1F1D;N # Lu [6] GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
@ -965,7 +975,7 @@
20A9;H # Sc WON SIGN
20AA..20AB;N # Sc [2] NEW SHEQEL SIGN..DONG SIGN
20AC;A # Sc EURO SIGN
20AD..20BF;N # Sc [19] KIP SIGN..BITCOIN SIGN
20AD..20C0;N # Sc [20] KIP SIGN..SOM SIGN
20D0..20DC;N # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
20DD..20E0;N # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
20E1;N # Mn COMBINING LEFT RIGHT ARROW ABOVE
@ -1338,8 +1348,7 @@
2B5A..2B73;N # So [26] SLANTED NORTH ARROW WITH HOOKED HEAD..DOWNWARDS TRIANGLE-HEADED ARROW TO BAR
2B76..2B95;N # So [32] NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGHTWARDS BLACK ARROW
2B97..2BFF;N # So [105] SYMBOL FOR TYPE A ELECTRONICS..HELLSCHREIBER PAUSE SYMBOL
2C00..2C2E;N # Lu [47] GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE
2C30..2C5E;N # Ll [47] GLAGOLITIC SMALL LETTER AZU..GLAGOLITIC SMALL LETTER LATINATE MYSLITE
2C00..2C5F;N # L& [96] GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC SMALL LETTER CAUDATE CHRIVI
2C60..2C7B;N # L& [28] LATIN CAPITAL LETTER L WITH DOUBLE BAR..LATIN LETTER SMALL CAPITAL TURNED E
2C7C..2C7D;N # Lm [2] LATIN SUBSCRIPT SMALL LETTER J..MODIFIER LETTER CAPITAL V
2C7E..2C7F;N # Lu [2] LATIN CAPITAL LETTER S WITH SWASH TAIL..LATIN CAPITAL LETTER Z WITH SWASH TAIL
@ -1407,7 +1416,16 @@
2E42;N # Ps DOUBLE LOW-REVERSED-9 QUOTATION MARK
2E43..2E4F;N # Po [13] DASH WITH LEFT UPTURN..CORNISH VERSE DIVIDER
2E50..2E51;N # So [2] CROSS PATTY WITH RIGHT CROSSBAR..CROSS PATTY WITH LEFT CROSSBAR
2E52;N # Po TIRONIAN SIGN CAPITAL ET
2E52..2E54;N # Po [3] TIRONIAN SIGN CAPITAL ET..MEDIEVAL QUESTION MARK
2E55;N # Ps LEFT SQUARE BRACKET WITH STROKE
2E56;N # Pe RIGHT SQUARE BRACKET WITH STROKE
2E57;N # Ps LEFT SQUARE BRACKET WITH DOUBLE STROKE
2E58;N # Pe RIGHT SQUARE BRACKET WITH DOUBLE STROKE
2E59;N # Ps TOP HALF LEFT PARENTHESIS
2E5A;N # Pe TOP HALF RIGHT PARENTHESIS
2E5B;N # Ps BOTTOM HALF LEFT PARENTHESIS
2E5C;N # Pe BOTTOM HALF RIGHT PARENTHESIS
2E5D;N # Pd OBLIQUE HYPHEN
2E80..2E99;W # So [26] CJK RADICAL REPEAT..CJK RADICAL RAP
2E9B..2EF3;W # So [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5;W # So [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
@ -1485,8 +1503,7 @@
3300..33FF;W # So [256] SQUARE APAATO..SQUARE GAL
3400..4DBF;W # Lo [6592] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DBF
4DC0..4DFF;N # So [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION
4E00..9FFC;W # Lo [20989] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFC
9FFD..9FFF;W # Cn [3] <reserved-9FFD>..<reserved-9FFF>
4E00..9FFF;W # Lo [20992] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFF
A000..A014;W # Lo [21] YI SYLLABLE IT..YI SYLLABLE E
A015;W # Lm YI SYLLABLE WU
A016..A48C;W # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
@ -1525,8 +1542,11 @@ A788;N # Lm MODIFIER LETTER LOW CIRCUMFLEX ACCENT
A789..A78A;N # Sk [2] MODIFIER LETTER COLON..MODIFIER LETTER SHORT EQUALS SIGN
A78B..A78E;N # L& [4] LATIN CAPITAL LETTER SALTILLO..LATIN SMALL LETTER L WITH RETROFLEX HOOK AND BELT
A78F;N # Lo LATIN LETTER SINOLOGICAL DOT
A790..A7BF;N # L& [48] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN SMALL LETTER GLOTTAL U
A7C2..A7CA;N # L& [9] LATIN CAPITAL LETTER ANGLICANA W..LATIN SMALL LETTER S WITH SHORT STROKE OVERLAY
A790..A7CA;N # L& [59] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN SMALL LETTER S WITH SHORT STROKE OVERLAY
A7D0..A7D1;N # L& [2] LATIN CAPITAL LETTER CLOSED INSULAR G..LATIN SMALL LETTER CLOSED INSULAR G
A7D3;N # Ll LATIN SMALL LETTER DOUBLE THORN
A7D5..A7D9;N # L& [5] LATIN SMALL LETTER DOUBLE WYNN..LATIN SMALL LETTER SIGMOID S
A7F2..A7F4;N # Lm [3] MODIFIER LETTER CAPITAL C..MODIFIER LETTER CAPITAL Q
A7F5..A7F6;N # L& [2] LATIN CAPITAL LETTER REVERSED HALF H..LATIN SMALL LETTER REVERSED HALF H
A7F7;N # Lo LATIN EPIGRAPHIC LETTER SIDEWAYS I
A7F8..A7F9;N # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE
@ -1682,15 +1702,17 @@ FB40..FB41;N # Lo [2] HEBREW LETTER NUN WITH DAGESH..HEBREW LETTER SAMEK
FB43..FB44;N # Lo [2] HEBREW LETTER FINAL PE WITH DAGESH..HEBREW LETTER PE WITH DAGESH
FB46..FB4F;N # Lo [10] HEBREW LETTER TSADI WITH DAGESH..HEBREW LIGATURE ALEF LAMED
FB50..FBB1;N # Lo [98] ARABIC LETTER ALEF WASLA ISOLATED FORM..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE FINAL FORM
FBB2..FBC1;N # Sk [16] ARABIC SYMBOL DOT ABOVE..ARABIC SYMBOL SMALL TAH BELOW
FBB2..FBC2;N # Sk [17] ARABIC SYMBOL DOT ABOVE..ARABIC SYMBOL WASLA ABOVE
FBD3..FD3D;N # Lo [363] ARABIC LETTER NG ISOLATED FORM..ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM
FD3E;N # Pe ORNATE LEFT PARENTHESIS
FD3F;N # Ps ORNATE RIGHT PARENTHESIS
FD40..FD4F;N # So [16] ARABIC LIGATURE RAHIMAHU ALLAAH..ARABIC LIGATURE RAHIMAHUM ALLAAH
FD50..FD8F;N # Lo [64] ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL FORM..ARABIC LIGATURE MEEM WITH KHAH WITH MEEM INITIAL FORM
FD92..FDC7;N # Lo [54] ARABIC LIGATURE MEEM WITH JEEM WITH KHAH INITIAL FORM..ARABIC LIGATURE NOON WITH JEEM WITH YEH FINAL FORM
FDCF;N # So ARABIC LIGATURE SALAAMUHU ALAYNAA
FDF0..FDFB;N # Lo [12] ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM..ARABIC LIGATURE JALLAJALALOUHOU
FDFC;N # Sc RIAL SIGN
FDFD;N # So ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM
FDFD..FDFF;N # So [3] ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM..ARABIC LIGATURE AZZA WA JALL
FE00..FE0F;A # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
FE10..FE16;W # Po [7] PRESENTATION FORM FOR VERTICAL COMMA..PRESENTATION FORM FOR VERTICAL QUESTION MARK
FE17;W # Ps PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET
@ -1839,9 +1861,20 @@ FFFD;A # So REPLACEMENT CHARACTER
10500..10527;N # Lo [40] ELBASAN LETTER A..ELBASAN LETTER KHE
10530..10563;N # Lo [52] CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBANIAN LETTER KIW
1056F;N # Po CAUCASIAN ALBANIAN CITATION MARK
10570..1057A;N # Lu [11] VITHKUQI CAPITAL LETTER A..VITHKUQI CAPITAL LETTER GA
1057C..1058A;N # Lu [15] VITHKUQI CAPITAL LETTER HA..VITHKUQI CAPITAL LETTER RE
1058C..10592;N # Lu [7] VITHKUQI CAPITAL LETTER SE..VITHKUQI CAPITAL LETTER XE
10594..10595;N # Lu [2] VITHKUQI CAPITAL LETTER Y..VITHKUQI CAPITAL LETTER ZE
10597..105A1;N # Ll [11] VITHKUQI SMALL LETTER A..VITHKUQI SMALL LETTER GA
105A3..105B1;N # Ll [15] VITHKUQI SMALL LETTER HA..VITHKUQI SMALL LETTER RE
105B3..105B9;N # Ll [7] VITHKUQI SMALL LETTER SE..VITHKUQI SMALL LETTER XE
105BB..105BC;N # Ll [2] VITHKUQI SMALL LETTER Y..VITHKUQI SMALL LETTER ZE
10600..10736;N # Lo [311] LINEAR A SIGN AB001..LINEAR A SIGN A664
10740..10755;N # Lo [22] LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE
10760..10767;N # Lo [8] LINEAR A SIGN A800..LINEAR A SIGN A807
10780..10785;N # Lm [6] MODIFIER LETTER SMALL CAPITAL AA..MODIFIER LETTER SMALL B WITH HOOK
10787..107B0;N # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK
107B2..107BA;N # Lm [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL
10800..10805;N # Lo [6] CYPRIOT SYLLABLE A..CYPRIOT SYLLABLE JA
10808;N # Lo CYPRIOT SYLLABLE JO
1080A..10835;N # Lo [44] CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO
@ -1920,6 +1953,9 @@ FFFD;A # So REPLACEMENT CHARACTER
10F46..10F50;N # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
10F51..10F54;N # No [4] SOGDIAN NUMBER ONE..SOGDIAN NUMBER ONE HUNDRED
10F55..10F59;N # Po [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
10F70..10F81;N # Lo [18] OLD UYGHUR LETTER ALEPH..OLD UYGHUR LETTER LESH
10F82..10F85;N # Mn [4] OLD UYGHUR COMBINING DOT ABOVE..OLD UYGHUR COMBINING TWO DOTS BELOW
10F86..10F89;N # Po [4] OLD UYGHUR PUNCTUATION BAR..OLD UYGHUR PUNCTUATION FOUR DOTS
10FB0..10FC4;N # Lo [21] CHORASMIAN LETTER ALEPH..CHORASMIAN LETTER TAW
10FC5..10FCB;N # No [7] CHORASMIAN NUMBER ONE..CHORASMIAN NUMBER ONE HUNDRED
10FE0..10FF6;N # Lo [23] ELYMAIC LETTER ALEPH..ELYMAIC LIGATURE ZAYIN-YODH
@ -1931,6 +1967,10 @@ FFFD;A # So REPLACEMENT CHARACTER
11047..1104D;N # Po [7] BRAHMI DANDA..BRAHMI PUNCTUATION LOTUS
11052..11065;N # No [20] BRAHMI NUMBER ONE..BRAHMI NUMBER ONE THOUSAND
11066..1106F;N # Nd [10] BRAHMI DIGIT ZERO..BRAHMI DIGIT NINE
11070;N # Mn BRAHMI SIGN OLD TAMIL VIRAMA
11071..11072;N # Lo [2] BRAHMI LETTER OLD TAMIL SHORT E..BRAHMI LETTER OLD TAMIL SHORT O
11073..11074;N # Mn [2] BRAHMI VOWEL SIGN OLD TAMIL SHORT E..BRAHMI VOWEL SIGN OLD TAMIL SHORT O
11075;N # Lo BRAHMI LETTER OLD TAMIL LLA
1107F;N # Mn BRAHMI NUMBER JOINER
11080..11081;N # Mn [2] KAITHI SIGN CANDRABINDU..KAITHI SIGN ANUSVARA
11082;N # Mc KAITHI SIGN VISARGA
@ -1942,6 +1982,7 @@ FFFD;A # So REPLACEMENT CHARACTER
110BB..110BC;N # Po [2] KAITHI ABBREVIATION SIGN..KAITHI ENUMERATION SIGN
110BD;N # Cf KAITHI NUMBER SIGN
110BE..110C1;N # Po [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
110C2;N # Mn KAITHI VOWEL SIGN VOCALIC R
110CD;N # Cf KAITHI NUMBER SIGN ABOVE
110D0..110E8;N # Lo [25] SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER MAE
110F0..110F9;N # Nd [10] SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT NINE
@ -2076,6 +2117,7 @@ FFFD;A # So REPLACEMENT CHARACTER
116B6;N # Mc TAKRI SIGN VIRAMA
116B7;N # Mn TAKRI SIGN NUKTA
116B8;N # Lo TAKRI LETTER ARCHAIC KHA
116B9;N # Po TAKRI ABBREVIATION SIGN
116C0..116C9;N # Nd [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE
11700..1171A;N # Lo [27] AHOM LETTER KA..AHOM LETTER ALTERNATE BA
1171D..1171F;N # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
@ -2087,6 +2129,7 @@ FFFD;A # So REPLACEMENT CHARACTER
1173A..1173B;N # No [2] AHOM NUMBER TEN..AHOM NUMBER TWENTY
1173C..1173E;N # Po [3] AHOM SIGN SMALL SECTION..AHOM SIGN RULAI
1173F;N # So AHOM SYMBOL VI
11740..11746;N # Lo [7] AHOM LETTER CA..AHOM LETTER LLA
11800..1182B;N # Lo [44] DOGRA LETTER A..DOGRA LETTER RRA
1182C..1182E;N # Mc [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II
1182F..11837;N # Mn [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA
@ -2145,6 +2188,7 @@ FFFD;A # So REPLACEMENT CHARACTER
11A9A..11A9C;N # Po [3] SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
11A9D;N # Lo SOYOMBO MARK PLUTA
11A9E..11AA2;N # Po [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
11AB0..11ABF;N # Lo [16] CANADIAN SYLLABICS NATTILIK HI..CANADIAN SYLLABICS SPA
11AC0..11AF8;N # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
11C00..11C08;N # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C2E;N # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
@ -2201,6 +2245,8 @@ FFFD;A # So REPLACEMENT CHARACTER
12400..1246E;N # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
12470..12474;N # Po [5] CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER..CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
12480..12543;N # Lo [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
12F90..12FF0;N # Lo [97] CYPRO-MINOAN SIGN CM001..CYPRO-MINOAN SIGN CM114
12FF1..12FF2;N # Po [2] CYPRO-MINOAN SIGN CM301..CYPRO-MINOAN SIGN CM302
13000..1342E;N # Lo [1071] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH AA032
13430..13438;N # Cf [9] EGYPTIAN HIEROGLYPH VERTICAL JOINER..EGYPTIAN HIEROGLYPH END SEGMENT
14400..14646;N # Lo [583] ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLYPH A530
@ -2208,6 +2254,8 @@ FFFD;A # So REPLACEMENT CHARACTER
16A40..16A5E;N # Lo [31] MRO LETTER TA..MRO LETTER TEK
16A60..16A69;N # Nd [10] MRO DIGIT ZERO..MRO DIGIT NINE
16A6E..16A6F;N # Po [2] MRO DANDA..MRO DOUBLE DANDA
16A70..16ABE;N # Lo [79] TANGSA LETTER OZ..TANGSA LETTER ZA
16AC0..16AC9;N # Nd [10] TANGSA DIGIT ZERO..TANGSA DIGIT NINE
16AD0..16AED;N # Lo [30] BASSA VAH LETTER ENNI..BASSA VAH LETTER I
16AF0..16AF4;N # Mn [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
16AF5;N # Po BASSA VAH FULL STOP
@ -2240,8 +2288,11 @@ FFFD;A # So REPLACEMENT CHARACTER
18800..18AFF;W # Lo [768] TANGUT COMPONENT-001..TANGUT COMPONENT-768
18B00..18CD5;W # Lo [470] KHITAN SMALL SCRIPT CHARACTER-18B00..KHITAN SMALL SCRIPT CHARACTER-18CD5
18D00..18D08;W # Lo [9] TANGUT IDEOGRAPH-18D00..TANGUT IDEOGRAPH-18D08
1AFF0..1AFF3;W # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
1AFF5..1AFFB;W # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE;W # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1B000..1B0FF;W # Lo [256] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER RE-2
1B100..1B11E;W # Lo [31] HENTAIGANA LETTER RE-3..HENTAIGANA LETTER N-MU-MO-2
1B100..1B122;W # Lo [35] HENTAIGANA LETTER RE-3..KATAKANA LETTER ARCHAIC WU
1B150..1B152;W # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
1B164..1B167;W # Lo [4] KATAKANA LETTER SMALL WI..KATAKANA LETTER SMALL N
1B170..1B2FB;W # Lo [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
@ -2253,6 +2304,9 @@ FFFD;A # So REPLACEMENT CHARACTER
1BC9D..1BC9E;N # Mn [2] DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUBLE MARK
1BC9F;N # Po DUPLOYAN PUNCTUATION CHINOOK FULL STOP
1BCA0..1BCA3;N # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
1CF00..1CF2D;N # Mn [46] ZNAMENNY COMBINING MARK GORAZDO NIZKO S KRYZHEM ON LEFT..ZNAMENNY COMBINING MARK KRYZH ON LEFT
1CF30..1CF46;N # Mn [23] ZNAMENNY COMBINING TONAL RANGE MARK MRACHNO..ZNAMENNY PRIZNAK MODIFIER ROG
1CF50..1CFC3;N # So [116] ZNAMENNY NEUME KRYUK..ZNAMENNY NEUME PAUK
1D000..1D0F5;N # So [246] BYZANTINE MUSICAL SYMBOL PSILI..BYZANTINE MUSICAL SYMBOL GORGON NEO KATO
1D100..1D126;N # So [39] MUSICAL SYMBOL SINGLE BARLINE..MUSICAL SYMBOL DRUM CLEF-2
1D129..1D164;N # So [60] MUSICAL SYMBOL MULTIPLE MEASURE REST..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
@ -2266,7 +2320,7 @@ FFFD;A # So REPLACEMENT CHARACTER
1D185..1D18B;N # Mn [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE
1D18C..1D1A9;N # So [30] MUSICAL SYMBOL RINFORZANDO..MUSICAL SYMBOL DEGREE SLASH
1D1AA..1D1AD;N # Mn [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO
1D1AE..1D1E8;N # So [59] MUSICAL SYMBOL PEDAL MARK..MUSICAL SYMBOL KIEVAN FLAT SIGN
1D1AE..1D1EA;N # So [61] MUSICAL SYMBOL PEDAL MARK..MUSICAL SYMBOL KORON
1D200..1D241;N # So [66] GREEK VOCAL NOTATION SYMBOL-1..GREEK INSTRUMENTAL NOTATION SYMBOL-54
1D242..1D244;N # Mn [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME
1D245;N # So GREEK MUSICAL LEIMMA
@ -2326,6 +2380,9 @@ FFFD;A # So REPLACEMENT CHARACTER
1DA87..1DA8B;N # Po [5] SIGNWRITING COMMA..SIGNWRITING PARENTHESIS
1DA9B..1DA9F;N # Mn [5] SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL MODIFIER-6
1DAA1..1DAAF;N # Mn [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16
1DF00..1DF09;N # Ll [10] LATIN SMALL LETTER FENG DIGRAPH WITH TRILL..LATIN SMALL LETTER T WITH HOOK AND RETROFLEX HOOK
1DF0A;N # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
1DF0B..1DF1E;N # Ll [20] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN SMALL LETTER S WITH CURL
1E000..1E006;N # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
1E008..1E018;N # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
1E01B..1E021;N # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
@ -2337,10 +2394,16 @@ FFFD;A # So REPLACEMENT CHARACTER
1E140..1E149;N # Nd [10] NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG PUACHUE HMONG DIGIT NINE
1E14E;N # Lo NYIAKENG PUACHUE HMONG LOGOGRAM NYAJ
1E14F;N # So NYIAKENG PUACHUE HMONG CIRCLED CA
1E290..1E2AD;N # Lo [30] TOTO LETTER PA..TOTO LETTER A
1E2AE;N # Mn TOTO SIGN RISING TONE
1E2C0..1E2EB;N # Lo [44] WANCHO LETTER AA..WANCHO LETTER YIH
1E2EC..1E2EF;N # Mn [4] WANCHO TONE TUP..WANCHO TONE KOINI
1E2F0..1E2F9;N # Nd [10] WANCHO DIGIT ZERO..WANCHO DIGIT NINE
1E2FF;N # Sc WANCHO NGUN SIGN
1E7E0..1E7E6;N # Lo [7] ETHIOPIC SYLLABLE HHYA..ETHIOPIC SYLLABLE HHYO
1E7E8..1E7EB;N # Lo [4] ETHIOPIC SYLLABLE GURAGE HHWA..ETHIOPIC SYLLABLE HHWE
1E7ED..1E7EE;N # Lo [2] ETHIOPIC SYLLABLE GURAGE MWI..ETHIOPIC SYLLABLE GURAGE MWEE
1E7F0..1E7FE;N # Lo [15] ETHIOPIC SYLLABLE GURAGE QWI..ETHIOPIC SYLLABLE GURAGE PWEE
1E800..1E8C4;N # Lo [197] MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI SYLLABLE M060 NYON
1E8C7..1E8CF;N # No [9] MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT NINE
1E8D0..1E8D6;N # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
@ -2465,6 +2528,7 @@ FFFD;A # So REPLACEMENT CHARACTER
1F6D0..1F6D2;W # So [3] PLACE OF WORSHIP..SHOPPING TROLLEY
1F6D3..1F6D4;N # So [2] STUPA..PAGODA
1F6D5..1F6D7;W # So [3] HINDU TEMPLE..ELEVATOR
1F6DD..1F6DF;W # So [3] PLAYGROUND SLIDE..RING BUOY
1F6E0..1F6EA;N # So [11] HAMMER AND WRENCH..NORTHEAST-POINTING AIRPLANE
1F6EB..1F6EC;W # So [2] AIRPLANE DEPARTURE..AIRPLANE ARRIVING
1F6F0..1F6F3;N # So [4] SATELLITE..PASSENGER SHIP
@ -2472,6 +2536,7 @@ FFFD;A # So REPLACEMENT CHARACTER
1F700..1F773;N # So [116] ALCHEMICAL SYMBOL FOR QUINTESSENCE..ALCHEMICAL SYMBOL FOR HALF OUNCE
1F780..1F7D8;N # So [89] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..NEGATIVE CIRCLED SQUARE
1F7E0..1F7EB;W # So [12] LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
1F7F0;W # So HEAVY EQUALS SIGN
1F800..1F80B;N # So [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
1F810..1F847;N # So [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
1F850..1F859;N # So [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
@ -2483,25 +2548,25 @@ FFFD;A # So REPLACEMENT CHARACTER
1F93B;N # So MODERN PENTATHLON
1F93C..1F945;W # So [10] WRESTLERS..GOAL NET
1F946;N # So RIFLE
1F947..1F978;W # So [50] FIRST PLACE MEDAL..DISGUISED FACE
1F97A..1F9CB;W # So [82] FACE WITH PLEADING EYES..BUBBLE TEA
1F9CD..1F9FF;W # So [51] STANDING PERSON..NAZAR AMULET
1F947..1F9FF;W # So [185] FIRST PLACE MEDAL..NAZAR AMULET
1FA00..1FA53;N # So [84] NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP
1FA60..1FA6D;N # So [14] XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
1FA70..1FA74;W # So [5] BALLET SHOES..THONG SANDAL
1FA78..1FA7A;W # So [3] DROP OF BLOOD..STETHOSCOPE
1FA78..1FA7C;W # So [5] DROP OF BLOOD..CRUTCH
1FA80..1FA86;W # So [7] YO-YO..NESTING DOLLS
1FA90..1FAA8;W # So [25] RINGED PLANET..ROCK
1FAB0..1FAB6;W # So [7] FLY..FEATHER
1FAC0..1FAC2;W # So [3] ANATOMICAL HEART..PEOPLE HUGGING
1FAD0..1FAD6;W # So [7] BLUEBERRIES..TEAPOT
1FA90..1FAAC;W # So [29] RINGED PLANET..HAMSA
1FAB0..1FABA;W # So [11] FLY..NEST WITH EGGS
1FAC0..1FAC5;W # So [6] ANATOMICAL HEART..PERSON WITH CROWN
1FAD0..1FAD9;W # So [10] BLUEBERRIES..JAR
1FAE0..1FAE7;W # So [8] MELTING FACE..BUBBLES
1FAF0..1FAF6;W # So [7] HAND WITH INDEX FINGER AND THUMB CROSSED..HEART HANDS
1FB00..1FB92;N # So [147] BLOCK SEXTANT-1..UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
1FB94..1FBCA;N # So [55] LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK..WHITE UP-POINTING CHEVRON
1FBF0..1FBF9;N # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE
20000..2A6DD;W # Lo [42718] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DD
2A6DE..2A6FF;W # Cn [34] <reserved-2A6DE>..<reserved-2A6FF>
2A700..2B734;W # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
2B735..2B73F;W # Cn [11] <reserved-2B735>..<reserved-2B73F>
20000..2A6DF;W # Lo [42720] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DF
2A6E0..2A6FF;W # Cn [32] <reserved-2A6E0>..<reserved-2A6FF>
2A700..2B738;W # Lo [4153] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B738
2B739..2B73F;W # Cn [7] <reserved-2B739>..<reserved-2B73F>
2B740..2B81D;W # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B81E..2B81F;W # Cn [2] <reserved-2B81E>..<reserved-2B81F>
2B820..2CEA1;W # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1

View File

@ -35,7 +35,7 @@
# files for making modifications.
UNICODE_VERSION = 13.0.0
UNICODE_VERSION = 14.0.0
PYTHON3 = python3
WGET = wget

View File

@ -1,6 +1,6 @@
# PropList-13.0.0.txt
# Date: 2019-11-27, 03:13:28 GMT
# © 2019 Unicode®, Inc.
# PropList-14.0.0.txt
# Date: 2021-08-12, 23:13:05 GMT
# © 2021 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
@ -54,6 +54,7 @@
2E1A ; Dash # Pd HYPHEN WITH DIAERESIS
2E3A..2E3B ; Dash # Pd [2] TWO-EM DASH..THREE-EM DASH
2E40 ; Dash # Pd DOUBLE HYPHEN
2E5D ; Dash # Pd OBLIQUE HYPHEN
301C ; Dash # Pd WAVE DASH
3030 ; Dash # Pd WAVY DASH
30A0 ; Dash # Pd KATAKANA-HIRAGANA DOUBLE HYPHEN
@ -63,7 +64,7 @@ FE63 ; Dash # Pd SMALL HYPHEN-MINUS
FF0D ; Dash # Pd FULLWIDTH HYPHEN-MINUS
10EAD ; Dash # Pd YEZIDI HYPHENATION MARK
# Total code points: 29
# Total code points: 30
# ================================================
@ -126,7 +127,7 @@ FF63 ; Quotation_Mark # Pe HALFWIDTH RIGHT CORNER BRACKET
05C3 ; Terminal_Punctuation # Po HEBREW PUNCTUATION SOF PASUQ
060C ; Terminal_Punctuation # Po ARABIC COMMA
061B ; Terminal_Punctuation # Po ARABIC SEMICOLON
061E..061F ; Terminal_Punctuation # Po [2] ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC QUESTION MARK
061D..061F ; Terminal_Punctuation # Po [3] ARABIC END OF TEXT MARK..ARABIC QUESTION MARK
06D4 ; Terminal_Punctuation # Po ARABIC FULL STOP
0700..070A ; Terminal_Punctuation # Po [11] SYRIAC END OF PARAGRAPH..SYRIAC CONTRACTION
070C ; Terminal_Punctuation # Po SYRIAC HARKLEAN METOBELUS
@ -150,6 +151,7 @@ FF63 ; Quotation_Mark # Pe HALFWIDTH RIGHT CORNER BRACKET
1AA8..1AAB ; Terminal_Punctuation # Po [4] TAI THAM SIGN KAAN..TAI THAM SIGN SATKAANKUU
1B5A..1B5B ; Terminal_Punctuation # Po [2] BALINESE PANTI..BALINESE PAMADA
1B5D..1B5F ; Terminal_Punctuation # Po [3] BALINESE CARIK PAMUNGKAH..BALINESE CARIK PAREREN
1B7D..1B7E ; Terminal_Punctuation # Po [2] BALINESE PANTI LANTANG..BALINESE PAMADA LANTANG
1C3B..1C3F ; Terminal_Punctuation # Po [5] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION TSHOOK
1C7E..1C7F ; Terminal_Punctuation # Po [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
203C..203D ; Terminal_Punctuation # Po [2] DOUBLE EXCLAMATION MARK..INTERROBANG
@ -159,6 +161,7 @@ FF63 ; Quotation_Mark # Pe HALFWIDTH RIGHT CORNER BRACKET
2E41 ; Terminal_Punctuation # Po REVERSED COMMA
2E4C ; Terminal_Punctuation # Po MEDIEVAL COMMA
2E4E..2E4F ; Terminal_Punctuation # Po [2] PUNCTUS ELEVATUS MARK..CORNISH VERSE DIVIDER
2E53..2E54 ; Terminal_Punctuation # Po [2] MEDIEVAL EXCLAMATION MARK..MEDIEVAL QUESTION MARK
3001..3002 ; Terminal_Punctuation # Po [2] IDEOGRAPHIC COMMA..IDEOGRAPHIC FULL STOP
A4FE..A4FF ; Terminal_Punctuation # Po [2] LISU PUNCTUATION COMMA..LISU PUNCTUATION FULL STOP
A60D..A60F ; Terminal_Punctuation # Po [3] VAI COMMA..VAI QUESTION MARK
@ -189,6 +192,7 @@ FF64 ; Terminal_Punctuation # Po HALFWIDTH IDEOGRAPHIC COMMA
10B3A..10B3F ; Terminal_Punctuation # Po [6] TINY TWO DOTS OVER ONE DOT PUNCTUATION..LARGE ONE RING OVER TWO RINGS PUNCTUATION
10B99..10B9C ; Terminal_Punctuation # Po [4] PSALTER PAHLAVI SECTION MARK..PSALTER PAHLAVI FOUR DOTS WITH DOT
10F55..10F59 ; Terminal_Punctuation # Po [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
10F86..10F89 ; Terminal_Punctuation # Po [4] OLD UYGHUR PUNCTUATION BAR..OLD UYGHUR PUNCTUATION FOUR DOTS
11047..1104D ; Terminal_Punctuation # Po [7] BRAHMI DANDA..BRAHMI PUNCTUATION LOTUS
110BE..110C1 ; Terminal_Punctuation # Po [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
11141..11143 ; Terminal_Punctuation # Po [3] CHAKMA DANDA..CHAKMA QUESTION MARK
@ -220,7 +224,7 @@ FF64 ; Terminal_Punctuation # Po HALFWIDTH IDEOGRAPHIC COMMA
1BC9F ; Terminal_Punctuation # Po DUPLOYAN PUNCTUATION CHINOOK FULL STOP
1DA87..1DA8A ; Terminal_Punctuation # Po [4] SIGNWRITING COMMA..SIGNWRITING COLON
# Total code points: 267
# Total code points: 276
# ================================================
@ -600,6 +604,7 @@ FF41..FF46 ; Hex_Digit # L& [6] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH L
1A6D..1A72 ; Other_Alphabetic # Mc [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI
1A73..1A74 ; Other_Alphabetic # Mn [2] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN MAI KANG
1ABF..1AC0 ; Other_Alphabetic # Mn [2] COMBINING LATIN SMALL LETTER W BELOW..COMBINING LATIN SMALL LETTER TURNED W BELOW
1ACC..1ACE ; Other_Alphabetic # Mn [3] COMBINING LATIN SMALL LETTER INSULAR G..COMBINING LATIN SMALL LETTER INSULAR T
1B00..1B03 ; Other_Alphabetic # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; Other_Alphabetic # Mc BALINESE SIGN BISAH
1B35 ; Other_Alphabetic # Mc BALINESE VOWEL SIGN TEDUNG
@ -686,10 +691,12 @@ FB1E ; Other_Alphabetic # Mn HEBREW POINT JUDEO-SPANISH VARIKA
11001 ; Other_Alphabetic # Mn BRAHMI SIGN ANUSVARA
11002 ; Other_Alphabetic # Mc BRAHMI SIGN VISARGA
11038..11045 ; Other_Alphabetic # Mn [14] BRAHMI VOWEL SIGN AA..BRAHMI VOWEL SIGN AU
11073..11074 ; Other_Alphabetic # Mn [2] BRAHMI VOWEL SIGN OLD TAMIL SHORT E..BRAHMI VOWEL SIGN OLD TAMIL SHORT O
11082 ; Other_Alphabetic # Mc KAITHI SIGN VISARGA
110B0..110B2 ; Other_Alphabetic # Mc [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
110B3..110B6 ; Other_Alphabetic # Mn [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI
110B7..110B8 ; Other_Alphabetic # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
110C2 ; Other_Alphabetic # Mn KAITHI VOWEL SIGN VOCALIC R
11100..11102 ; Other_Alphabetic # Mn [3] CHAKMA SIGN CANDRABINDU..CHAKMA SIGN VISARGA
11127..1112B ; Other_Alphabetic # Mn [5] CHAKMA VOWEL SIGN A..CHAKMA VOWEL SIGN UU
1112C ; Other_Alphabetic # Mc CHAKMA VOWEL SIGN E
@ -815,7 +822,7 @@ FB1E ; Other_Alphabetic # Mn HEBREW POINT JUDEO-SPANISH VARIKA
1F150..1F169 ; Other_Alphabetic # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
1F170..1F189 ; Other_Alphabetic # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z
# Total code points: 1398
# Total code points: 1404
# ================================================
@ -824,7 +831,7 @@ FB1E ; Other_Alphabetic # Mn HEBREW POINT JUDEO-SPANISH VARIKA
3021..3029 ; Ideographic # Nl [9] HANGZHOU NUMERAL ONE..HANGZHOU NUMERAL NINE
3038..303A ; Ideographic # Nl [3] HANGZHOU NUMERAL TEN..HANGZHOU NUMERAL THIRTY
3400..4DBF ; Ideographic # Lo [6592] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DBF
4E00..9FFC ; Ideographic # Lo [20989] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFC
4E00..9FFF ; Ideographic # Lo [20992] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFF
F900..FA6D ; Ideographic # Lo [366] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA6D
FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9
16FE4 ; Ideographic # Mn KHITAN SMALL SCRIPT FILLER
@ -832,15 +839,15 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
18800..18CD5 ; Ideographic # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18D00..18D08 ; Ideographic # Lo [9] TANGUT IDEOGRAPH-18D00..TANGUT IDEOGRAPH-18D08
1B170..1B2FB ; Ideographic # Lo [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
20000..2A6DD ; Ideographic # Lo [42718] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DD
2A700..2B734 ; Ideographic # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
20000..2A6DF ; Ideographic # Lo [42720] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DF
2A700..2B738 ; Ideographic # Lo [4153] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B738
2B740..2B81D ; Ideographic # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; Ideographic # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; Ideographic # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
2F800..2FA1D ; Ideographic # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Ideographic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
# Total code points: 101652
# Total code points: 101661
# ================================================
@ -885,6 +892,9 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
07EB..07F3 ; Diacritic # Mn [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
07F4..07F5 ; Diacritic # Lm [2] NKO HIGH TONE APOSTROPHE..NKO LOW TONE APOSTROPHE
0818..0819 ; Diacritic # Mn [2] SAMARITAN MARK OCCLUSION..SAMARITAN MARK DAGESH
0898..089F ; Diacritic # Mn [8] ARABIC SMALL HIGH WORD AL-JUZ..ARABIC HALF MADDA OVER MADDA
08C9 ; Diacritic # Lm ARABIC SMALL FARSI YEH
08CA..08D2 ; Diacritic # Mn [9] ARABIC SMALL HIGH FARSI YEH..ARABIC LARGE ROUND DOT INSIDE CIRCLE BELOW
08E3..08FE ; Diacritic # Mn [28] ARABIC TURNED DAMMA BELOW..ARABIC DAMMA WITH DOT
093C ; Diacritic # Mn DEVANAGARI SIGN NUKTA
094D ; Diacritic # Mn DEVANAGARI SIGN VIRAMA
@ -901,6 +911,7 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
0B4D ; Diacritic # Mn ORIYA SIGN VIRAMA
0B55 ; Diacritic # Mn ORIYA SIGN OVERLINE
0BCD ; Diacritic # Mn TAMIL SIGN VIRAMA
0C3C ; Diacritic # Mn TELUGU SIGN NUKTA
0C4D ; Diacritic # Mn TELUGU SIGN VIRAMA
0CBC ; Diacritic # Mn KANNADA SIGN NUKTA
0CCD ; Diacritic # Mn KANNADA SIGN VIRAMA
@ -928,12 +939,16 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
108F ; Diacritic # Mc MYANMAR SIGN RUMAI PALAUNG TONE-5
109A..109B ; Diacritic # Mc [2] MYANMAR SIGN KHAMTI TONE-1..MYANMAR SIGN KHAMTI TONE-3
135D..135F ; Diacritic # Mn [3] ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK..ETHIOPIC COMBINING GEMINATION MARK
1714 ; Diacritic # Mn TAGALOG SIGN VIRAMA
1715 ; Diacritic # Mc TAGALOG SIGN PAMUDPOD
17C9..17D3 ; Diacritic # Mn [11] KHMER SIGN MUUSIKATOAN..KHMER SIGN BATHAMASAT
17DD ; Diacritic # Mn KHMER SIGN ATTHACAN
1939..193B ; Diacritic # Mn [3] LIMBU SIGN MUKPHRENG..LIMBU SIGN SA-I
1A75..1A7C ; Diacritic # Mn [8] TAI THAM SIGN TONE-1..TAI THAM SIGN KHUEN-LUE KARAN
1A7F ; Diacritic # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT
1AB0..1ABD ; Diacritic # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; Diacritic # Me COMBINING PARENTHESES OVERLAY
1AC1..1ACB ; Diacritic # Mn [11] COMBINING LEFT PARENTHESIS ABOVE LEFT..COMBINING TRIPLE ACUTE ACCENT
1B34 ; Diacritic # Mn BALINESE SIGN REREKAN
1B44 ; Diacritic # Mc BALINESE ADEG ADEG
1B6B..1B73 ; Diacritic # Mn [9] BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINESE MUSICAL SYMBOL COMBINING GONG
@ -952,8 +967,7 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
1CF8..1CF9 ; Diacritic # Mn [2] VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING ABOVE
1D2C..1D6A ; Diacritic # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI
1DC4..1DCF ; Diacritic # Mn [12] COMBINING MACRON-ACUTE..COMBINING ZIGZAG BELOW
1DF5..1DF9 ; Diacritic # Mn [5] COMBINING UP TACK ABOVE..COMBINING WIDE INVERTED BRIDGE BELOW
1DFD..1DFF ; Diacritic # Mn [3] COMBINING ALMOST EQUAL TO BELOW..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
1DF5..1DFF ; Diacritic # Mn [11] COMBINING UP TACK ABOVE..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
1FBD ; Diacritic # Sk GREEK KORONIS
1FBF..1FC1 ; Diacritic # Sk [3] GREEK PSILI..GREEK DIALYTIKA AND PERISPOMENI
1FCD..1FCF ; Diacritic # Sk [3] GREEK PSILI AND VARIA..GREEK PSILI AND PERISPOMENI
@ -1008,10 +1022,16 @@ FF70 ; Diacritic # Lm HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND
FF9E..FF9F ; Diacritic # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
FFE3 ; Diacritic # Sk FULLWIDTH MACRON
102E0 ; Diacritic # Mn COPTIC EPACT THOUSANDS MARK
10780..10785 ; Diacritic # Lm [6] MODIFIER LETTER SMALL CAPITAL AA..MODIFIER LETTER SMALL B WITH HOOK
10787..107B0 ; Diacritic # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK
107B2..107BA ; Diacritic # Lm [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL
10AE5..10AE6 ; Diacritic # Mn [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW
10D22..10D23 ; Diacritic # Lo [2] HANIFI ROHINGYA MARK SAKIN..HANIFI ROHINGYA MARK NA KHONNA
10D24..10D27 ; Diacritic # Mn [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
10F46..10F50 ; Diacritic # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
10F82..10F85 ; Diacritic # Mn [4] OLD UYGHUR COMBINING DOT ABOVE..OLD UYGHUR COMBINING TWO DOTS BELOW
11046 ; Diacritic # Mn BRAHMI VIRAMA
11070 ; Diacritic # Mn BRAHMI SIGN OLD TAMIL VIRAMA
110B9..110BA ; Diacritic # Mn [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA
11133..11134 ; Diacritic # Mn [2] CHAKMA VIRAMA..CHAKMA MAAYYAA
11173 ; Diacritic # Mn MAHAJANI SIGN NUKTA
@ -1049,18 +1069,24 @@ FFE3 ; Diacritic # Sk FULLWIDTH MACRON
16F8F..16F92 ; Diacritic # Mn [4] MIAO TONE RIGHT..MIAO TONE BELOW
16F93..16F9F ; Diacritic # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
16FF0..16FF1 ; Diacritic # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
1AFF0..1AFF3 ; Diacritic # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
1AFF5..1AFFB ; Diacritic # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; Diacritic # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
1CF00..1CF2D ; Diacritic # Mn [46] ZNAMENNY COMBINING MARK GORAZDO NIZKO S KRYZHEM ON LEFT..ZNAMENNY COMBINING MARK KRYZH ON LEFT
1CF30..1CF46 ; Diacritic # Mn [23] ZNAMENNY COMBINING TONAL RANGE MARK MRACHNO..ZNAMENNY PRIZNAK MODIFIER ROG
1D167..1D169 ; Diacritic # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
1D16D..1D172 ; Diacritic # Mc [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5
1D17B..1D182 ; Diacritic # Mn [8] MUSICAL SYMBOL COMBINING ACCENT..MUSICAL SYMBOL COMBINING LOURE
1D185..1D18B ; Diacritic # Mn [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE
1D1AA..1D1AD ; Diacritic # Mn [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO
1E130..1E136 ; Diacritic # Mn [7] NYIAKENG PUACHUE HMONG TONE-B..NYIAKENG PUACHUE HMONG TONE-D
1E2AE ; Diacritic # Mn TOTO SIGN RISING TONE
1E2EC..1E2EF ; Diacritic # Mn [4] WANCHO TONE TUP..WANCHO TONE KOINI
1E8D0..1E8D6 ; Diacritic # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
1E944..1E946 ; Diacritic # Mn [3] ADLAM ALIF LENGTHENER..ADLAM GEMINATION MARK
1E948..1E94A ; Diacritic # Mn [3] ADLAM CONSONANT MODIFIER..ADLAM NUKTA
# Total code points: 882
# Total code points: 1064
# ================================================
@ -1088,6 +1114,7 @@ AA70 ; Extender # Lm MYANMAR MODIFIER LETTER KHAMTI REDUPLICATION
AADD ; Extender # Lm TAI VIET SYMBOL SAM
AAF3..AAF4 ; Extender # Lm [2] MEETEI MAYEK SYLLABLE REPETITION MARK..MEETEI MAYEK WORD REPETITION MARK
FF70 ; Extender # Lm HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
10781..10782 ; Extender # Lm [2] MODIFIER LETTER SUPERSCRIPT TRIANGULAR COLON..MODIFIER LETTER SUPERSCRIPT HALF TRIANGULAR COLON
1135D ; Extender # Lo GRANTHA SIGN PLUTA
115C6..115C8 ; Extender # Po [3] SIDDHAM REPETITION MARK-1..SIDDHAM REPETITION MARK-3
11A98 ; Extender # Mn SOYOMBO GEMINATION MARK
@ -1097,7 +1124,7 @@ FF70 ; Extender # Lm HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND
1E13C..1E13D ; Extender # Lm [2] NYIAKENG PUACHUE HMONG SIGN XW XW..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
1E944..1E946 ; Extender # Mn [3] ADLAM ALIF LENGTHENER..ADLAM GEMINATION MARK
# Total code points: 48
# Total code points: 50
# ================================================
@ -1121,8 +1148,12 @@ A69C..A69D ; Other_Lowercase # Lm [2] MODIFIER LETTER CYRILLIC HARD SIGN..M
A770 ; Other_Lowercase # Lm MODIFIER LETTER US
A7F8..A7F9 ; Other_Lowercase # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE
AB5C..AB5F ; Other_Lowercase # Lm [4] MODIFIER LETTER SMALL HENG..MODIFIER LETTER SMALL U WITH LEFT HOOK
10780 ; Other_Lowercase # Lm MODIFIER LETTER SMALL CAPITAL AA
10783..10785 ; Other_Lowercase # Lm [3] MODIFIER LETTER SMALL AE..MODIFIER LETTER SMALL B WITH HOOK
10787..107B0 ; Other_Lowercase # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK
107B2..107BA ; Other_Lowercase # Lm [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL
# Total code points: 189
# Total code points: 244
# ================================================
@ -1211,7 +1242,7 @@ E0020..E007F ; Other_Grapheme_Extend # Cf [96] TAG SPACE..CANCEL TAG
# ================================================
3400..4DBF ; Unified_Ideograph # Lo [6592] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DBF
4E00..9FFC ; Unified_Ideograph # Lo [20989] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFC
4E00..9FFF ; Unified_Ideograph # Lo [20992] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFF
FA0E..FA0F ; Unified_Ideograph # Lo [2] CJK COMPATIBILITY IDEOGRAPH-FA0E..CJK COMPATIBILITY IDEOGRAPH-FA0F
FA11 ; Unified_Ideograph # Lo CJK COMPATIBILITY IDEOGRAPH-FA11
FA13..FA14 ; Unified_Ideograph # Lo [2] CJK COMPATIBILITY IDEOGRAPH-FA13..CJK COMPATIBILITY IDEOGRAPH-FA14
@ -1219,14 +1250,14 @@ FA1F ; Unified_Ideograph # Lo CJK COMPATIBILITY IDEOGRAPH-FA1F
FA21 ; Unified_Ideograph # Lo CJK COMPATIBILITY IDEOGRAPH-FA21
FA23..FA24 ; Unified_Ideograph # Lo [2] CJK COMPATIBILITY IDEOGRAPH-FA23..CJK COMPATIBILITY IDEOGRAPH-FA24
FA27..FA29 ; Unified_Ideograph # Lo [3] CJK COMPATIBILITY IDEOGRAPH-FA27..CJK COMPATIBILITY IDEOGRAPH-FA29
20000..2A6DD ; Unified_Ideograph # Lo [42718] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DD
2A700..2B734 ; Unified_Ideograph # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
20000..2A6DF ; Unified_Ideograph # Lo [42720] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DF
2A700..2B738 ; Unified_Ideograph # Lo [4153] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B738
2B740..2B81D ; Unified_Ideograph # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; Unified_Ideograph # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; Unified_Ideograph # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
30000..3134A ; Unified_Ideograph # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
# Total code points: 92856
# Total code points: 92865
# ================================================
@ -1291,8 +1322,9 @@ E0001 ; Deprecated # Cf LANGUAGE TAG
1D62A..1D62B ; Soft_Dotted # L& [2] MATHEMATICAL SANS-SERIF ITALIC SMALL I..MATHEMATICAL SANS-SERIF ITALIC SMALL J
1D65E..1D65F ; Soft_Dotted # L& [2] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL I..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL J
1D692..1D693 ; Soft_Dotted # L& [2] MATHEMATICAL MONOSPACE SMALL I..MATHEMATICAL MONOSPACE SMALL J
1DF1A ; Soft_Dotted # L& LATIN SMALL LETTER I WITH STROKE AND RETROFLEX HOOK
# Total code points: 46
# Total code points: 47
# ================================================
@ -1330,7 +1362,7 @@ AABB..AABC ; Logical_Order_Exception # Lo [2] TAI VIET VOWEL AUE..TAI VIET
002E ; Sentence_Terminal # Po FULL STOP
003F ; Sentence_Terminal # Po QUESTION MARK
0589 ; Sentence_Terminal # Po ARMENIAN FULL STOP
061E..061F ; Sentence_Terminal # Po [2] ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC QUESTION MARK
061D..061F ; Sentence_Terminal # Po [3] ARABIC END OF TEXT MARK..ARABIC QUESTION MARK
06D4 ; Sentence_Terminal # Po ARABIC FULL STOP
0700..0702 ; Sentence_Terminal # Po [3] SYRIAC END OF PARAGRAPH..SYRIAC SUBLINEAR FULL STOP
07F9 ; Sentence_Terminal # Po NKO EXCLAMATION MARK
@ -1349,12 +1381,14 @@ AABB..AABC ; Logical_Order_Exception # Lo [2] TAI VIET VOWEL AUE..TAI VIET
1AA8..1AAB ; Sentence_Terminal # Po [4] TAI THAM SIGN KAAN..TAI THAM SIGN SATKAANKUU
1B5A..1B5B ; Sentence_Terminal # Po [2] BALINESE PANTI..BALINESE PAMADA
1B5E..1B5F ; Sentence_Terminal # Po [2] BALINESE CARIK SIKI..BALINESE CARIK PAREREN
1B7D..1B7E ; Sentence_Terminal # Po [2] BALINESE PANTI LANTANG..BALINESE PAMADA LANTANG
1C3B..1C3C ; Sentence_Terminal # Po [2] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION NYET THYOOM TA-ROL
1C7E..1C7F ; Sentence_Terminal # Po [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
203C..203D ; Sentence_Terminal # Po [2] DOUBLE EXCLAMATION MARK..INTERROBANG
2047..2049 ; Sentence_Terminal # Po [3] DOUBLE QUESTION MARK..EXCLAMATION QUESTION MARK
2E2E ; Sentence_Terminal # Po REVERSED QUESTION MARK
2E3C ; Sentence_Terminal # Po STENOGRAPHIC FULL STOP
2E53..2E54 ; Sentence_Terminal # Po [2] MEDIEVAL EXCLAMATION MARK..MEDIEVAL QUESTION MARK
3002 ; Sentence_Terminal # Po IDEOGRAPHIC FULL STOP
A4FF ; Sentence_Terminal # Po LISU PUNCTUATION FULL STOP
A60E..A60F ; Sentence_Terminal # Po [2] VAI FULL STOP..VAI QUESTION MARK
@ -1375,6 +1409,7 @@ FF1F ; Sentence_Terminal # Po FULLWIDTH QUESTION MARK
FF61 ; Sentence_Terminal # Po HALFWIDTH IDEOGRAPHIC FULL STOP
10A56..10A57 ; Sentence_Terminal # Po [2] KHAROSHTHI PUNCTUATION DANDA..KHAROSHTHI PUNCTUATION DOUBLE DANDA
10F55..10F59 ; Sentence_Terminal # Po [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
10F86..10F89 ; Sentence_Terminal # Po [4] OLD UYGHUR PUNCTUATION BAR..OLD UYGHUR PUNCTUATION FOUR DOTS
11047..11048 ; Sentence_Terminal # Po [2] BRAHMI DANDA..BRAHMI DOUBLE DANDA
110BE..110C1 ; Sentence_Terminal # Po [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
11141..11143 ; Sentence_Terminal # Po [3] CHAKMA DANDA..CHAKMA QUESTION MARK
@ -1403,15 +1438,16 @@ FF61 ; Sentence_Terminal # Po HALFWIDTH IDEOGRAPHIC FULL STOP
1BC9F ; Sentence_Terminal # Po DUPLOYAN PUNCTUATION CHINOOK FULL STOP
1DA88 ; Sentence_Terminal # Po SIGNWRITING FULL STOP
# Total code points: 143
# Total code points: 152
# ================================================
180B..180D ; Variation_Selector # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
180F ; Variation_Selector # Mn MONGOLIAN FREE VARIATION SELECTOR FOUR
FE00..FE0F ; Variation_Selector # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
E0100..E01EF ; Variation_Selector # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
# Total code points: 259
# Total code points: 260
# ================================================
@ -1644,8 +1680,17 @@ E0100..E01EF ; Variation_Selector # Mn [240] VARIATION SELECTOR-17..VARIATION S
2E42 ; Pattern_Syntax # Ps DOUBLE LOW-REVERSED-9 QUOTATION MARK
2E43..2E4F ; Pattern_Syntax # Po [13] DASH WITH LEFT UPTURN..CORNISH VERSE DIVIDER
2E50..2E51 ; Pattern_Syntax # So [2] CROSS PATTY WITH RIGHT CROSSBAR..CROSS PATTY WITH LEFT CROSSBAR
2E52 ; Pattern_Syntax # Po TIRONIAN SIGN CAPITAL ET
2E53..2E7F ; Pattern_Syntax # Cn [45] <reserved-2E53>..<reserved-2E7F>
2E52..2E54 ; Pattern_Syntax # Po [3] TIRONIAN SIGN CAPITAL ET..MEDIEVAL QUESTION MARK
2E55 ; Pattern_Syntax # Ps LEFT SQUARE BRACKET WITH STROKE
2E56 ; Pattern_Syntax # Pe RIGHT SQUARE BRACKET WITH STROKE
2E57 ; Pattern_Syntax # Ps LEFT SQUARE BRACKET WITH DOUBLE STROKE
2E58 ; Pattern_Syntax # Pe RIGHT SQUARE BRACKET WITH DOUBLE STROKE
2E59 ; Pattern_Syntax # Ps TOP HALF LEFT PARENTHESIS
2E5A ; Pattern_Syntax # Pe TOP HALF RIGHT PARENTHESIS
2E5B ; Pattern_Syntax # Ps BOTTOM HALF LEFT PARENTHESIS
2E5C ; Pattern_Syntax # Pe BOTTOM HALF RIGHT PARENTHESIS
2E5D ; Pattern_Syntax # Pd OBLIQUE HYPHEN
2E5E..2E7F ; Pattern_Syntax # Cn [34] <reserved-2E5E>..<reserved-2E7F>
3001..3003 ; Pattern_Syntax # Po [3] IDEOGRAPHIC COMMA..DITTO MARK
3008 ; Pattern_Syntax # Ps LEFT ANGLE BRACKET
3009 ; Pattern_Syntax # Pe RIGHT ANGLE BRACKET
@ -1682,11 +1727,12 @@ FE45..FE46 ; Pattern_Syntax # Po [2] SESAME DOT..WHITE SESAME DOT
0600..0605 ; Prepended_Concatenation_Mark # Cf [6] ARABIC NUMBER SIGN..ARABIC NUMBER MARK ABOVE
06DD ; Prepended_Concatenation_Mark # Cf ARABIC END OF AYAH
070F ; Prepended_Concatenation_Mark # Cf SYRIAC ABBREVIATION MARK
0890..0891 ; Prepended_Concatenation_Mark # Cf [2] ARABIC POUND MARK ABOVE..ARABIC PIASTRE MARK ABOVE
08E2 ; Prepended_Concatenation_Mark # Cf ARABIC DISPUTED END OF AYAH
110BD ; Prepended_Concatenation_Mark # Cf KAITHI NUMBER SIGN
110CD ; Prepended_Concatenation_Mark # Cf KAITHI NUMBER SIGN ABOVE
# Total code points: 11
# Total code points: 13
# ================================================

File diff suppressed because it is too large Load Diff