scuffed-code/tools/unicodetools/com/ibm/text/UCD/MakeUnicodeFiles.txt

Generate:
DeltaVersion: 13
CopyrightYear: 2005
File: auxiliary/GraphemeBreakProperty
Property: Grapheme_Cluster_Break
Format: skipValue=Other
File: auxiliary/WordBreakProperty
Property: Word_Break
Format: skipValue=Other
File: auxiliary/SentenceBreakProperty
Property: Sentence_Break
Format: skipValue=Other
File: Blocks
Property: Block
# Note: When comparing block names, casing, whitespace, hyphens,
# and underbars are ignored.
# For example, "Latin Extended-A" and "latin extended a" are equivalent.
# For more information on the comparison of property values,
# see UCD.html.
Format: valueList skipUnassigned=No_Block
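The loose comparison described in the Block note can be sketched as below. This is only an illustration of the stated rule (ignore casing, whitespace, hyphens, and underbars); the helper names `loose_key` and `same_block_name` are hypothetical and are not part of the generator.

```python
import re


def loose_key(name: str) -> str:
    """Fold a block name for comparison.

    Whitespace, hyphens, and underbars are dropped, then the result is
    lowercased, following the note above.
    """
    return re.sub(r"[\s\-_]+", "", name).lower()


def same_block_name(a: str, b: str) -> bool:
    """Return True when two block names are equivalent under loose matching."""
    return loose_key(a) == loose_key(b)


if __name__ == "__main__":
    print(same_block_name("Latin Extended-A", "latin extended a"))
    print(same_block_name("Latin_Extended-A", "Latin Extended-B"))
```

Comparing folded keys rather than the raw strings means a lookup table can be keyed on `loose_key(name)` once and queried with any spelling.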
File: CaseFolding
Property: SPECIAL
File: DerivedAge
Property: Age
Format: nameStyle=none noLabel skipValue=unassigned
Value: 1.1
# Assigned as of Unicode 1.1.0 (June, 1993)
# [excluding removed Hangul Syllables]
Value: 2.0
# Newly assigned in Unicode 2.0.0 (July, 1996)
Value: 2.1
# Newly assigned in Unicode 2.1.2 (May, 1998)
Value: 3.0
# Newly assigned in Unicode 3.0.0 (September, 1999)
Value: 3.1
# Newly assigned in Unicode 3.1.0 (March, 2001)
Value: 3.2
# Newly assigned in Unicode 3.2.0 (March, 2002)
Value: 4.0
# Newly assigned in Unicode 4.0.0 (April, 2003)
Value: 4.1
# Newly assigned in Unicode 4.1.0 (XXX, 2005)
File: extracted/DerivedBidiClass
Property: Bidi_Class
# Bidi Class (listing UnicodeData.txt, field 4: see UCD.html)
# Unlike other properties, unassigned code points in blocks reserved for right-to-left scripts are given either types R or AL.
# The unassigned characters that default to R are:
# Hebrew, Cypriot_Syllabary, Kharoshthi, and the ranges \u07C0-\u08FF \uFB1D-\uFB4F \U00010840-\U00010FFF
# The unassigned characters that default to AL are:
# Arabic, Syriac, Thaana, Arabic_Presentation_Forms_A, Arabic_Presentation_Forms_B, Arabic_Supplement,
# and the range \u0750-\u077F, minus the Noncharacter_Code_Points
# For all other cases:
Format: valueStyle=short skipUnassigned=Left_To_Right
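A minimal sketch of the default Bidi_Class rule for unassigned code points, as described in the comments above. The explicit ranges come from those comments; the block boundaries for the named blocks (Hebrew, Cypriot_Syllabary, Arabic, Syriac, Thaana, and the Arabic presentation forms) are filled in here from the Unicode block data as an assumption. The function does not check whether a code point is actually unassigned, and since the comments only say that noncharacters are excluded from AL without saying what they receive, the sketch returns `None` for them. The name `default_bidi_class` is hypothetical.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first, last)

# Unassigned code points in these ranges default to R.
R_RANGES: List[Range] = [
    (0x0590, 0x05FF),    # Hebrew block (boundaries assumed from Blocks data)
    (0x07C0, 0x08FF),
    (0xFB1D, 0xFB4F),
    (0x10800, 0x1083F),  # Cypriot_Syllabary block (assumed boundaries)
    (0x10840, 0x10FFF),  # also covers Kharoshthi
]

# Unassigned code points in these ranges default to AL.
AL_RANGES: List[Range] = [
    (0x0600, 0x06FF),  # Arabic
    (0x0700, 0x074F),  # Syriac
    (0x0750, 0x077F),  # Arabic_Supplement
    (0x0780, 0x07BF),  # Thaana
    (0xFB50, 0xFDFF),  # Arabic_Presentation_Forms_A
    (0xFE70, 0xFEFF),  # Arabic_Presentation_Forms_B
]


def is_noncharacter(cp: int) -> bool:
    """Noncharacter_Code_Point: FDD0..FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def _in_ranges(cp: int, ranges: List[Range]) -> bool:
    return any(first <= cp <= last for first, last in ranges)


def default_bidi_class(cp: int) -> Optional[str]:
    """Default class for an unassigned code point; None where the comments are silent."""
    if _in_ranges(cp, R_RANGES):
        return "R"
    if _in_ranges(cp, AL_RANGES):
        return None if is_noncharacter(cp) else "AL"
    return "L"  # "all other cases", matching skipUnassigned=Left_To_Right
```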
File: extracted/DerivedBinaryProperties
Property: Bidi_Mirrored
# Bidi_Mirrored (listing UnicodeData.txt, field 9: see UCD.html)
File: extracted/DerivedCombiningClass
Property: Canonical_Combining_Class
# Combining Class (listing UnicodeData.txt, field 3: see UCD.html)
Format: nameStyle=none valueStyle=short skipUnassigned=Not_Reordered
File: DerivedCoreProperties
Property: Math
# Derived Property: Math
# Generated from: Sm + Other_Math
Property: Alphabetic
# Derived Property: Alphabetic
# Generated from: Lu+Ll+Lt+Lm+Lo+Nl + Other_Alphabetic
Property: Lowercase
# Derived Property: Lowercase
# Generated from: Ll + Other_Lowercase
Property: Uppercase
# Derived Property: Uppercase
# Generated from: Lu + Other_Uppercase
Property: ID_Start
# Derived Property: ID_Start
# Characters that can start an identifier.
# Generated from Lu+Ll+Lt+Lm+Lo+Nl+Other_ID_Start
# NOTE: See UAX #31 for more information
Property: ID_Continue
# Derived Property: ID_Continue
# Characters that can continue an identifier.
# Generated from: ID_Start + Mn+Mc+Nd+Pc + Other_ID_Continue
# NOTE: See UAX #31 for more information
Property: XID_Start
# Derived Property: XID_Start
# ID_Start modified for closure under NFKx
# Modified as described in UAX #15
# NOTE: Does NOT remove the non-NFKx characters.
# Merely ensures that if isIdentifier(string) then isIdentifier(NFKx(string))
# NOTE: See UAX #31 for more information
Property: XID_Continue
# Derived Property: XID_Continue
# Mod_ID_Continue modified for closure under NFKx
# Modified as described in UAX #15
# NOTE: Cf characters should be filtered out.
# NOTE: Does NOT remove the non-NFKx characters.
# Merely ensures that if isIdentifier(string) then isIdentifier(NFKx(string))
# NOTE: See UAX #31 for more information
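The closure guarantee stated for XID_Start and XID_Continue can be spot-checked with a short sketch. It relies on Python's `str.isidentifier()`, which is documented as being based on XID_Start and XID_Continue, and on whatever Unicode version the running interpreter ships, which is not the version this file targets. The function name `closed_under_nfkc` is hypothetical, and only the NFKC case is shown.

```python
import unicodedata


def closed_under_nfkc(s: str) -> bool:
    """Check the implication: isIdentifier(s) implies isIdentifier(NFKC(s)).

    Strings that are not identifiers satisfy the implication trivially.
    """
    if not s.isidentifier():
        return True
    return unicodedata.normalize("NFKC", s).isidentifier()


if __name__ == "__main__":
    # U+FB01 LATIN SMALL LIGATURE FI normalizes to "fi" under NFKC.
    sample = "\ufb01le"
    print(sample.isidentifier(), unicodedata.normalize("NFKC", sample))
    print(closed_under_nfkc(sample))
```

As the notes say, the property does not remove non-NFKx characters: the ligature is still accepted, and its normalized form is accepted too.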
Property: Default_Ignorable_Code_Point
# Derived Property: Default_Ignorable_Code_Point
# Generated from Other_Default_Ignorable_Code_Point + Cf + Cc + Cs + Noncharacters
# - White_Space - FFF9..FFFB (Annotation Characters)
Property: Grapheme_Extend
# Derived Property: Grapheme_Extend
# Generated from: Me + Mn + Other_Grapheme_Extend
# Note: depending on an application's interpretation of Co (private use),
# they may be either in Grapheme_Base, or in Grapheme_Extend, or in neither.
Property: Grapheme_Base
# Derived Property: Grapheme_Base
# Generated from: [0..10FFFF] - Cc - Cf - Cs - Co - Cn - Zl - Zp - Grapheme_Extend
# Note: depending on an application's interpretation of Co (private use),
# they may be either in Grapheme_Base, or in Grapheme_Extend, or in neither.
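The set arithmetic in the "Generated from" lines can be illustrated with General_Category lookups. The sketch below is an approximation under two loud assumptions: Other_Grapheme_Extend is not available from Python's `unicodedata` module and is left out, and the module reflects the interpreter's Unicode version, not the one being generated here. The function names are hypothetical.

```python
import unicodedata

# Categories subtracted from [0..10FFFF] in the Grapheme_Base derivation.
EXCLUDED_FROM_BASE = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp"}


def is_grapheme_extend(ch: str) -> bool:
    """Me + Mn only; Other_Grapheme_Extend is omitted in this sketch."""
    return unicodedata.category(ch) in {"Me", "Mn"}


def is_grapheme_base(ch: str) -> bool:
    """[0..10FFFF] - Cc - Cf - Cs - Co - Cn - Zl - Zp - Grapheme_Extend."""
    if unicodedata.category(ch) in EXCLUDED_FROM_BASE:
        return False
    return not is_grapheme_extend(ch)


if __name__ == "__main__":
    for ch in ["a", "\u0301", "\n", "\u2028", "\ue000"]:
        print(f"U+{ord(ch):04X}", is_grapheme_base(ch), is_grapheme_extend(ch))
```

Private-use characters (Co) fall in neither set here, which corresponds to the "in neither" reading mentioned in the note.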
File: extracted/DerivedDecompositionType
Property: Decomposition_Type
Format: skipValue=None
# Decomposition_Type (from UnicodeData.txt, field 5: see UCD.html)
File: extracted/DerivedEastAsianWidth
Property: East_Asian_Width
Format: valueStyle=short skipUnassigned=Neutral
# East_Asian_Width (listing EastAsianWidth.txt, field 1)
File: extracted/DerivedGeneralCategory
Property: General_Category
Format: valueStyle=short noLabel
File: extracted/DerivedJoiningGroup
Property: Joining_Group
# Joining Group (listing ArabicShaping.txt, field 3)
Format: skipValue=No_Joining_Group
File: extracted/DerivedJoiningType
Property: Joining_Type
# Type T is derived, as described in ArabicShaping.txt
Format: valueStyle=short skipValue=Non_Joining
File: extracted/DerivedLineBreak
Property: Line_Break
Format: valueStyle=short skipUnassigned=Unknown
File: DerivedNormalizationProps
Property: FC_NFKC_Closure
# Derived Property: FC_NFKC_Closure
# Generated from computing: b = NFKC(Fold(a)); c = NFKC(Fold(b));
# Then if (c != b) add the mapping from a to c to the set of
# mappings that constitute the FC_NFKC_Closure list
# Uses the full case folding from CaseFolding.txt, without the T option.
Format: nameStyle=short
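The FC_NFKC_Closure computation described above can be sketched per character as follows. This assumes that Python's `str.casefold()` is an acceptable stand-in for Fold (it applies full case folding without the Turkic T mappings) and that the interpreter's Unicode data is close enough for illustration. The function name `fc_nfkc_closure` is hypothetical.

```python
import unicodedata
from typing import Optional


def fc_nfkc_closure(a: str) -> Optional[str]:
    """Return the closure mapping for `a`, or None when no mapping is needed.

    b = NFKC(Fold(a)); c = NFKC(Fold(b)); a maps to c only if c != b.
    """
    b = unicodedata.normalize("NFKC", a.casefold())
    c = unicodedata.normalize("NFKC", b.casefold())
    return c if c != b else None


if __name__ == "__main__":
    # U+2121 TELEPHONE SIGN has no case folding of its own, but its
    # compatibility decomposition is cased, so a second fold changes it.
    print(fc_nfkc_closure("\u2121"))
    print(fc_nfkc_closure("A"))
```

The second pass matters because NFKC can introduce cased letters that the first fold never saw.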
Property: Full_Composition_Exclusion
# Derived Property: Full_Composition_Exclusion
# Generated from: Composition Exclusions + Singletons + Non-Starter Decompositions
Property: NFD_QuickCheck
# Derived Property: NFD_QuickCheck
# Generated from computing decomposables
Format: nameStyle=short valueStyle=short skipValue=Yes
Property: NFC_QuickCheck
# Derived Property: NFC_QuickCheck
# Generated from computing decomposables (and characters that may compose with previous ones)
Format: nameStyle=short valueStyle=short skipValue=Yes
Property: NFKD_QuickCheck
# Derived Property: NFKD_QuickCheck
# Generated from computing decomposables
Format: nameStyle=short valueStyle=short skipValue=Yes
Property: NFKC_QuickCheck
# Derived Property: NFKC_QuickCheck
# Generated from computing decomposables (and characters that may compose with previous ones)
Format: nameStyle=short valueStyle=short skipValue=Yes
Property: Expands_On_NFD
# Derived Property: Expands_On_NFD
# Generated according to UAX #15.
# Characters whose normalized length is not one.
# WARNING: Normalization of STRINGS must use the algorithm in UAX #15 because characters may interact.
# The length of a normalized string is not necessarily the sum of the lengths of the normalized characters!
Property: Expands_On_NFC
# Derived Property: Expands_On_NFC
# Generated according to UAX #15.
# Characters whose normalized length is not one.
# WARNING: Normalization of STRINGS must use the algorithm in UAX #15 because characters may interact.
# The length of a normalized string is not necessarily the sum of the lengths of the normalized characters!
Property: Expands_On_NFKD
# Derived Property: Expands_On_NFKD
# Generated according to UAX #15.
# Characters whose normalized length is not one.
# WARNING: Normalization of STRINGS must use the algorithm in UAX #15 because characters may interact.
# The length of a normalized string is not necessarily the sum of the lengths of the normalized characters!
Property: Expands_On_NFKC
# Derived Property: Expands_On_NFKC
# Generated according to UAX #15.
# Characters whose normalized length is not one.
# WARNING: Normalization of STRINGS must use the algorithm in UAX #15 because characters may interact.
# The length of a normalized string is not necessarily the sum of the lengths of the normalized characters!
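The Expands_On_* definitions ("characters whose normalized length is not one") and the accompanying warning can be illustrated with a short sketch. It uses Python's `unicodedata.normalize`, whose data version may differ from the one generated here, and measures length in code points. The function name `expands_on` is hypothetical.

```python
import unicodedata


def expands_on(form: str, ch: str) -> bool:
    """True when the single character `ch` normalizes to a length other than one.

    `form` is one of "NFD", "NFC", "NFKD", "NFKC".
    """
    return len(unicodedata.normalize(form, ch)) != 1


if __name__ == "__main__":
    e_acute = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
    fi_lig = "\ufb01"    # LATIN SMALL LIGATURE FI
    print(expands_on("NFD", e_acute), expands_on("NFC", e_acute))
    print(expands_on("NFKC", fi_lig), expands_on("NFC", fi_lig))

    # The warning in practice: each character of "e" + U+0301 has NFC
    # length one, yet the NFC form of the whole string is shorter than two.
    s = "e\u0301"
    per_char = sum(len(unicodedata.normalize("NFC", c)) for c in s)
    whole = len(unicodedata.normalize("NFC", s))
    print(per_char, whole)
```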
File: extracted/DerivedNumericType
Property: Numeric_Type
# Numeric Type (from UnicodeData.txt, field 6/7/8 plus Unihan.txt: see UCD.html)
Format: skipValue=None
File: extracted/DerivedNumericValues
Property: Numeric_Value
# Numeric Values (from UnicodeData.txt, field 6/7/8)
# WARNING: Certain values, such as 0.16666667, are repeating fractions.
# Although they are only printed with a limited number of decimal places
# in this file, they should be expressed to the limits of the precision
# available when used.
Format: sortNumeric
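One way to honor the warning about repeating fractions is to keep values as exact rationals instead of truncated decimals. The sketch below recovers a rational from a printed decimal with `fractions.Fraction.limit_denominator`; the helper name `recover_rational` and the `max_denominator` bound of 1000 are assumptions chosen for illustration, not something this file specifies.

```python
from fractions import Fraction


def recover_rational(printed: str, max_denominator: int = 1000) -> Fraction:
    """Turn a printed numeric value into the nearest small-denominator fraction.

    Accepts either a decimal such as "0.16666667" or a ratio such as "1/6".
    """
    return Fraction(printed).limit_denominator(max_denominator)


if __name__ == "__main__":
    value = recover_rational("0.16666667")
    print(value)         # exact rational
    print(float(value))  # expressed to the limits of float precision
```

Consumers that need a floating-point number can convert at the last moment, so the precision is limited by the target type and not by the number of digits printed in the file.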
File: HangulSyllableType
Property: Hangul_Syllable_Type
Format: valueStyle=short skipValue=Not_Applicable
File: NormalizationTest
Property: SPECIAL
File: PropList
Property: White_Space
Property: Bidi_Control
Property: Join_Control
Property: Dash
Property: Hyphen
Property: Quotation_Mark
Property: Terminal_Punctuation
Property: Other_Math
Property: Hex_Digit
Property: ASCII_Hex_Digit
Property: Other_Alphabetic
Property: Ideographic
Property: Diacritic
Property: Extender
Property: Other_Lowercase
Property: Other_Uppercase
Property: Noncharacter_Code_Point
Property: Other_Grapheme_Extend
Property: Grapheme_Link
Property: IDS_Binary_Operator
Property: IDS_Trinary_Operator
Property: Radical
Property: Unified_Ideograph
Property: Other_Default_Ignorable_Code_Point
Property: Deprecated
Property: Soft_Dotted
Property: Logical_Order_Exception
Property: Other_ID_Start
Property: Other_ID_Continue
Property: STerm
Property: Variation_Selector
Property: Pattern_White_Space
Property: Pattern_Syntax
File: PropertyAliases
Property: SPECIAL
File: PropertyValueAliases
Property: SPECIAL
File: Scripts
Property: Script
Format: nameStyle=none skipUnassigned=Common
File: SpecialCasing
Property: SPECIAL
File: StandardizedVariants
Property: SPECIAL
HackName: noBreak
HackName: Arabic_Presentation_Forms-A
HackName: Arabic_Presentation_Forms-B
HackName: CJK_Symbols_and_Punctuation
HackName: Combining_Diacritical_Marks_for_Symbols
HackName: Enclosed_CJK_Letters_and_Months
HackName: Greek_and_Coptic
HackName: Halfwidth_and_Fullwidth_Forms
HackName: Latin-1_Supplement
HackName: Latin_Extended-A
HackName: Latin_Extended-B
HackName: Miscellaneous_Mathematical_Symbols-A
HackName: Miscellaneous_Mathematical_Symbols-B
HackName: Miscellaneous_Symbols_and_Arrows
HackName: Superscripts_and_Subscripts
HackName: Supplemental_Arrows-A
HackName: Supplemental_Arrows-B
HackName: Supplementary_Private_Use_Area-A
HackName: Supplementary_Private_Use_Area-B
HackName: Canadian-Aboriginal
#HackName: Old-Italic
FinalComments
Note that PropertyAliases sorts by the long name, while PropertyValueAliases
sorts by the short name
ArabicShaping
BidiMirroring
CompositionExclusions
EastAsianWidth
LineBreak
StandardizedVariants
UnicodeData