scuffed-code/tools/unicodetools/com/ibm/text/UCD/confusablesHeader.txt
Mark Davis 65e8ccde28 ICU-0 misc fixes
X-SVN-Rev: 17717
2005-05-27 21:43:46 +00:00

34 lines
2.0 KiB
Plaintext

# Confusables.txt
# Generated: %date%, MED
# This is a draft list of visually confusable characters, for use in conjunction with the
# recommendations in http://www.unicode.org/reports/tr36/
#
# To fold using this list, first perform NFKD (if not already performed),
# then map each source character to the target character(s), then perform NFKD again.
#
# The format the standard Unicode semicolon-delimited hex.
# <source> ; <target> ; <internal_info> # <comment>
#
# The characters may be visually distinguishable in many fonts, or at larger sizes.
# Some anomalies are also introduced by 'closure'. That is, there may be a sequence of
# characters where each is visually confusable from the next, but the start and end are
# visually distinguishable. But when the set is closed, these will all map to together.
#
# This is unlike normalization data. There may be no connection between characters other
# than visual confusability. This data should not be used except in assessing visual confusability.
#
# This list is not limited to Unicode Identifier characters (XID_Continue) although the primary
# application will be to such characters. It is also not limited to lowercase characters,
# although the recommendations are to lowercase for security.
#
# Note that a some characters have unusual characteristics, and are not yet accounted for.
# For example, U+302E (?) HANGUL SINGLE DOT TONE MARK and U+302F (?) HANGUL DOUBLE DOT TONE MARK
# appear to the left of the prevous character. So what looks like "a:b" can actually be "ab\u302F"
#
# WARNING: The data is not final; it is very draft at this point, put together from different
# sources that need to be reviewed for accuracy and completeness of the mappings.
# There are still clear errors in the data; do not use this in any implementations.
# Ignore the internal_info field; it will be removed.
#
# Thanks especially to Eric van der Poel for collecting information about fonts using shared glyphs.
# =================================