65e8ccde28
X-SVN-Rev: 17717
34 lines
2.0 KiB
Plaintext
# Confusables.txt
# Generated: %date%, MED
# This is a draft list of visually confusable characters, for use in conjunction with the
# recommendations in http://www.unicode.org/reports/tr36/
#
# To fold using this list, first perform NFKD (if not already performed),
# then map each source character to the target character(s), then perform NFKD again.
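#
# The folding steps above can be sketched in Python as follows. This is only a
# minimal illustration: SAMPLE_MAP and fold() are hypothetical names, and the two
# sample mappings are assumptions for demonstration, not entries taken from this list.

```python
import unicodedata

# Hypothetical two-entry table for illustration only; a real table would be
# built from the data lines of this file.
SAMPLE_MAP = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
    "\u03BF": "o",  # GREEK SMALL LETTER OMICRON -> LATIN SMALL LETTER O
}

def fold(text, mapping):
    """Apply NFKD, map each character through the table, then apply NFKD again."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Characters without an entry map to themselves.
    mapped = "".join(mapping.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFKD", mapped)

folded = fold("p\u0430yp\u0430l", SAMPLE_MAP)
```

# The second NFKD pass matters because a target sequence may itself be decomposable.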
#
# The format is the standard Unicode semicolon-delimited hex.
# <source> ; <target> ; <internal_info> # <comment>
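#
# A data line in this format could be read roughly as shown below. parse_line() is a
# hypothetical helper, and the sample line is made up for illustration. The sketch
# assumes code points are space-separated hex values and that '#' starts a comment.

```python
def parse_line(line):
    """Return (source, target, internal_info), or None for blank or comment-only lines."""
    # Drop the trailing comment, if any.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 2:
        raise ValueError("expected at least '<source> ; <target>': %r" % line)
    # Each field holds one or more hex code points separated by spaces.
    source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
    target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
    info = fields[2] if len(fields) > 2 else ""
    return source, target, info

# Made-up sample line, not taken from the data.
entry = parse_line("0430 ; 0061 ; XX # sample entry")
```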
#
# The characters may be visually distinguishable in many fonts, or at larger sizes.
# Some anomalies are also introduced by 'closure'. That is, there may be a sequence of
# characters where each is visually confusable with the next, but the start and end are
# visually distinguishable. But when the set is closed, these will all map together.
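#
# The effect of closure can be illustrated with a small union-find sketch. This is one
# possible way to compute it, not necessarily how this list was generated.
# close_pairs() is a hypothetical name, and "A", "B", "C" are placeholders, not real data.

```python
def close_pairs(pairs):
    """Group items into classes under the transitive closure of 'confusable with'."""
    parent = {}

    def find(x):
        # Locate the class representative, shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for item in list(parent):
        classes.setdefault(find(item), set()).add(item)
    return list(classes.values())

# "A" resembles "B" and "B" resembles "C", so all three end up in one class,
# even if "A" and "C" are easy to tell apart.
groups = close_pairs([("A", "B"), ("B", "C")])
```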
#
# This is unlike normalization data. There may be no connection between characters other
# than visual confusability. This data should not be used except in assessing visual confusability.
#
# This list is not limited to Unicode Identifier characters (XID_Continue) although the primary
# application will be to such characters. It is also not limited to lowercase characters,
# although the recommendations are to lowercase for security.
#
# Note that some characters have unusual characteristics, and are not yet accounted for.
# For example, U+302E (〮) HANGUL SINGLE DOT TONE MARK and U+302F (〯) HANGUL DOUBLE DOT TONE MARK
# appear to the left of the previous character. So what looks like "a:b" can actually be "ab\u302F".
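#
# The tone-mark case can be inspected with Python's unicodedata module. This is only a
# sketch: the variable names are hypothetical, and how the string actually renders
# depends on the font and the rendering engine.

```python
import unicodedata

plain = "a:b"          # three characters, with a real colon in the middle
spoof = "ab\u302F"     # "a", "b", then the tone mark, which may be drawn to the left of "b"

# The two strings differ as code point sequences even if they look alike on screen.
differs = plain != spoof

# NFKD does not remove or reorder the mark here, so normalization alone
# does not expose the problem.
unchanged = unicodedata.normalize("NFKD", spoof) == spoof

mark_name = unicodedata.name("\u302F")
```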
#
# WARNING: The data is not final; it is very much a draft at this point, put together from different
# sources that need to be reviewed for accuracy and completeness of the mappings.
# There are still clear errors in the data; do not use this in any implementations.
# Ignore the internal_info field; it will be removed.
#
# Thanks especially to Eric van der Poel for collecting information about fonts using shared glyphs.
# =================================