This set of charts shows the Unicode Collation Algorithm values for Unicode characters. The characters are arranged in the following groups:
Null | Completely ignorable (at the primary, secondary, and tertiary levels). These include control codes and various formatting codes. |
---|---|
Ignorable | Ignorable at a primary level, but not at a secondary or tertiary level. These include most accents and diacritics. |
Variable | Characters that may be set to ignorable by a programmatic switch. These include spaces, punctuation marks, and most symbols. |
Common | Characters that are none of the above, but not considered letters. These include numbers, currency symbols, etc. |
Letters | According to script |
Unsupported | Not explicitly supported in this version of UCA; uses code-point order |
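
The first three groups can be told apart from the weights alone. The following is a minimal sketch, assuming a collation element is modelled as three integer weights plus a hypothetical `variable` flag; the `Element` type, the function names, and the weights in the usage example are illustrative and are not taken from the UCA data table. Treating a switched-off variable element as fully zero-weighted is a simplification of the "programmatic switch" mentioned above.

```python
from typing import NamedTuple


class Element(NamedTuple):
    """One collation element; a weight of 0 means 'ignorable at that level'."""
    primary: int
    secondary: int
    tertiary: int
    variable: bool = False  # hypothetical flag marking a variable element


def classify(ce: Element) -> str:
    """Map an element to one of the chart groups (simplified)."""
    if ce.primary == 0 and ce.secondary == 0 and ce.tertiary == 0:
        return "Null"
    if ce.primary == 0:
        # Simplification: any element with a zero primary weight lands here.
        return "Ignorable"
    if ce.variable:
        return "Variable"
    return "Common or Letters"


def apply_variable_switch(ce: Element, ignore_variable: bool) -> Element:
    """With the switch on, treat a variable element as completely ignorable."""
    if ignore_variable and ce.variable:
        return Element(0, 0, 0, True)
    return ce
```

For example, with made-up weights, `classify(Element(0, 0x24, 0x02))` yields `"Ignorable"`, and a variable element passed through `apply_variable_switch(..., True)` is then classified as `"Null"`.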
The characters* within each group are arranged in cells. The color of the cell indicates the strength of the difference between that character and the previous character in the chart, as follows.
No Expansion | | Expansion | |
---|---|---|---|
a 0061 | Primary difference | ǳ 01F3 | Primary difference |
á 00E1 | Secondary difference | Ǳ 01F1 | Secondary difference |
A 0041 | Tertiary difference | ǲ 01F2 | Tertiary difference |
Å 212B | Quaternary difference or no difference | | Quaternary difference or no difference |
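
The strength of the difference between two neighbouring characters can be thought of as the first weight level at which their sort keys differ. The sketch below illustrates that idea only; the weights in `ILLUSTRATIVE` are invented for the example and are not real UCA values, and `difference_strength` is a hypothetical helper, not part of any collation library.

```python
from typing import Dict, List, Tuple

# One collation element: (primary, secondary, tertiary) weights.
CollationElement = Tuple[int, int, int]

LEVEL_NAMES = ["primary", "secondary", "tertiary"]


def level_key(elements: List[CollationElement], level: int) -> List[int]:
    """Collect the non-zero weights of one level, in order."""
    return [ce[level] for ce in elements if ce[level] != 0]


def difference_strength(a: List[CollationElement],
                        b: List[CollationElement]) -> str:
    """Return the first level at which the two keys differ."""
    for level, name in enumerate(LEVEL_NAMES):
        if level_key(a, level) != level_key(b, level):
            return name
    return "quaternary or none"


# Hypothetical weights chosen for illustration only.
ILLUSTRATIVE: Dict[str, List[CollationElement]] = {
    "a": [(0x1000, 0x20, 0x02)],
    "á": [(0x1000, 0x20, 0x02), (0x0000, 0x24, 0x02)],  # base + accent
    "A": [(0x1000, 0x20, 0x08)],
    "b": [(0x1010, 0x20, 0x02)],
}

print(difference_strength(ILLUSTRATIVE["a"], ILLUSTRATIVE["á"]))  # secondary
print(difference_strength(ILLUSTRATIVE["a"], ILLUSTRATIVE["A"]))  # tertiary
print(difference_strength(ILLUSTRATIVE["a"], ILLUSTRATIVE["b"]))  # primary
```

Comparing level by level, rather than element by element, is what lets an accent (a zero primary weight) leave the primary comparison untouched while still producing a secondary difference.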
Note: If tool-tips are enabled in your browser, pausing the mouse over any cell shows the name of the character and a representation of its sort key. In this representation, the weight levels are separated by "|".
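
As a rough illustration of a sort-key representation with "|" between the weight levels, the sketch below formats a list of collation elements level by level. The exact layout used in the tool-tips is not specified here, so the four-digit hexadecimal format and the function name `format_sort_key` are assumptions, as are the weights in the example.

```python
from typing import List, Tuple

# One collation element: (primary, secondary, tertiary) weights.
CollationElement = Tuple[int, int, int]


def format_sort_key(elements: List[CollationElement]) -> str:
    """Render the weights level by level, with ' | ' between levels."""
    levels = []
    for level in range(3):
        # Zero weights are ignorable at this level and are left out.
        weights = [ce[level] for ce in elements if ce[level] != 0]
        levels.append(" ".join(f"{w:04X}" for w in weights))
    return " | ".join(levels)


# Hypothetical weights for a base letter followed by an accent.
print(format_sort_key([(0x1000, 0x20, 0x02), (0x0000, 0x24, 0x02)]))
# prints: 1000 | 0020 0024 | 0002 0002
```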
* In some cases, the UCA data table also includes contractions. They can be recognized by the multiple code point numbers, as in the following: ஔ 0B92 0BD7
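
A contraction cell can be recognized mechanically by counting the code point numbers that follow the displayed text. This is a minimal sketch, assuming each cell has the form "display text, space, one or more hexadecimal code points" and that the display text itself contains no space; the helper names are hypothetical.

```python
from typing import List, Tuple


def parse_code_points(field: str) -> List[int]:
    """Parse a field like '0B92 0BD7' into a list of integers."""
    return [int(token, 16) for token in field.split()]


def is_contraction(field: str) -> bool:
    """A cell listing more than one code point is a contraction."""
    return len(parse_code_points(field)) > 1


def split_cell(cell: str) -> Tuple[str, List[int]]:
    """Split a cell like 'a 0061' into (display text, code point list)."""
    display, _, code_field = cell.partition(" ")
    return display, parse_code_points(code_field)


display, code_points = split_cell("ஔ 0B92 0BD7")
print(len(code_points) > 1)  # True: two code points, so a contraction
```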