Unicode 4.0.1 update

*** related Jitterbugs

3170 RFE: Update to Unicode 4.0.1
3171 Add new Unicode 4.0.1 properties
3520 use Unicode 4.0.1 updates for break iteration

*** data files & enums & parser code

* file preparation
- ucdstrip: DerivedNormalizationProps.txt, NormalizationTest.txt, DerivedCoreProperties.txt
- ucdstrip and ucdmerge: EastAsianWidth.txt, LineBreak.txt

* file fixes
- fix UnicodeData.txt general categories of Ethiopic digits Nd->No
  according to PRI #26
  http://www.unicode.org/review/resolved-pri.html#pri26
- undone again because no corrigendum in sight;
  instead modified tests to not check consistency on this for Unicode 4.0.1

* ucdterms.txt
- update from http://www.unicode.org/copyright.html
  formatted for plain text

* uchar.h & uprops.h & uprops.c & genprops
- add UBLOCK_CYRILLIC_SUPPLEMENT because the block is renamed
- add U_LB_INSEPARABLE due to a spelling fix
  + put short name comment only on line with new constant
    for genpname perl script parser
- new binary properties
  + STerm
  + Variation_Selector

* genpname
- fix genpname perl script so that it doesn't choke on more than 2 names per property value
- perl script: correctly calculate the maximum number of fields per row
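
Sketch of the field-count calculation described above, written in Python rather than the actual Perl. It assumes semicolon-separated rows with optional '#' comments; the function names and the sample rows are hypothetical and are not taken from the genpname script.

```python
def split_fields(line):
    """Drop a trailing '#' comment, then split a UCD-style row on ';'."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return []  # blank or comment-only line carries no fields
    return [field.strip() for field in data.split(";")]


def max_fields(lines):
    """Return the largest field count over all rows (0 if there are none)."""
    return max((len(split_fields(line)) for line in lines), default=0)


# Illustrative rows only; the second one has more than 2 names for one value.
SAMPLE = [
    "# property ; short name ; long name ; further aliases",
    "lb ; AL ; Alphabetic",
    "lb ; IN ; Inseparable ; Inseperable",
    "",
]

if __name__ == "__main__":
    print(max_fields(SAMPLE))  # 4
```

Scanning every row before sizing the table avoids hardcoding "property + 2 names", which is the assumption that breaks once a value has a third alias.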

* uscript.h
- new script code Hrkt=Katakana_Or_Hiragana

* gennorm.c track changes in DerivedNormalizationProps.txt
- "FNC" -> "FC_NFKC"
- single field "NFD_NO" -> two fields "NFD_QC; N" etc.
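
Sketch of a parser that accepts both spellings of the quick-check rows, in Python rather than the C of gennorm.c. Only "NFD_NO" -> "NFD_QC; N" is stated above; the "_MAYBE" -> "M" mapping, the function name, and the sample rows are assumptions made for illustration.

```python
# Old single-field suffixes mapped to the new value field.
# "NO" comes from the note above; "MAYBE" is an assumed analogue.
OLD_SUFFIXES = {"NO": "N", "MAYBE": "M"}


def parse_quick_check(line):
    """Return (code_range, form, value) for a quick-check row, else None."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 2:
        return None
    code_range, prop = fields[0], fields[1]
    # New layout: two fields after the code range, e.g. "NFD_QC; N".
    if prop.endswith("_QC") and len(fields) >= 3:
        return code_range, prop[:-len("_QC")], fields[2]
    # Old layout: one combined field, e.g. "NFD_NO".
    form, sep, suffix = prop.partition("_")
    if sep and suffix in OLD_SUFFIXES:
        return code_range, form, OLD_SUFFIXES[suffix]
    return None  # some other property, e.g. FC_NFKC


if __name__ == "__main__":
    print(parse_quick_check("00C0..00C5 ; NFD_NO"))
    print(parse_quick_check("00C0..00C5 ; NFD_QC; N # comment"))
```

Both calls yield the same tuple, so code downstream of the parser does not need to know which file layout was read.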

* genprops/props2.c track changes in DerivedNumericValues.txt
- changed from 3 columns to 2, dropping the numeric type
  + assume that the type is always numeric for Han characters,
    and that only those are added in addition to what UnicodeData.txt lists
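
Sketch of reading a numeric-values row in either layout, in Python rather than the C of props2.c. The column order (code point, optional type, value), the type label strings, and the function name are assumptions; the note above only says that the type column was dropped and that "numeric" is assumed when it is missing.

```python
from fractions import Fraction


def parse_numeric_row(line, default_type="numeric"):
    """Return (code, numeric_type, value) or None for blank/comment lines.

    Three fields: the type is read from the row (assumed old layout).
    Two fields: the type is missing, so default_type is used instead.
    """
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) == 3:
        code, num_type, value = fields
    elif len(fields) == 2:
        code, value = fields
        num_type = default_type
    else:
        raise ValueError("unexpected number of fields: %d" % len(fields))
    # Fraction accepts "3", "0.5" and "1/2", so no separate fraction parsing.
    return code, num_type, Fraction(value)


if __name__ == "__main__":
    print(parse_numeric_row("3007 ; 0"))
    print(parse_numeric_row("00BD ; numeric ; 1/2"))
```

Defaulting the type is only safe under the assumption stated above, namely that the extra entries not listed in UnicodeData.txt are all Han characters.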

*** Unicode version numbers
- makedata.mak
- uchar.h
- configure.in

*** tests
- update test of default bidi classes according to PRI #28
  /tsutil/cucdtst/TestUnicodeData
  http://www.unicode.org/review/resolved-pri.html#pri28
- bidi tests: change exemplar character for ES depending on Unicode version
- change hardcoded expected property values where they change

*** other code

* name matching
- read UCD.html

* scripts
- use new Hrkt=Katakana_Or_Hiragana

* ZWJ & ZWNJ
- are now part of combining character sequences
- break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ