scuffed-code/icu4c/source/data/unidata/changes.txt

Unicode 4.0.1 update

*** related Jitterbugs

3170 RFE: Update to Unicode 4.0.1
3171 Add new Unicode 4.0.1 properties
3520 use Unicode 4.0.1 updates for break iteration

*** data files & enums & parser code

* file preparation
- ucdstrip: DerivedNormalizationProps.txt, NormalizationTest.txt, DerivedCoreProperties.txt
- ucdstrip and ucdmerge: EastAsianWidth.txt, LineBreak.txt

* file fixes
- fix UnicodeData.txt general categories of Ethiopic digits Nd->No
  according to PRI #26
  http://www.unicode.org/review/resolved-pri.html#pri26
- this fix was undone again because no corrigendum is in sight;
  instead, the tests were modified to not check consistency on this for Unicode 4.0.1

* ucdterms.txt
- update from http://www.unicode.org/copyright.html
  formatted for plain text

* uchar.h & uprops.h & uprops.c & genprops
- add UBLOCK_CYRILLIC_SUPPLEMENT because the block is renamed
- add U_LB_INSEPARABLE due to a spelling fix (the existing constant is spelled U_LB_INSEPERABLE)
  + put short name comment only on line with new constant
    for genpname perl script parser
- new binary properties
  + STerm
  + Variation_Selector

* genpname
- fix genpname perl script so that it doesn't choke on more than 2 names per property value
- perl script: correctly calculate the maximum number of fields per row
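
A minimal sketch of the two genpname fixes above, written in Python rather than
the actual perl script. It assumes semicolon-separated alias rows with '#'
comments; the function names and the sample rows are illustrative only and are
not taken from genpname.

```python
def parse_alias_row(line):
    """Split one alias row into (property, [names...]).

    Returns None for blank or comment-only lines. Any number of names
    per property value is accepted, not just a short and a long name.
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = [f.strip() for f in line.split(";")]
    if len(fields) < 2:
        raise ValueError("expected at least 2 fields: %r" % line)
    return fields[0], fields[1:]


def max_fields(lines):
    """Return the largest number of fields found in any data row."""
    rows = [r for r in map(parse_alias_row, lines) if r is not None]
    # One field for the property plus one per name.
    return max((1 + len(names) for _, names in rows), default=0)


# Illustrative rows (assumed sample data): the second has 3 names.
SAMPLE = [
    "# comment line",
    "sc ; Hrkt ; Katakana_Or_Hiragana",
    "lb ; IN ; Inseparable ; Inseperable",
    "",
]
```

With these rows, the widest row drives the table width, so a parser that
hardcodes two names per value would truncate or reject the second row.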

* uscript.h
- new script code Hrkt=Katakana_Or_Hiragana

* gennorm.c: track changes in DerivedNormalizationProps.txt
- "FNC" -> "FC_NFKC"
- single field "NFD_NO" -> two fields "NFD_QC; N" etc.
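
The format change above can be sketched as a parser that accepts both the old
and the new spelling and always reports the new one. This is a Python
illustration, not the gennorm.c code. The regular expression for the old
single-field names and the sample lines in the usage note are assumptions.

```python
import re

# Old single-field names such as "NFD_NO" or "NFC_MAYBE" (assumed pattern).
_OLD_QC = re.compile(r"^(NFK?[CD])_(NO|MAYBE)$")


def parse_norm_prop(line):
    """Return (start, end, property, value), or None for blank/comment lines.

    Old-style names are mapped to the new two-field form:
    "NFD_NO" -> ("NFD_QC", "N"), "FNC" -> "FC_NFKC".
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = [f.strip() for f in line.split(";")]
    if len(fields) < 2:
        raise ValueError("expected at least 2 fields: %r" % line)
    first, _, last = fields[0].partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    prop = fields[1]
    value = fields[2] if len(fields) > 2 else None
    m = _OLD_QC.match(prop)
    if m:
        prop = m.group(1) + "_QC"
        value = m.group(2)[0]  # "NO" -> "N", "MAYBE" -> "M"
    elif prop == "FNC":
        prop = "FC_NFKC"
    return (start, end, prop, value)
```

For example, "00C0..00C5 ; NFD_NO" and "00C0..00C5 ; NFD_QC; N" both yield the
same tuple, so downstream code only has to handle one naming scheme.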

* genprops/props2.c: track changes in DerivedNumericValues.txt
- changed from 3 columns to 2, dropping the numeric type
  + assume that the numeric type is always Numeric for Han characters,
    and that Han characters are the only additions beyond what UnicodeData.txt lists
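
A hedged sketch of reading both column layouts, in Python rather than the
props2.c code. The field order (code point or range; value; optional type),
the decimal value syntax, and the sample lines in the usage note are
assumptions. The default type mirrors the assumption stated above.

```python
def parse_numeric_line(line, default_type="Numeric"):
    """Return (start, end, value, numeric_type), or None for blank/comment lines.

    Accepts the old 3-column layout (with a type field) and the new
    2-column layout (type dropped, default_type assumed).
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = [f.strip() for f in line.split(";")]
    if len(fields) not in (2, 3):
        raise ValueError("expected 2 or 3 fields: %r" % line)
    first, _, last = fields[0].partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    value = float(fields[1])  # assumes values are written as decimals
    numeric_type = fields[2] if len(fields) == 3 else default_type
    return (start, end, value, numeric_type)
```

For example, "4E00 ; 1.0" is reported with the type "Numeric", while a
3-column line such as "0030 ; 0.0 ; Decimal" keeps its explicit type.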

*** Unicode version numbers
- makedata.mak
- uchar.h
- configure.in

*** tests
- update test of default bidi classes according to PRI #28
  /tsutil/cucdtst/TestUnicodeData
  http://www.unicode.org/review/resolved-pri.html#pri28
- bidi tests: change exemplar character for ES depending on Unicode version
- change hardcoded expected property values where they change

*** other code

* name matching
- read UCD.html

* scripts
- use new Hrkt=Katakana_Or_Hiragana

* ZWJ & ZWNJ
- are now part of combining character sequences
- break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ
ICU-3170 change log for Unicode updates X-SVN-Rev: 15168 2004-05-06 02:47:53 +00:00