changed readme to html

X-SVN-Rev: 16845
2004-11-12 23:18:01 +00:00 · 2004-11-12 23:18:01 +00:00 · 8ef16d955f
commit 8ef16d955f
parent 7ad388b8d6
1 changed files with 0 additions and 201 deletions
--- a/tools/unicodetools/readme.txt
+++ b/tools/unicodetools/readme.txt
@ -1,201 +0,0 @@
 /**
 *******************************************************************************
 * Copyright (C) 1996-2001, International Business Machines Corporation and    *
 * others. All Rights Reserved.                                                *
 *******************************************************************************
 *
 * $Source: /xsrl/Nsvn/icu/unicodetools/Attic/readme.txt,v $
 * $Date: 2003/04/25 01:27:27 $
 * $Revision: 1.11 $
 *
 *******************************************************************************
 */
 This file provides instructions for building and running the UnicodeTools, which
 can be used to:
 - build the Derived Unicode files in the UCD (Unicode Character Database),
 - build the transformed UCA (Unicode Collation Algorithm) files needed by ICU.
 - run consistency checks on beta releases of the UCD and the UCA.
 - build 4 chart folders on the unicode site
 WARNING!!
 - This is NOT production level code, and should never be used in programs.
 - The API is subject to change without notice, and will not be maintained.
 - The source is uncommented, and has many warts; since it is not production code,
  it has not been worth the time to clean it up.
 - It will probably not work on Unix or Mac without changing the file separator.
 - Currently it uses hard-coded directory names.
 - The contents of multiple versions of the UCD must be copied to a local directory.
 Instructions:
 0. You will need to get ICU4J on your system, using CVS.
 The rest of this will assume that you have set up CVS so that you load the ICU4J project into C:\ICU4J
 You need both the main icu4j and a subproject called unicodetools. See:
 http://oss.software.ibm.com/icu/develop/cvs.html
 Inside unicodetools, look at com/ibm/text. The main directories of interest are UCD, UCA and utility.
 0a. If you are using Eclipse for your IDE, look at the instructions on
 http://oss.software.ibm.com/icu/docs/eclipse_howto/eclipse_howto.htm
 Set up Eclipse to build two projects: ICU4J and UnicodeTools:
 Project Name: ICU4J
 Directory: C:\ICU4J\icu4j
 Default output folder = ICU4J/classes
 Project Name: UnicodeTools
 Directory: C:\ICU4J\unicodetools
 Default Output Folder: UnicodeTools/classes
 After Eclipse is set up with these, exclude certain files from UnicodeTools:
 Right-Click UnicodeTools > Properties > Java Build Path > Exclusions
 com/ibm/rbm/
 com/ibm/text/utility/UnicodeMapInt.java
 com/ibm/text/utility/TestUtility.java
 com/ibm/text/UCD/GenerateThaiBreaks-old.java/
 com/ibm/text/UCD/ProcessUnihan.java/
 com/ibm/text/UCA/WriteHTMLCollation.java/
 UnicodeTools must also include the ICU4J project, with
 Right-Click UnicodeTools > Properties > Java Build Path > Projects
 1. In UCD, you must edit UCD_Types.java at the top, to set the directories for the build:
    public static final String DATA_DIR = "C:\\DATA\\";
    public static final String UCD_DIR = BASE_DIR + "UCD\\";
    public static final String BIN_DIR = DATA_DIR + "BIN\\";
    public static final String GEN_DIR = DATA_DIR + "GEN\\";
 Make sure that each of these directories exist. Also make sure that the following
 exist:
 <GEN_DIR>/DerivedData
 <GEN_DIR>/DerivedData/ExtractedProperties
 <UCD_DIR>/EXTRAS-Update
 2. Download all of the UnicodeData files for each version into UCD_DIR.
 The folder names must be of the form: "3.2.0-Update", so rename the folders on the
 Unicode site to this format.
 2a. If you are downloading any "incomplete" release (one that does not contain
 a complete set of data files for that release, you need to also download the previous
 complete release). Most of the N.M-Update directorys are complete, *except*:
    4.0-Update, which does not contain a copy of Unihan.txt and some other files
    3.1-Update, which does not contain a copy of BidiMirroring.txt
 2b. If you are building any of the UCA tools, you need to get a copy of the UCA data file
 from http://www.unicode.org/reports/tr10/#AllKeys. The default location for this is:
        BASE_DIR + "Collation\\allkeys" + VERSION + ".txt".
 If you have it in a different location, change that value for KEYS in UCA.java, and 
 the value for BASE_DIR
 2c. Here is an example of the default directory structure with files:
 C://DATA/
        BIN/
        Collation/
            allkeys-3.1.1.txt
        GEN/
            DerivedData/
                ExtractedProperties
        UCD/
            3.0.0-Update/
                Unihan-3.2.0.txt
                ...
            3.0.1-Update/
                ...
            3.1.0-Update/
                ...
            3.1.1-Update/
                ...
            3.2.0-Update/
                ...
            4.0.0-Update/
                ArabicShaping-4.0.0d14b.txt
                BidiMirroring-4.0.0d1b.txt
                ...
            EXTRAS-Update/
 3. All of the following have "version X" in the options you give to Java (either on the 
 command line, or in the Eclipse 'run' options. If you want a specific version
 like 3.1.0, then you would write "version 3.1.1". If you want the latest version (4.0.0),
 you can omit the "version X".
 4. Running UCD, you will use com.ibm.text.UCD.Main as your main class.
 The Working directory has to be C:\ICU4J\unicodetools\com\ibm\text\UCD
 The same for UCA:
 main: com.ibm.text.UCD.Main
 directory: C:\ICU4J\unicodetools\com\ibm\text\UCA
 4a. For each version, the tools build a set of binary data in BIN that contain
 the information for that release. This is done automatically, or you can manually do it
 with the options
  version X build
 This builds an compressed format of all the UCD data (except blocks and Unihan)
 into the BIN directory. Don't worry about the voluminous console messages, unless one says
 "FAIL".
 You have to manually do this if you change any of the data files in that version!!
 4b. All of the generated files get a "d" version number, e.g. CaseFolding-4.0.0d3.txt.
 To change the D version on generated files, edit the link in GenerateData.java:
    static final int dVersion = 2; // change to fix the generated file D version. If less than zero, no "d"
 Note: if for any reason you modify the binary format of the BIN files, you also have
 to bump the value in that file:
    static final byte BINARY_FORMAT = 8; // bumped if binary format of UCD changes
 4c. To build all of the Unicode files for a particular version X, run
  version X all
 4d. To build a particular file, like CaseFolding, use that file name instead of all
  version X CaseFolding
 4e. To run basic consistency checking, run:
  version X verify
 Don't worry about any console messages except those that say FAIL.
 5. Running UCA, you will use com.ibm.text.UCA.Main as your main class.
 5a. To build all the UCA files used by ICU, use the option:
    java <UCA>Main ICU
 6. To build all the charts, use the UCA project, with options: normalizationChart caseChart scriptChart indexChart