changed readme to html
X-SVN-Rev: 16845
parent 7ad388b8d6
commit 8ef16d955f
@@ -1,201 +0,0 @@
/**
*******************************************************************************
* Copyright (C) 1996-2001, International Business Machines Corporation and *
* others. All Rights Reserved. *
*******************************************************************************
*
* $Source: /xsrl/Nsvn/icu/unicodetools/Attic/readme.txt,v $
* $Date: 2003/04/25 01:27:27 $
* $Revision: 1.11 $
*
*******************************************************************************
*/

This file provides instructions for building and running the UnicodeTools, which
can be used to:

- build the Derived Unicode files in the UCD (Unicode Character Database),
- build the transformed UCA (Unicode Collation Algorithm) files needed by ICU,
- run consistency checks on beta releases of the UCD and the UCA,
- build 4 chart folders on the Unicode site.


WARNING!!

- This is NOT production-level code, and should never be used in programs.
- The API is subject to change without notice, and will not be maintained.
- The source is uncommented, and has many warts; since it is not production code,
  it has not been worth the time to clean it up.
- It will probably not work on Unix or Mac without changing the file separator.
- Currently it uses hard-coded directory names.
- The contents of multiple versions of the UCD must be copied to a local directory.


Instructions:

0. You will need to get ICU4J on your system, using CVS.
   The rest of these instructions assume that you have set up CVS to load the ICU4J
   project into C:\ICU4J

   You need both the main icu4j and a subproject called unicodetools. See:

       http://oss.software.ibm.com/icu/develop/cvs.html

   Inside unicodetools, look at com/ibm/text. The main directories of interest are UCD, UCA and utility.

0a. If you are using Eclipse for your IDE, look at the instructions on

       http://oss.software.ibm.com/icu/docs/eclipse_howto/eclipse_howto.htm

   Set up Eclipse to build two projects: ICU4J and UnicodeTools:

       Project Name: ICU4J
       Directory: C:\ICU4J\icu4j
       Default Output Folder: ICU4J/classes

       Project Name: UnicodeTools
       Directory: C:\ICU4J\unicodetools
       Default Output Folder: UnicodeTools/classes

   After Eclipse is set up with these, exclude certain files from UnicodeTools:

       Right-Click UnicodeTools > Properties > Java Build Path > Exclusions
           com/ibm/rbm/
           com/ibm/text/utility/UnicodeMapInt.java
           com/ibm/text/utility/TestUtility.java
           com/ibm/text/UCD/GenerateThaiBreaks-old.java/
           com/ibm/text/UCD/ProcessUnihan.java/
           com/ibm/text/UCA/WriteHTMLCollation.java/

   UnicodeTools must also include the ICU4J project, with

       Right-Click UnicodeTools > Properties > Java Build Path > Projects

1. In UCD, you must edit UCD_Types.java at the top, to set the directories for the build:

       public static final String DATA_DIR = "C:\\DATA\\";
       public static final String UCD_DIR = BASE_DIR + "UCD\\";
       public static final String BIN_DIR = DATA_DIR + "BIN\\";
       public static final String GEN_DIR = DATA_DIR + "GEN\\";

   Make sure that each of these directories exists. Also make sure that the following
   exist:

       <GEN_DIR>/DerivedData
       <GEN_DIR>/DerivedData/ExtractedProperties
       <UCD_DIR>/EXTRAS-Update


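The directory checks in step 1 can be scripted. The following is a minimal sketch
and not part of UnicodeTools: the class DataDirSetup and its ensure method are
hypothetical names, and it assumes that BASE_DIR and DATA_DIR point at the same
root (C:\DATA\ in the default layout). It uses java.nio.file, so it needs a newer
JDK than the one the tools were written for.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of UnicodeTools.
class DataDirSetup {
    // Directories named in step 1, relative to the data root.
    // Assumption: UCD, BIN and GEN all live directly under the same root.
    static final String[] REQUIRED = {
        "UCD",
        "BIN",
        "GEN",
        "GEN/DerivedData",
        "GEN/DerivedData/ExtractedProperties",
        "UCD/EXTRAS-Update",
    };

    // Creates any missing directory under root and returns the ones it created.
    static List<Path> ensure(Path root) throws IOException {
        List<Path> created = new ArrayList<>();
        for (String name : REQUIRED) {
            Path dir = root.resolve(name);
            if (!Files.isDirectory(dir)) {
                Files.createDirectories(dir);
                created.add(dir);
            }
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        // The root defaults to C:\DATA; pass another path as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "C:\\DATA");
        for (Path dir : ensure(root)) {
            System.out.println("created " + dir);
        }
    }
}
```

Running it a second time against the same root creates nothing, so it is safe to
rerun before each build.
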
2. Download all of the UnicodeData files for each version into UCD_DIR.
   The folder names must be of the form "3.2.0-Update", so rename the folders
   downloaded from the Unicode site to this format.


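The folder naming rule in step 2 can be checked mechanically. This is a minimal
sketch, assuming the form is always three numeric fields followed by "-Update";
the class UpdateFolderName is a hypothetical name and not part of UnicodeTools.
The special EXTRAS-Update folder from step 1 is deliberately not covered by the
pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of UnicodeTools.
class UpdateFolderName {
    // Three numeric fields followed by "-Update", e.g. "3.2.0-Update".
    private static final Pattern FORM =
        Pattern.compile("\\d+\\.\\d+\\.\\d+-Update");

    // True if the whole folder name matches the expected form.
    static boolean isValid(String folderName) {
        return FORM.matcher(folderName).matches();
    }

    // Builds the folder name for a given version.
    static String forVersion(int major, int minor, int update) {
        return major + "." + minor + "." + update + "-Update";
    }
}
```
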
2a. If you are downloading any "incomplete" release (one that does not contain
   a complete set of data files for that release), you also need to download the previous
   complete release. Most of the N.M-Update directories are complete, *except*:

       4.0-Update, which does not contain a copy of Unihan.txt and some other files
       3.1-Update, which does not contain a copy of BidiMirroring.txt


2b. If you are building any of the UCA tools, you need to get a copy of the UCA data file
   from http://www.unicode.org/reports/tr10/#AllKeys. The default location for this is:

       BASE_DIR + "Collation\\allkeys" + VERSION + ".txt".

   If you have it in a different location, change the value for KEYS in UCA.java, and
   the value for BASE_DIR.


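To see what the default location in 2b expands to, here is a minimal sketch. It
is not the code in UCA.java: the class AllKeysPath is hypothetical, and the
sample values are assumptions, namely that BASE_DIR is "C:\\DATA\\" and that
VERSION carries its own leading hyphen (inferred from the file name
allkeys-3.1.1.txt in the default directory layout).

```java
// Hypothetical illustration, not the code in UCA.java.
class AllKeysPath {
    // Same concatenation as the expression quoted in step 2b.
    static String keysPath(String baseDir, String version) {
        return baseDir + "Collation\\allkeys" + version + ".txt";
    }

    public static void main(String[] args) {
        // Assumed sample values; adjust to match your own setup.
        System.out.println(keysPath("C:\\DATA\\", "-3.1.1"));
    }
}
```
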
2c. Here is an example of the default directory structure with files:

       C://DATA/

           BIN/

           Collation/
               allkeys-3.1.1.txt

           GEN/
               DerivedData/
                   ExtractedProperties
           UCD/
               3.0.0-Update/
                   Unihan-3.2.0.txt
                   ...
               3.0.1-Update/
                   ...
               3.1.0-Update/
                   ...
               3.1.1-Update/
                   ...
               3.2.0-Update/
                   ...
               4.0.0-Update/
                   ArabicShaping-4.0.0d14b.txt
                   BidiMirroring-4.0.0d1b.txt
                   ...
               EXTRAS-Update/


3. All of the following have "version X" in the options you give to Java (either on the
   command line, or in the Eclipse 'run' options). If you want a specific version
   like 3.1.1, then you would write "version 3.1.1". If you want the latest version (4.0.0),
   you can omit the "version X".


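The defaulting rule in step 3 can be pictured with a small sketch. This is only
an illustration of the rule as described here; the real option handling lives in
the tools' Main classes and may differ. The class VersionOption and the constant
LATEST are hypothetical names.

```java
// Hypothetical illustration; not the option parsing used by the tools.
class VersionOption {
    // Latest version named in this readme.
    static final String LATEST = "4.0.0";

    // Returns the value following "version", or LATEST when the option is absent.
    static String versionFrom(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equalsIgnoreCase("version")) {
                return args[i + 1];
            }
        }
        return LATEST;
    }
}
```

With the options "version 3.1.1 build" this yields 3.1.1; with just "build" it
falls back to 4.0.0.
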
4. Running UCD, you will use com.ibm.text.UCD.Main as your main class.

   The working directory has to be C:\ICU4J\unicodetools\com\ibm\text\UCD

   The same for UCA:
       main: com.ibm.text.UCA.Main
       directory: C:\ICU4J\unicodetools\com\ibm\text\UCA

4a. For each version, the tools build a set of binary data in BIN that contains
   the information for that release. This is done automatically, or you can do it manually
   with the options

       version X build

   This builds a compressed format of all the UCD data (except blocks and Unihan)
   into the BIN directory. Don't worry about the voluminous console messages, unless one says
   "FAIL".

   You have to do this manually if you change any of the data files in that version!!

4b. All of the generated files get a "d" version number, e.g. CaseFolding-4.0.0d3.txt.
   To change the D version on generated files, edit the line in GenerateData.java:

       static final int dVersion = 2; // change to fix the generated file D version. If less than zero, no "d"

   Note: if for any reason you modify the binary format of the BIN files, you also have
   to bump the value in that file:

       static final byte BINARY_FORMAT = 8; // bumped if binary format of UCD changes


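The effect of dVersion on generated file names can be sketched as follows. This
is an illustration of the naming rule described in 4b, not the code in
GenerateData.java; the class GeneratedFileName and its method are hypothetical.

```java
// Hypothetical illustration, not the code in GenerateData.java.
class GeneratedFileName {
    // Appends "d" + dVersion after the Unicode version.
    // A negative dVersion omits the "d" suffix entirely.
    static String fileName(String base, String ucdVersion, int dVersion) {
        String d = dVersion < 0 ? "" : "d" + dVersion;
        return base + "-" + ucdVersion + d + ".txt";
    }
}
```

For example, fileName("CaseFolding", "4.0.0", 3) gives CaseFolding-4.0.0d3.txt,
and a dVersion of -1 gives CaseFolding-4.0.0.txt.
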
4c. To build all of the Unicode files for a particular version X, run

       version X all


4d. To build a particular file, like CaseFolding, use that file name instead of all

       version X CaseFolding


4e. To run basic consistency checking, run:

       version X verify

   Don't worry about any console messages except those that say FAIL.


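Because the console output is voluminous, it can help to filter it down to the
FAIL lines. The following is a minimal sketch and not part of UnicodeTools: the
class FailFilter is a hypothetical name, and it assumes only that failure
messages contain the text "FAIL", as stated above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of UnicodeTools.
class FailFilter {
    // Returns every line of the console output that contains "FAIL".
    static List<String> failures(Reader console) throws IOException {
        List<String> hits = new ArrayList<>();
        BufferedReader in = new BufferedReader(console);
        String line;
        while ((line = in.readLine()) != null) {
            if (line.contains("FAIL")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Reads the tool's console output from standard input.
        for (String line : failures(new InputStreamReader(System.in))) {
            System.out.println(line);
        }
    }
}
```

To use it, save the console output of a build or verify run and feed it to
FailFilter on standard input; an empty result means no FAIL messages were seen.
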
5. Running UCA, you will use com.ibm.text.UCA.Main as your main class.

5a. To build all the UCA files used by ICU, use the option:

       java <UCA>Main ICU

6. To build all the charts, use the UCA project, with options: normalizationChart caseChart scriptChart indexChart