ReadMe: International Components for Unicode for Java

Version: 2.0 December 05 2001


COPYRIGHT:
Copyright (c) 2001 International Business Machines Corporation and others. All Rights Reserved.


Introduction to ICU4J (International Components for Unicode for Java)

Today's global market demands programs that support a wide variety of languages and national conventions.  Customers prefer software and web pages tailored to their needs; studies confirm that this leads to increased sales.  Java provides a strong foundation for global programs, and IBM and the ICU4J team played a key role in providing globalization technology to Sun for use in Java.

But Java does not yet provide all the features that some products require.  ICU4J is an add-on library that extends Java's globalization technology; its main components are listed under "The Structure and Contents of ICU4J" below.

In some cases, this support has been rolled into a later release of Java. For example, Thai word-break support is now in Java 1.4. However, if you are using Java 1.2, you can use the ICU4J package until you upgrade to 1.4.
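
To illustrate the kind of functionality that overlaps between ICU4J and the JDK, here is a minimal sketch of word-break iteration using the JDK's own java.text.BreakIterator. It is not ICU4J code: the class name WordBreakSketch, the method words, and the sample text are made up for this example, and how well Thai text is segmented depends on the Java release you run it on.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBreakSketch {
    /** Returns the words in text, skipping segments that start with a space or punctuation. */
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> result = new ArrayList<String>();
        int start = it.first();
        // Walk consecutive boundary pairs [start, end) until the iterator is exhausted.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (Character.isLetterOrDigit(segment.charAt(0))) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass new Locale("th") and Thai text to try Thai word breaking on your JDK.
        System.out.println(words("Hello, world", Locale.ENGLISH));
    }
}
```

The break-iteration component listed under $Root/src/com/ibm/text covers the same task for Java releases that lack it; consult the API documentation for its class names.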

License Information

The ICU projects (ICU4C and ICU4J) now use the X license.  The X license is a non-viral, recommended free software license that is compatible with the GNU GPL.  This became effective with release 1.8.1 of ICU4C and release 1.3.1 of ICU4J in mid-2001. All new ICU releases will adopt the X license; previous ICU releases continue to use the IPL (IBM Public License).  Users of previous releases of ICU who want to adopt new ICU releases will need to accept the terms and conditions of the X license.

The main effect of the change is to provide GPL compatibility.  The X license is listed as GPL-compatible; see the GNU page at http://www.gnu.org/philosophy/license-list.html#GPLCompatibleLicenses

The text of the X license is available at http://www.x.org/terms.htm. The IBM version contains the essential text of the license, omitting the X-specific trademarks and copyright notices.

For more details please see the press announcement and the Project FAQ.

Platform Dependencies

Parts of ICU4J depend on functionality that is only available in Java2 (JDK1.2) or later, although some components work under 1.1.  However, all components should be compiled using a Java 1.2.x or 1.3.x compiler, as even components that run using a 1.1.x JVM may require language features that are only present in Java2.  Currently 1.1.x is unsupported and untested, and you use the components on a 1.1.x system at your own risk.

ICU4J is currently not compatible with JDK1.4 (in beta as of this writing). We anticipate adding support for JDK1.4 in a future release.
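
The support statements above can be summarized in a small check against the running JVM. This is only a sketch: JavaVersionCheck and status are hypothetical names, the status strings paraphrase this section, and any version this README does not mention is reported as not covered rather than guessed at.

```java
public class JavaVersionCheck {
    /** Maps a java.version string such as "1.3.1_02" to the status described in this README. */
    static String status(String version) {
        // Split "1.3.1_02" or "1.4.0-beta" into its numeric and qualifier parts.
        String[] parts = version.split("[._-]");
        if (parts.length < 2 || !parts[0].equals("1")) {
            return "not covered by this README";
        }
        String minor = parts[1];
        if (minor.equals("1")) {
            return "unsupported and untested";
        }
        if (minor.equals("2") || minor.equals("3")) {
            return "supported";
        }
        if (minor.equals("4")) {
            return "not yet compatible";
        }
        return "not covered by this README";
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("Java " + version + ": " + status(version));
    }
}
```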

How to Download ICU4J

There are two ways to download the ICU4J releases.

For more details on how to download ICU4J directly from the web site, please also see http://oss.software.ibm.com/icu4j/download/index.html

The Structure and Contents of ICU4J

Below, $Root is the location of the icu4j directory in your file system, for example "drive:\...\icu4j". Here "drive:\..." stands for whichever drive and directory you chose to install icu4j into.

The following files describe the code drop:

readme.html (this file) A description of ICU4J (International Components for Unicode for Java)
license.html The X license, used by ICU4J
build.bat A convenience bat file for building ICU4J with Ant on Windows
build.sh A convenience sh file for building ICU4J with Ant on Unix
build.xml Ant build file. See How to Install and Build for more information
buildall.bat A bat file for building ICU4J with Javac (not recommended)

The source directories mirror the package structure of the code.  They contain source code and data files:

$Root/richeditDist/ Documentation and stubs used to build the RichEdit demo jar file in package com.ibm.richtext.demo.
$Root/src/data/ Various data files used to generate ICU4J classes.  Most of the files contain Unicode information that is available from http://www.unicode.org. Used only by tools and tests in packages com.ibm.tools and com.ibm.test.
$Root/src/com/ibm/data Data files used by the BreakIterator tests and demos in packages com.ibm.demo.rbbi and com.ibm.test.rbbi.
$Root/src/com/ibm/demo Demonstration applications and applets for calendar, holidays, rule-based breakiterators, rule-based number formatters, and transliterators. This package is gradually being deprecated; its contents will be moved to com.ibm.icu.demo.
$Root/src/com/ibm/icu/demo New location of demonstration applications and applets. The classes in com.ibm.demo will eventually be moved here.
$Root/src/com/ibm/icu/internal Contains the class com.ibm.icu.internal.UInfo, which is used in the normalization builder, com.ibm.tools.normalizer.NormalizerBuilder.
$Root/src/com/ibm/icu/test New folder for tests. The classes in package com.ibm.test will eventually be shifted here.
$Root/src/com/ibm/icu/math Mathematical manipulation.
$Root/src/com/ibm/icu/richtext Styled text editing package. This includes demos, tests, and GUIs for editing and displaying styled text. The richtext package provides a scrollable display, typing, arrow-key support, tabs, alignment and justification, word- and sentence-selection (by double-clicking and triple-clicking, respectively), text styles, clipboard operations (cut, copy and paste) and a log of changes for undo-redo. Richtext uses Java's TextLayout and complex text support (provided to Sun by the ICU4J team).
$Root/src/com/ibm/test Tests for various ICU4J components.  For information about running the tests, see $Root/src/com/ibm/test/TestAll.java. The package will gradually be deprecated, and the classes moved to com.ibm.icu.test.
$Root/src/com/ibm/text Main package, containing the following components:
  • Arabic shaping
  • Break iteration
  • Date formatting
  • Number formatting
  • Transliteration
  • Normalization
  • String manipulation
  • String search
  • Unicode compression
  • Unicode character properties
  • Unicode sets
$Root/src/com/ibm/textlayout Text layout, used by the styled text editing package.
$Root/src/com/ibm/tools Various tools used to generate ICU4J classes and data for the following modules:
  • Unicode compression
  • Normalization
  • Rule-based iterators
  • Transliteration
$Root/src/com/ibm/util Calendars, time zones and other utility classes

Building ICU4J creates and populates the following directories:

$Root/classes contains all class files
$Root/doc contains JavaDoc for all packages

Data organization:

Data is stored in various locations in ICU4J:

Where to get Documentation

The complete API documentation is available on the ICU4J web site, http://oss.software.ibm.com/icu4j

How to Install and Build

To install ICU4J, simply place the prebuilt jar file icu4j.jar on your Java CLASSPATH.  No other files are needed.
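
One way to confirm that the jar is visible to your JVM is to try loading a class from it. The sketch below is an illustration, not part of ICU4J: ClasspathCheck and isOnClasspath are made-up names, and the default class name is the richedit demo class that "Trying Out ICU4J" runs from icu4j.jar.

```java
public class ClasspathCheck {
    /** Returns true if the named class can be located on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this class is packaged in icu4j.jar, as the demo command in this README suggests.
        String name = args.length > 0 ? args[0] : "com.ibm.richtext.demo.EditDemo";
        if (isOnClasspath(name)) {
            System.out.println(name + " found");
        } else {
            System.out.println(name + " NOT found; check your CLASSPATH");
        }
    }
}
```

Run it with the same classpath your application uses, for example java -classpath icu4j.jar;. ClasspathCheck on Windows (use : instead of ; as the separator on Unix).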

To build ICU4J, you will need a Java2 JDK and the Ant build system. We strongly recommend using Ant to build ICU4J.

Once the JDK and Ant are installed, building is just a matter of typing ant in the ICU4J root directory. This causes the Ant build system to perform a build as specified by the file build.xml, located in the ICU4J root directory. You can give Ant options like -verbose, and you can specify targets. Ant will only build what's been changed and will resolve dependencies properly. For example:

F:\icu4j>ant tests
Buildfile: build.xml
Project base dir set to: F:\icu4j
Executing Target: core
Compiling 71 source files to F:\icu4j\classes
Executing Target: tests
Compiling 24 source files to F:\icu4j\classes
Completed in 19 seconds

The following are some targets that you can give after ant. For more targets, see the build.xml file:

all Build all targets.
core Build the main class files in the subdirectory classes. If no target is specified, core is assumed.
tests Build the test class files.
demos Build the demos.
tools Build the tools.
docs Run javadoc over the main class files, generating an HTML documentation tree in the subdirectory doc.
jar Create a jar archive icu4j.jar in the root ICU4J directory containing the main class files.
zip Create a zip archive of the source, docs, and jar file for distribution, excluding unwanted things like CVS directories and emacs backup files. The zip file icu4jYYYYMMDD.zip will be created in the directory above the root ICU4J directory, where YYYYMMDD is today's date. Any existing file of that name will be overwritten.
richeditZip Create a zip archive of the richedit docs and the richedit jar file (which contains only the classes needed by richedit) for distribution, excluding unwanted things like CVS directories and emacs backup files. The zip file richedit.zip will be created in the ./richeditDist subdirectory. Any existing file of that name will be overwritten.
zipsrc Like the zip target, without the docs and the jar file. The zip file icu4jsrcYYYYMMDD.zip will be created in the directory above the root ICU4J directory.
clean Remove all built targets, leaving the source.
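
The date-stamped archive names used by the zip and zipsrc targets can be reproduced as follows, for instance to predict which existing file a build will overwrite. This is a sketch of the naming scheme only; ZipNameSketch and zipName are hypothetical names, and it does not show how build.xml itself computes the date.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class ZipNameSketch {
    /** Builds prefix + YYYYMMDD + ".zip" for the given date, using the default time zone. */
    static String zipName(String prefix, Date date) {
        // Locale.US keeps the Gregorian calendar and ASCII digits regardless of the default locale.
        SimpleDateFormat stamp = new SimpleDateFormat("yyyyMMdd", Locale.US);
        return prefix + stamp.format(date) + ".zip";
    }

    public static void main(String[] args) {
        Date today = new Date();
        System.out.println(zipName("icu4j", today));    // name pattern of the zip target
        System.out.println(zipName("icu4jsrc", today)); // name pattern of the zipsrc target
    }
}
```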

For more information, read the Ant documentation and the build.xml file.

After doing a build it is a good idea to run all the icu4j tests by typing
"java -classpath $Root/classes -DUnicodeData=$Root/src/data/unicode com.ibm.test.TestAll".

(If you are allergic to build systems, as an alternative to using Ant you can build by running javac and javadoc directly. This is not recommended. You may have to manually create destination directories.)

Trying Out ICU4J

Note: the demos provided with ICU4J are for the most part undocumented. This list can show you where to look, but you'll have to experiment a bit. The demos (with the exception of richedit) are unsupported and may change or disappear without notice.

The icu4j.jar file contains only the core ICU4J classes, not the demo classes, so unless you build ICU4J there is little to try out. But you can try out the richedit package, since a GUI is included in the core. To run it, type:

java -classpath icu4j.jar com.ibm.richtext.demo.EditDemo [-swing][file]
This will use an AWT GUI, or a Swing GUI if -swing is passed on the command line. It will open a text file if one is provided; otherwise it will open a blank page. Click to type.

You can add tabs to the tab ruler by clicking in the ruler while holding down the control key. Clicking on an existing tab changes between left, right, center, and decimal tabs. Dragging a tab moves it; dragging it off the ruler removes it.

You can experiment with complex text by using the keymap functions. Please note that these are mainly for demo purposes; for real work with Arabic or Hebrew you will want to use an input method. You will need to use a font that supports Arabic or Hebrew. 'Lucida Sans' (provided with Java) supports these languages.

The other demo programs are not supported and exist only to let you experiment with the ICU4J classes. First, build ICU4J using ant all. Then try the demonstration applications in the com.ibm.demo and com.ibm.icu.demo packages.

Where to Find More Information

http://oss.software.ibm.com/icu4j is a pointer to general information about the International Components for Unicode in Java

http://www.ibm.com/developer/unicode is a pointer to information on how to make applications global.

Submitting Comments, Requesting Features and Reporting Bugs

Your comments are important to making ICU4J successful.  We are committed to fixing any bugs, and will use your feedback to help plan future releases.

To submit comments, request features and report bugs, contact us through the ICU4J mailing list.
While we are not able to respond individually to each comment, we do review all comments.

Thanks for your interest in ICU4J!


Copyright © 2001 International Business Machines Corporation and others. All Rights Reserved.
10275 N De Anza Blvd., Cupertino, CA 95014