Version: 2001-Aug-02
Copyright © 1995-2001 International Business Machines Corporation
and others. All Rights Reserved.
Today's software market is a global one in which it is desirable to develop and maintain one application that supports a wide variety of national languages. International Components for Unicode provides the following tools to help you write language-independent applications:
It is possible to support additional locales by adding more locale data files, with no code changes. Please refer to the POSIX Programmer's Guide for details on what the ISO locale ID means.
This document goes into more detail on how to build and install ICU on your machine. Once you start using ICU, the Where To Find More Information section of this document will be a very helpful resource.
Your comments are important to making this release successful. We are committed to fixing any bugs, and will also use your feedback to help plan future releases.
IMPORTANT: Please make sure you understand the Copyright and License Information.
For more news about this release, see the online release notes.
The ICU 2.0 data has been upgraded to support Unicode 3.1. This means that the character property data and normalization has changed. Recent versions of ICU already supported Unicode 3.0 data with UTF-16 surrogate pairs.
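As an illustration of what UTF-16 surrogate pairs involve, the following is a minimal sketch in standard C of how a supplementary code point (U+10000 and above) maps to a lead and trail surrogate and back. The function names encode_surrogate_pair and decode_surrogate_pair are hypothetical and are not part of the ICU API; ICU provides its own facilities for this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an ICU function): split a supplementary code
 * point (U+10000..U+10FFFF) into a UTF-16 surrogate pair.
 * Returns 1 on success, 0 if cp is outside that range. */
int encode_surrogate_pair(uint32_t cp, uint16_t *lead, uint16_t *trail)
{
    assert(lead != NULL && trail != NULL);
    if (cp < 0x10000 || cp > 0x10FFFF) {
        return 0;
    }
    cp -= 0x10000;                              /* 20 significant bits remain */
    *lead  = (uint16_t)(0xD800 + (cp >> 10));   /* upper 10 bits */
    *trail = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* lower 10 bits */
    return 1;
}

/* Hypothetical helper: recombine a lead/trail surrogate pair. */
uint32_t decode_surrogate_pair(uint16_t lead, uint16_t trail)
{
    assert(lead >= 0xD800 && lead <= 0xDBFF);
    assert(trail >= 0xDC00 && trail <= 0xDFFF);
    return 0x10000 + (((uint32_t)(lead - 0xD800) << 10) |
                      (uint32_t)(trail - 0xDC00));
}
```

For example, U+1D11E encodes as the pair 0xD834, 0xDD1E, and decoding that pair returns 0x1D11E.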
The ICU projects (ICU4C and ICU4J) have changed their licenses from the IPL (IBM Public License) to the X license. The X license is a non-viral and recommended free software license that is compatible with the GNU GPL license. This is effective starting with release 1.8.1 of ICU4C and release 1.3.1 of ICU4J. All previous ICU releases will continue to utilize the IPL. New ICU releases will adopt the X license. The users of previous releases of ICU will need to accept the terms and conditions of the X license in order to adopt the new ICU releases.
The main effect of the change is to provide GPL compatibility. The X license is listed as GPL compatible; see the GNU page at http://www.gnu.org/philosophy/license-list.html#GPLCompatibleLicenses.
The text of the X license is available at http://www.x.org/terms.htm. The IBM version contains the essential text of the license, omitting the X-specific trademarks and copyright notices.
For more details please see the press announcement and the Project FAQ.
The collation framework has been reimplemented to make it faster,
Unicode Collation Algorithm compliant, and to make the locale-specific
collation data smaller (by separating it from the shared UCA data).
Sort keys and even some collation results have changed from ICU 1.6
and ICU 1.7.
For details, see our collation design
document.
There are two ways to download the ICU releases,
For more details on how to download ICU directly from the web site, please also see http://oss.software.ibm.com/icu/download/
Below, $Root is the placement of the icu directory in your file system, like "drive:\...\icu" in your environment. "drive:\..." stands for any drive and any directory on that drive that you chose to install icu into.
readme.html | Describes the International Components for Unicode (this file) |
license.html | Contains IBM's public license |
$Root/source/common/ | The core Unicode and support functionality, such as resource bundles, character properties, locales, codepage conversion, normalization, and UnicodeString. |
$Root/source/i18n/ | Modules in i18n are generally the more data-driven, that is to say resource bundle driven, components. These deal with higher level internationalization issues such as formatting, collation, text break analysis, and transliteration. |
$Root/source/test/intltest/ | A test suite including all C++ APIs. For information about running the test suite, see the users' guide. |
$Root/source/test/cintltst/ | A test suite written in C, including all C APIs. For information about running the test suite, see the users' guide. |
$Root/data/ | This directory contains the source data in text format, which is compiled into binary form during the ICU build process. The output from these files is stored in $Root/source/data/build while awaiting further packaging. |
$Root/source/data | This directory is where the final, packaged version of the ICU binary data ends up. If the ICU_DATA environment variable is used, then it should be set to this directory. The intermediate individual data files (.res, .cnv) are kept in the subdirectory "$Root/source/data/build" prior to packaging. |
$Root/source/tools | Tools for generating the data files. Data files are generated by invoking $Root/source/data/build/makedata.bat on Win32 or $Root/source/make on Unix. |
$Root/source/samples | Various sample programs that use ICU |
$Root/source/extra | Non-supported API additions. Currently, it contains the 'ustdio' file I/O library. |
$Root/source/layout | Contains the ICU layout engine (not a rasterizer). |
$Root/packaging, $Root/debian | These directories contain scripts and tools for packaging the final ICU build for various release platforms. |
$Root/source/config | Contains helper makefiles for platform specific build commands. Used by 'configure'. |
$Root/source/allinone | Contains top-level ICU project files, for instance to build all of ICU under one MSVC project. |
The platform dependencies have been mostly isolated into the following files in the common library. This information can be useful if you are porting ICU to a new platform.
It is possible to build each library individually. They must be built
in the following order:
Operating system | Compiler | Testing frequency |
---|---|---|
Windows 98/NT/2000 | Microsoft Visual C++ 6.0 | Reference platform |
Red Hat Linux 6.1 | gcc 2.91.66 | Reference platform |
AIX 4.3.3 | xlC 3.6.4 | Reference platform |
Solaris 2.6 | Workshop Pro CC 4.2 | Reference platform |
HP/UX 11.01 | aCC A.12.10 | Reference platform |
AIX 5.1.0 L | Visual Age C++ 5.0 | Regularly tested |
Solaris 2.7 | Workshop Pro CC 6.0 | Regularly tested |
Solaris 2.6 | gcc 2.91.66 | Regularly tested |
FreeBSD 4.4 | gcc 2.95.3 | Regularly tested |
HP/UX 11.01 | CC A.03.10 | Regularly tested |
OS/390 (zSeries) | CC | Regularly tested |
AS/400 (iSeries) V5R1 | iCC | Rarely tested |
NetBSD, OpenBSD | | Rarely tested |
SGI/IRIX | | Rarely tested |
PTX | | Rarely tested |
OS/2 | Visual Age | Rarely tested |
Macintosh | | Needs help to port |
Key to testing frequency
Building International Components for Unicode requires:
The steps are:
Using MSDEV At The Command Line Note: You can build ICU from the command line. Assuming that you have properly installed Microsoft Visual C++ to support command line execution, you can run the following command: 'msdev $Root\source\allinone\allinone.dsw /MAKE "ALL"'.
Setting Active Configuration Note: To set the active configuration, two different possibilities are:
Batch Configuration Note: If you want to build the Debug and Release configurations at the same time, choose the "Build" menu and select "Batch Build..." instead (and mark all configurations as checked), then click the button named "Rebuild All". The "all" workspace will build all the test programs as well as the tools for generating binary locale data files. The "makedata" project will be run automatically to convert the locale data files from text format into icudata.dll.
Building International Components for Unicode on Unix requires:
A UNIX C++ compiler (gcc, cc, xlc_r, etc.) installed on the target machine. A recent version of GNU make (3.7+). For a list of OS/390 tools, please see the OS/390 build section of this document.
The steps are:
Some platforms use package management tools to control the installation and uninstallation of files on the system, as well as the integrity of the system configuration. You may want to check if ICU can be packaged for your package management tools by looking into the "packaging" directory. (Please note that if you are using a snapshot of ICU from CVS, it is probable that the packaging scripts or related files are not up to date with the contents of ICU at this time, so use them with caution.)
If you are building on the OS/390 UNIX System Services platform, it is important that you understand a few details:
iconv -f IBM-1047 -t ISO8859-1 uni-text.bin > uni-text.bin
OS/390 supports both native S/390 hexadecimal floating point and
(with Version 2.6 and later) IEEE binary floating point. This is a
compile time option. Applications built with IEEE should use ICU dlls
that are built with IEEE (and vice versa). The environment variable
IEEE390=1 will cause the OS/390 version of ICU to be built with IEEE
floating point. The default is native hexadecimal floating point.
Important: Currently (ICU 1.4.2), native floating point
support is sufficient for codepage conversion, resource bundle and
UnicodeString operations, but the Format APIs, especially
ChoiceFormat, require IEEE binary floating point.
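Because the floating point format is a compile time choice, an application can check which format it was built with before relying on the Format APIs. The following is a minimal sketch using only the standard C header float.h; the function names are hypothetical and are not part of ICU. It assumes that a hexadecimal floating point build reports a radix of 16, while an IEEE binary build reports a radix of 2 with a 53-bit double mantissa.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper (not an ICU function): returns 1 when double looks
 * like IEEE 754 binary64, and 0 otherwise (for example, with a base-16
 * hexadecimal floating point format). */
int double_is_ieee_binary(void)
{
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024;
}

/* Hypothetical helper: stop early if the application was not built with
 * IEEE binary doubles. */
void require_ieee_double(void)
{
    assert(double_is_ieee_binary());
}
```

Calling require_ieee_double() at program start is one way to catch a mismatched build early rather than seeing incorrect formatting results later.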
Examples for configuring ICU:
Debug build: IEEE390=1 ./configure
Release build: CFLAGS=-2 IEEE390=1 ./configure
By default, ICU builds its libraries into the HFS. However, there is a 390-specific switch to build some libraries into PDS files. The switch is the environment variable OS390BATCH, and if set, the following libraries are built into PDS files: libicuucXX.dll, libicudtXXe.dll, libicudtXXe_390.dll, and libtestdata.dll. Turning on OS390BATCH does not turn off the normal HFS build, thus the HFS dlls will always be created.
The names of the PDS files are determined by the value of the environment variables LOADMOD and LOADEXP. These variables must contain the target PDS names whenever the OS390BATCH variable is set. LOADMOD is the library (.dll) target dataset and LOADEXP is the side deck (.x) target dataset.
The PDS member names are as follows:
IXMICUUC --> libicuucXX.dll
IXMICUDA --> libicudtXXe.dll
IXMICUD1 --> libicudtXXe_390.dll
IXMICUTE --> libtestdata.dll
Example PDS attributes are as follows:
Data Set Name . . . : USER.ICU.LOAD
General Data
Management class. . : **None**
Storage class . . . : BASE
Volume serial . . . : TSO007
Device type . . . . : 3390
Data class. . . . . : LOAD
Organization . . . : PO
Record format . . . : U
Record length . . . : 0
Block size . . . . : 32760
1st extent cylinders: 40
Secondary cylinders : 59
Data set name type : PDS

Data Set Name . . . : USER.ICU.EXP
General Data
Management class. . : **None**
Storage class . . . : BASE
Volume serial . . . : TSO007
Device type . . . . : 3390
Data class. . . . . : **None**
Organization . . . : PO
Record format . . . : FB
Record length . . . : 80
Block size . . . . : 3200
1st extent cylinders: 3
Secondary cylinders : 3
Data set name type : PDS
ICU Reference Release 1.8.1 contains partial support for the 400 platform, but additional work by the user is currently needed to get it to build properly. A future release of ICU should work out-of-the-box under OS/400.
CRTLIB LIB(libraryname)
ADDENVVAR ENVVAR(ICU_DATA) VALUE('/icu/source/data')
ADDENVVAR ENVVAR(CC) VALUE('/usr/bin/icc')
ADDENVVAR ENVVAR(CXX) VALUE('/usr/bin/icc')
ADDENVVAR ENVVAR(MAKE) VALUE('/usr/bin/gmake')
ADDENVVAR ENVVAR(OUTPUTDIR) VALUE('libraryname')
libraryname identifies the target AS/400 library for *module, *pgm and *srvpgm objects.
DSPSYSVAL SYSVAL(QUTCOFFSET)
CHGSYSVAL SYSVAL(QUTCOFFSET) VALUE('-0800')
You should change -0800 to -0700 for daylight saving time.
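The offset values shown above follow a signed hhmm pattern. As a small sketch of that arithmetic in standard C, the hypothetical helper below (not part of ICU or OS/400) turns an offset in minutes from UTC into such a string, assuming the sign-plus-four-digits form seen in the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write a signed hhmm offset such as "-0800" into
 * buf, which must hold at least 6 bytes. offset_minutes is the offset
 * from UTC in minutes (negative values are west of UTC). */
void format_utc_offset(int offset_minutes, char buf[6])
{
    int magnitude = abs(offset_minutes);

    assert(magnitude < 24 * 60); /* offsets of a day or more make no sense */
    snprintf(buf, 6, "%c%02d%02d",
             offset_minutes < 0 ? '-' : '+',
             magnitude / 60, magnitude % 60);
}
```

With this sketch, an offset of -480 minutes produces "-0800" and -420 minutes produces "-0700", matching the standard and daylight saving values mentioned above.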
If you are building on the Win32 platform, it is important that you understand a few of the following build details.
As delivered, International Components for Unicode builds as several DLLs. These DLLs are placed in the "icu\bin" directory. You must add this directory to the PATH environment variable in your system, or any executables you build will not be able to access the International Components for Unicode libraries. Alternatively, you can copy the DLL files into a directory already in your PATH, but we do not recommend this, because you can end up with multiple copies of a DLL and load the wrong one.
All the DLLs link with the C runtime library "Debug Multithreaded DLL"
or "Multithreaded DLL." (This is changed through the Project Settings
dialog, on the C/C++ tab, under Code Generation.) It is important that
any executable or other DLL you build which uses the International
Components for Unicode DLLs links with these runtime libraries as well.
If you do not do this, you will get random memory errors when you run the
executable.
If you are building on a Unix platform, it is important that you add the location of your ICU libraries (including the data library) to your LD_LIBRARY_PATH environment variable. The ICU libraries may not link or load properly without doing this.
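One way to diagnose a load failure is to confirm that the ICU library directory really is listed in LD_LIBRARY_PATH. The following is a minimal sketch in standard C; path_list_contains is a hypothetical helper, not an ICU function, and it assumes the usual colon-separated list format with exact string matching (no normalization of trailing slashes or symbolic links).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an ICU function): returns 1 if dir is exactly
 * one of the colon-separated entries of path_list, otherwise 0.
 * A NULL path_list (variable not set) is treated as an empty list. */
int path_list_contains(const char *path_list, const char *dir)
{
    size_t dir_len;
    const char *entry;

    assert(dir != NULL);
    if (path_list == NULL) {
        return 0;
    }
    dir_len = strlen(dir);
    entry = path_list;
    for (;;) {
        const char *end = strchr(entry, ':');
        size_t len = end ? (size_t)(end - entry) : strlen(entry);

        if (len == dir_len && strncmp(entry, dir, len) == 0) {
            return 1;
        }
        if (end == NULL) {
            return 0; /* last entry checked, no match */
        }
        entry = end + 1;
    }
}
```

A program could call path_list_contains(getenv("LD_LIBRARY_PATH"), dir), with getenv from stdlib.h and dir set to wherever the ICU libraries were installed, and print a warning when the result is 0.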
Some deprecated C APIs can be enabled without recompiling the ICU libraries. This can be achieved by defining certain symbols before including the ICU header files. For example, to enable the deprecated C APIs for formatting:
#ifndef U_USE_DEPRECATED_FORMAT_API
#   define U_USE_DEPRECATED_FORMAT_API 1
#endif
#include <stdio.h>
#include "unicode/utypes.h"
#include "unicode/udat.h"

int main(void) {
    UDateFormat *def, *fr;
    UErrorCode status = U_ZERO_ERROR;

    /* With U_USE_DEPRECATED_FORMAT_API defined, udat_open is called here
       in its older six-argument form. */
    fr = udat_open(UDAT_FULL, UDAT_DEFAULT, "fr_FR", NULL, 0, &status);
    if (U_FAILURE(status)) {
        printf("Error creating the French date format using full time style\n %s\n",
               u_errorName(status));
        return 1;
    }
    /* The locale is named explicitly as "en_US" rather than relying on
       the default locale, which may differ from machine to machine. */
    def = udat_open(UDAT_SHORT, UDAT_SHORT, "en_US", NULL, 0, &status);
    if (U_FAILURE(status)) {
        /* handle the error */
        udat_close(fr);
        return 1;
    }
    udat_close(def);
    udat_close(fr);
    return 0;
}
Deprecated C++ APIs cannot be enabled without recompiling the ICU libraries. Every service has a specific symbol that must be defined to enable the deprecated API of that service. For example, to enable the deprecated APIs in the Transliteration service, the U_USE_DEPRECATED_TRANSLITERATOR_API symbol should be defined before compiling ICU.
http://oss.software.ibm.com/icu/ | International Components for Unicode homepage |
http://oss.software.ibm.com/icu/userguide/icufaq.html | Frequently asked questions about ICU |
http://oss.software.ibm.com/icu/download | Download the latest version of ICU and documentation |
http://oss.software.ibm.com/icu/apiref/ | API Documentation in HTML form |
http://oss.software.ibm.com/icu/userguide/ | Draft User's Guide Documentation in HTML form |
http://oss.software.ibm.com/icu/userguide/icu.pdf | Draft User's Guide Documentation in PDF form |
http://www.ibm.com/developer/unicode/ | Information on how to make applications global. |
To submit comments, request features and report bugs, please contact us. The best forum is the ICU mailing list. See the information on how to browse and join the list. If you find a bug in the code that has not been submitted and/or fixed yet, then please submit a jitterbug.
Copyright © 1997-2001 International Business Machines Corporation
and others. All Rights Reserved.
IBM Center for Emerging Technologies Silicon Valley,
10275 N De Anza Blvd., Cupertino, CA 95014
All rights reserved.