<!DOCTYPE html public "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>ICU Coding Guidelines</title>
<meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF">
 <h1>International Components for Unicode</h1>
 <h2>ICU Coding Guidelines</h2>
  <hr>
 <p>This page describes guidelines for writing code for the International
   Components for Unicode and how to add it to the project.</p>
 <ul>
   <li><a href="#coding">General Guidelines</a></li>
   <li><a href="#addfiles">Adding files to ICU</a></li>
   <li><a href="#tools">Build Tools</a></li>
 </ul>
 <a name="coding"></a>
 <h2 align="center">General Guidelines</h2>
 <ul>
   <li>Constants (#define, enum items, const): uppercase. <code>UBREAKITERATOR_DONE</code>,
     <code>UBIDI_DEFAULT_LTR</code>, <code>ULESS</code>.</li>
   <li>Variables and functions: mixed-case, starting with lowercase. <code>getLength()</code>.</li>
   <li>Types (class, struct, enum, union): mixed-case, starting with uppercase.
     <code>class DateFormatSymbols</code>.</li>
    <li>Accessors: we name them <code>getProperty()</code> and <code>setProperty()</code>.</li>
    <li>Length and indexed access: <code>getLength()</code>, <code>getSomethingAt(index/offset)</code>.</li>
   <li>Where we return a number of items, it is <code>countItems()</code>
     - <i>not</i> getItemCount() (even if we do not need to actually <i>count</i>
     in the implementation of that member function).</li>
   <li>Ranges of indexes: we specify a range of indexes by having <i>start</i>
     and <i>limit</i> parameters, with names or suffixes like that. Such
     a range contains indexes from start to limit-1, i.e., it is an interval
     that is left-closed and right-open. Mathematically, [start..limit[.</li>
    <li>Functions that take a buffer (pointer) and a length argument may accept
      a special length value that tells the function to determine the length
      of the input itself (for text, by calling u_strlen()). That special value
      should be -1. Any other negative or undefined value constitutes an error.</li>
   <li>Primitive types: they are defined by utypes.h or a header file that
     it includes. The most common ones are uint8_t, uint16_t, uint32_t, int8_t,
     int16_t, int32_t, UTextOffset (signed), and UErrorCode.</li>
   <li>File names (.h, .c, .cpp, data files, etc.): 8.3, all lowercase.</li>
   <li>Language extensions and standards: do not use any features (language
     extensions or library functions) that are proprietary and do not work
     on all C or C++ compilers.<br>
      For example, in Microsoft Visual C++, go to Project Settings (Alt-F7) -&gt; All
      Configurations -&gt; C/C++ -&gt; Customize, and set Disable Language Extensions.</li>
    <li>Tabs and indentation: do not use tab characters (\x09); save files with spaces
      instead.<br>
      Indent by 4 spaces per level.</li>
    <li>Documentation: We use Javadoc-style in-file documentation comments, processed with <a href="http://www.zib.de/Visual/software/doc++/index.html">doc++</a>.
      A printed version of the doc++ manual is in our "library", and our source
      files follow its conventions.</li>
    <li>Do not place multiple statements on one line. Do not put the body of
      an if() or a loop on the same line as its head.</li>
    <li>Placement of curly braces {}: each of us subscribes to a different
      philosophy. Please try to do it reasonably and consistently. It is
      a good idea to use the style of a file when you modify it, instead of
      mixing in your favorite style.<br>
      The one thing that we ask for is that if() and loop bodies always have
      curly braces.</li>
   <li>Function declarations should have one line with the return type and
     all the import, extern, and export declarations, and the function name
     and signature at the beginning of the next line.
     <pre>
U_CAPI int32_t U_EXPORT2
u_formatMessage(...);
</pre>
   </li>
    <li>Error codes: The ICU API functions for both C and C++ use a
      pointer or a reference to UErrorCode wherever we expect that things could
      go wrong.
     <p>In C, this is a pointer, and it must be checked for <code>NULL</code>.
       In C++, this is a reference. In both cases, it must be checked for
       an error code already being in there:</p>
     <pre>
U_CAPI const UBiDiLevel * U_EXPORT2
ubidi_getLevels(UBiDi *pBiDi, UErrorCode *pErrorCode) {
    UTextOffset start, length;

    if(pErrorCode==NULL || U_FAILURE(*pErrorCode)) {
        return NULL;
    } else if(pBiDi==NULL || (length=pBiDi->length)<=0) {
        *pErrorCode=U_ILLEGAL_ARGUMENT_ERROR;
        return NULL;
    }

    ...
    return result;
}
</pre>
     This prevents the API function from doing anything on data that is not
     valid in a chain of function calls and relieves the caller from checking
     the error code after each call.</li>
   <li>Decide whether your module is part of the "common" or the "i18n" API
     collection. Use the appropriate macros, like <code>U_COMMON_IMPLEMENTATION</code>,
     <code>U_I18N_IMPLEMENTATION</code>, <code>U_COMMON_API</code>, <code>U_I18N_API</code>.
     See <code>utypes.h</code>.</li>
   <li>If we have the same module in C and in C++, then there will be two
     header files, one for each language, even if one uses the other. For
     example, ubidi.h for C and bidi.h for C++.</li>
   <li>
     <p>Platform dependencies are dealt with in the header files that <code>utypes.h</code>
       includes. They are <code>platform.h</code> and its more specific cousins
       like <code>pwin32.h</code> for Windows, which define basic types,
       and <code>putil.h</code>, which defines platform utilities.</p>
      <p><strong>Important: </strong>Outside of these files, and a small number
      of implementation files that depend on platform differences (like <code>umutex.c</code>),
      <i>no</i> ICU source code may have <i>any</i> <code>#ifdef <i>OperatingSystemName</i></code>
      directives.</p>
   </li>
    <li>For mutual-exclusion (mutex) blocks, there should be no function
      calls within a mutex block. The idea behind this is to prevent deadlocks
      from occurring later. There should be as little code inside a mutex block
      as possible to minimize the performance degradation from blocked threads.
    </li>
 </ul>
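  <p>The range, length, and error code conventions above can be sketched together
    as follows. This is only an illustration, not ICU code: <code>ExErrorCode</code>,
    <code>ExChar</code>, <code>exFailure()</code>, <code>exStrlen()</code>,
    <code>exCountInRange()</code>, and <code>exCountTwoRanges()</code> are hypothetical
    stand-ins so that the sketch compiles without the ICU headers. Real code would
    use <code>UErrorCode</code>, <code>U_FAILURE()</code>, and <code>u_strlen()</code>.</p>

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-ins so that the sketch compiles without ICU headers.
// Real code uses UErrorCode, U_FAILURE(), and u_strlen() instead.
enum ExErrorCode {
    EX_ZERO_ERROR = 0,
    EX_ILLEGAL_ARGUMENT_ERROR = 1
};

typedef uint16_t ExChar;

static bool
exFailure(ExErrorCode code) {
    return code > EX_ZERO_ERROR;
}

// Stand-in for u_strlen(): counts 16-bit units before the terminating 0.
static int32_t
exStrlen(const ExChar *s) {
    int32_t length = 0;

    while (s[length] != 0) {
        ++length;
    }
    return length;
}

// Counts the occurrences of c at indexes start..limit-1 of s.
// A length of -1 asks the function to determine the length itself;
// any other negative length is an error.
static int32_t
exCountInRange(const ExChar *s, int32_t length,
               int32_t start, int32_t limit,
               ExChar c, ExErrorCode *pErrorCode) {
    int32_t i, count = 0;

    if (pErrorCode == NULL || exFailure(*pErrorCode)) {
        return 0;
    }
    if (s == NULL || length < -1) {
        *pErrorCode = EX_ILLEGAL_ARGUMENT_ERROR;
        return 0;
    }
    if (length == -1) {
        length = exStrlen(s);
    }
    if (start < 0 || start > limit || limit > length) {
        *pErrorCode = EX_ILLEGAL_ARGUMENT_ERROR;
        return 0;
    }
    for (i = start; i < limit; ++i) {
        if (s[i] == c) {
            ++count;
        }
    }
    return count;
}

// Usage: a chain of calls shares one error code that is checked once at the end.
// Returns -1 if any call in the chain failed.
static int32_t
exCountTwoRanges(const ExChar *s, ExChar c) {
    ExErrorCode errorCode = EX_ZERO_ERROR;
    int32_t total;

    total = exCountInRange(s, -1, 0, 2, c, &errorCode);
    total += exCountInRange(s, -1, 2, 5, c, &errorCode);
    if (exFailure(errorCode)) {
        return -1;
    }
    assert(total >= 0 && total <= 5);
    return total;
}
```

  <p>Because each call returns immediately when the error code already indicates
    a failure, the caller in this sketch checks it only once, after the last call.</p>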
 <h3>C Guidelines</h3>
 <ul>
   <li>Since we don't have classes to subdivide the namespace, we use prefixes
     to avoid name collisions. Some of those prefixes contain a 3- (or sometimes
     4-) letter module identifier. Very general names like <code>u_charDirection()</code>
     don't have a module identifier in their prefix.
     <ul>
       <li>For POSIX replacements, we prepend the (all lowercase) POSIX function
         name with "u_": <code>u_strlen()</code>.</li>
       <li>For other API functions, we prepend a 'u', the module identifier
         (if appropriate), and an underscore '_', followed by the <i>mixed-case</i>
         function name: <code>u_charDirection()</code>, <code>ubidi_setPara()</code>.</li>
       <li>For types (struct, enum, union), we prepend a "U", often "U&lt;module
         identifier>" directly to the typename, without an underscore. <code>UComparisonResult</code>.</li>
        <li>For <code>#define</code>d constants and macros, we prepend a
          "U_", often "U&lt;module identifier>_" with an underscore to the
          uppercase macro name. <code>U_ZERO_ERROR</code>, <code>U_SUCCESS()</code>.</li>
     </ul>
   </li>
   <li>Function declarations need to be of the form <code>U_CAPI return-type
     U_EXPORT2</code> to satisfy all compilers' needs.</li>
    <li>Functions that roughly compare to constructors and destructors are
      called <code>umod_open()</code> and <code>umod_close()</code>, where "mod"
      stands for the module identifier.
     <pre>
U_CAPI UBiDi * U_EXPORT2
ubidi_open();

U_CAPI UBiDi * U_EXPORT2
ubidi_openSized(UTextOffset maxLength, UTextOffset maxRunCount);

U_CAPI void U_EXPORT2
ubidi_close(UBiDi *pBiDi);
</pre>
   </li>
    <li>In cases like BreakIterator and NumberFormat, instead of having a
      separate 'open' API for each kind of instance, use an enum selector.</li>
   <li>File names begin with a "u".</li>
   <li>For memory allocation in C implementation files for ICU, the functions/macros
     in <code>cmemory.h</code> must be used. </li>
 </ul>
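  <p>As an illustration of the naming, open/close, and enum selector guidelines
    above, here is a minimal sketch for an invented module with the identifier
    "exm". None of these names exist in ICU. The error code type is a stand-in
    for <code>UErrorCode</code>, <code>malloc()</code> and <code>free()</code>
    stand in for the <code>cmemory.h</code> functions that real ICU code must
    use, and the <code>U_CAPI</code>/<code>U_EXPORT2</code> decorations are
    left out so that the sketch compiles on its own.</p>

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical stand-ins for UErrorCode and U_FAILURE().
enum ExErrorCode {
    EX_ZERO_ERROR = 0,
    EX_ILLEGAL_ARGUMENT_ERROR = 1,
    EX_MEMORY_ALLOCATION_ERROR = 2
};
#define EX_FAILURE(x) ((x) > EX_ZERO_ERROR)

// Types: "U" + module identifier + mixed-case name, without an underscore.
// Constants: "U<module identifier>_" + uppercase name.
enum UExmKind {
    UEXM_CHARACTER,
    UEXM_WORD,
    UEXM_KIND_COUNT
};

struct UExm {
    UExmKind kind;
    int32_t position;
};

// One "constructor" with an enum selector instead of one open function per kind.
UExm *
uexm_open(UExmKind kind, ExErrorCode *pErrorCode) {
    UExm *pExm;

    if (pErrorCode == NULL || EX_FAILURE(*pErrorCode)) {
        return NULL;
    }
    if (kind != UEXM_CHARACTER && kind != UEXM_WORD) {
        *pErrorCode = EX_ILLEGAL_ARGUMENT_ERROR;
        return NULL;
    }
    // malloc() is a stand-in; ICU code must allocate through cmemory.h.
    pExm = (UExm *)malloc(sizeof(UExm));
    if (pExm == NULL) {
        *pErrorCode = EX_MEMORY_ALLOCATION_ERROR;
        return NULL;
    }
    pExm->kind = kind;
    pExm->position = 0;
    return pExm;
}

// Functions: 'u' + module identifier + '_' + mixed-case name.
int32_t
uexm_getPosition(const UExm *pExm) {
    return pExm->position;
}

// The "destructor" that pairs with uexm_open().
void
uexm_close(UExm *pExm) {
    free(pExm);
}

// Usage: every successful open is paired with a close.
// Returns -1 if the instance could not be opened.
static int32_t
exOpenAndClose(UExmKind kind) {
    ExErrorCode errorCode = EX_ZERO_ERROR;
    int32_t position;
    UExm *pExm = uexm_open(kind, &errorCode);

    if (EX_FAILURE(errorCode)) {
        return -1;
    }
    assert(pExm != NULL);
    position = uexm_getPosition(pExm);
    uexm_close(pExm);
    return position;
}
```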
 <h3>C++ Guidelines</h3>
 <ul>
   <li>Classes and their members do not need a "U" or any other prefix.</li>
    <li>Class APIs need to be declared like <code>class U_I18N_API SimpleDateFormat</code>
      or like <code>class U_COMMON_API UCharCharacterIterator</code>.</li>
   <li>Class member functions should be only declared, not inline-implemented,
     in the class declaration. Inline implementations may follow after the
     class declaration in the same file.</li>
    <li>To test whether the code is being compiled as C++, we use <code>XP_CPLUSPLUS</code>,
      not <code>__cplusplus</code>.</li>
   <li>We do not use exceptions, and we do not use templates (at least on
     the API).</li>
   <li>File names do not begin with a "u".</li>
   <li>
     <h4>Adoption of objects:</h4>
     Some constructors and factory functions take pointers to objects that
      they <i>adopt</i>. This means that the newly created object will contain
      a pointer to the adoptee and take over its ownership and lifecycle control.
     If an error occurs while creating the new object - and thus in the code
     that adopts an object - then the semantics used within ICU must be <i>adopt-on-call</i>
     (as opposed to, e.g., adopt-on-success):
     <ul>
       <li>General: A constructor or factory function that adopts an object
         does so in all cases, even if an error occurs and a <code>UErrorCode</code>
         is set. This means that either the adoptee is deleted immediately
         or its pointer is stored in the new object. The former case is most
         common when the constructor or factory function is called and the
         <code>UErrorCode</code> already indicates a failure. In the latter
         case, the new object must take care of deleting the adoptee once
         it is deleted itself regardless of whether the constructor was successful.</li>
       <li>Constructors: The code that creates the object with the <code>new</code>
         operator must check the resulting pointer returned by <code>new</code>
         and delete any adoptees if it is <code>0</code> because the constructor
         was not called. (Typically, a <code>UErrorCode</code> must be set
         to <code>U_MEMORY_ALLOCATION_ERROR</code>.)</li>
       <li>Factory functions (<code>createInstance()</code>): The factory
         function must set a <code>U_MEMORY_ALLOCATION_ERROR</code> and delete
         any adoptees if it cannot allocate the new object. If the construction
         of the object fails otherwise, then the factory function must delete
         it - and it in turn must delete its adoptees. As a result, a factory
         function always returns either a valid object and a successful <code>UErrorCode</code>,
         or a <code>0</code> pointer and a failure <code>UErrorCode</code>.<br>
         Example:
         <pre>
Calendar*
Calendar::createInstance(TimeZone* zone, UErrorCode& errorCode) {
    if(U_FAILURE(errorCode)) {
        delete zone;
        return 0;
    }
    // since the Locale isn't specified, use the default locale
    Calendar* c = new GregorianCalendar(zone, Locale::getDefault(), errorCode);
    if(c == 0) {
        errorCode = U_MEMORY_ALLOCATION_ERROR;
        delete zone;
    } else if(U_FAILURE(errorCode)) {
        delete c;
        c = 0;
    }
    return c;
}
</pre>
        </li>
      </ul>
   </li>
   <li>For memory allocation in C++ implementation files for ICU, the standard
     <code>new</code> and <code>delete</code> operators (or the C functions/macros
     in <code>cmemory.h</code>) must be used. </li>
 </ul>
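  <p>The adopt-on-call rules can be checked with a small self-contained sketch.
    The classes <code>ExZone</code> and <code>ExCalendar</code>, the
    <code>ExErrorCode</code> type, and the helper <code>exCreateAndDestroy()</code>
    are invented for this illustration and are not ICU APIs. The adoptee counts
    its live instances so that a leak or a double deletion would show up. The
    sketch uses <code>new (std::nothrow)</code> on the assumption that a current
    compiler otherwise throws on allocation failure instead of returning
    <code>0</code>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-ins for UErrorCode and U_FAILURE().
enum ExErrorCode {
    EX_ZERO_ERROR = 0,
    EX_ILLEGAL_ARGUMENT_ERROR = 1,
    EX_MEMORY_ALLOCATION_ERROR = 2
};

static bool
exFailure(ExErrorCode code) {
    return code > EX_ZERO_ERROR;
}

// Hypothetical adoptee that counts its live instances.
class ExZone {
public:
    ExZone();
    ~ExZone();
    static int liveCount;
};

int ExZone::liveCount = 0;

ExZone::ExZone() {
    ++liveCount;
}

ExZone::~ExZone() {
    --liveCount;
}

// Hypothetical class that adopts an ExZone.
class ExCalendar {
public:
    ExCalendar(ExZone *zoneToAdopt, int firstDay, ExErrorCode &errorCode);
    ~ExCalendar();
    static ExCalendar *createInstance(ExZone *zoneToAdopt, int firstDay,
                                      ExErrorCode &errorCode);
private:
    ExZone *zone;
};

// The pointer is stored in all cases, so the destructor deletes the adoptee
// whether or not construction succeeded.
ExCalendar::ExCalendar(ExZone *zoneToAdopt, int firstDay, ExErrorCode &errorCode)
        : zone(zoneToAdopt) {
    if (exFailure(errorCode)) {
        return;
    }
    if (firstDay < 1 || firstDay > 7) {
        errorCode = EX_ILLEGAL_ARGUMENT_ERROR;
    }
}

ExCalendar::~ExCalendar() {
    delete zone;
}

ExCalendar *
ExCalendar::createInstance(ExZone *zoneToAdopt, int firstDay, ExErrorCode &errorCode) {
    if (exFailure(errorCode)) {
        delete zoneToAdopt;   // adopted even though nothing is created
        return 0;
    }
    ExCalendar *c = new (std::nothrow) ExCalendar(zoneToAdopt, firstDay, errorCode);
    if (c == 0) {
        errorCode = EX_MEMORY_ALLOCATION_ERROR;
        delete zoneToAdopt;   // the constructor was not called
    } else if (exFailure(errorCode)) {
        delete c;             // this deletes the adopted zone as well
        c = 0;
    }
    return c;
}

// Usage: the caller never deletes the zone after handing it over.
static bool
exCreateAndDestroy(int firstDay) {
    ExErrorCode errorCode = EX_ZERO_ERROR;
    ExCalendar *calendar = ExCalendar::createInstance(new ExZone(), firstDay, errorCode);
    bool created = (calendar != 0);

    // Either a valid object and success, or a 0 pointer and a failure code.
    assert(created == !exFailure(errorCode));
    delete calendar;
    return created;
}
```

  <p>In this sketch the live-instance count returns to zero after every call,
    whether the factory function succeeds or fails.</p>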
 <a name="addfiles"></a>
 <h2 align="center">Adding files to ICU</h2>
 <h3>Adding .c, .cpp, and .h files</h3>
 <p>In order to add compilable files to ICU, you not only need to add them
   to the source code control system in the appropriate folder, but also
   add them to the build environment.</p>
 <p>The first step is to choose one of the ICU libraries:</p>
 <ol>
   <li>The <em>common</em> library provides mostly low-level utilities and
     basic APIs that often do not make use of Locales. Examples are APIs
     that deal with character properties, the Locale APIs themselves, and
     ResourceBundle APIs.</li>
    <li>The <em>i18n</em> library provides APIs that depend on or use Locales,
      such as collation and formatting, and that are most useful for internationalized
      user input and output.</li>
 </ol>
 <p>Put the source code files into the folder <code>icu/source/<i>library-name</i></code>.</p>
  <p>Then add them to the build system:</p>
 <ul>
   <li>
     <p>For most platforms, add the expected .o files to <code>icu/source/<i>library-name</i>/Makefile.in</code>,
       to the <code>OBJECTS</code> variable.</p>
     <p>Add the <i>public</i> header files to the <code>HEADERS</code> variable.</p>
   </li>
   <li>For Microsoft Visual C++ 6.0, add all the source code files to <code>icu/source/<i>library-name</i>/<i>library-name</i>.dsp</code>.
     If you don't have Visual C++, then try to add the filenames to the project
     file manually; it is a text file, and this part should be fairly obvious.</li>
 </ul>
 <p></p>
 <p>You also need to add test code to <code>icu/source/test/cintltest</code>
   for C APIs and to <code>icu/source/test/intltest</code> for C++ APIs.</p>
 <p>All the API functions must be called by the test code (100% API coverage),
   and at least 85% of the implementation code should be exercised by the
   tests (>=85% code coverage).</p>
 <ol>
   <li>For C, create test code using the <code>log_err()</code>, <code>log_info()</code>,
     and <code>log_verbose()</code> APIs from <code>cintltst.h</code> (which
     uses <code>ctest.h</code>), and check it into the appropriate folder.</li>
    <li>In order to get your C test code called, you have to add its top-level
     function and a descriptive test module path to the test system by calling
     <code>addTest()</code>. The function that makes the call to <code>addTest()</code>
     ultimately has to be called by <code>addAllTests()</code> in <code>calltest.c</code>.
     Groups of tests typically have a common <code>addGroup()</code> function
     that calls <code>addTest()</code> for the test functions in its group,
     according to the common part of the test module path.</li>
    <li>Add that test code to the build system, too. Modify <code>Makefile.in</code>
      and the appropriate <code>.dsp</code> file as you did for the library code.</li>
 </ol>
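  <p>The registration flow described above (test functions, a group function,
    and one function that adds all groups) can be sketched with a tiny stand-in
    registry. This is not the real <code>ctest.h</code> API: <code>ExTestRegistry</code>,
    <code>exAddTest()</code>, <code>exLogErr()</code>, <code>exAddStringGroup()</code>,
    <code>exAddAllTests()</code>, <code>exRunAllTests()</code>, and the test
    module paths are all hypothetical, and they only mimic the roles of
    <code>addTest()</code>, <code>log_err()</code>, <code>addGroup()</code>,
    and <code>addAllTests()</code>.</p>

```cpp
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

// Hypothetical stand-ins for the test framework types.
typedef void (*ExTestFunction)(void);

struct ExTestEntry {
    ExTestFunction test;
    const char *path;
};

enum { EX_MAX_TESTS = 16 };

struct ExTestRegistry {
    ExTestEntry entries[EX_MAX_TESTS];
    int count;
};

static int exErrorCount = 0;

// Stand-in for log_err(): prints the message and counts the failure.
static void
exLogErr(const char *pattern, ...) {
    va_list args;

    va_start(args, pattern);
    vfprintf(stderr, pattern, args);
    va_end(args);
    ++exErrorCount;
}

// Stand-in for addTest(): registers a test function under a module path.
static void
exAddTest(ExTestRegistry *registry, ExTestFunction test, const char *path) {
    assert(registry->count < EX_MAX_TESTS);
    registry->entries[registry->count].test = test;
    registry->entries[registry->count].path = path;
    ++registry->count;
}

// Two test functions that report failures through the logging function.
static void
TestLength(void) {
    if (strlen("abc") != 3) {
        exLogErr("strlen(\"abc\") returned the wrong length\n");
    }
}

static void
TestCompare(void) {
    if (strcmp("abc", "abd") >= 0) {
        exLogErr("strcmp(\"abc\", \"abd\") did not return a negative value\n");
    }
}

// Group function: adds the tests that share the common part of the path.
static void
exAddStringGroup(ExTestRegistry *registry) {
    exAddTest(registry, TestLength, "/example/exstring/TestLength");
    exAddTest(registry, TestCompare, "/example/exstring/TestCompare");
}

// Plays the role of addAllTests(): every group function is called from here.
static void
exAddAllTests(ExTestRegistry *registry) {
    exAddStringGroup(registry);
}

// Runs every registered test and returns the number of logged errors.
static int
exRunAllTests(const ExTestRegistry *registry) {
    int i;

    for (i = 0; i < registry->count; ++i) {
        registry->entries[i].test();
    }
    return exErrorCount;
}
```

  <p>A test function that is never reached from the function that adds all
    groups is never run, which is why each new group function has to be called
    from there.</p>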
 <a name="tools"></a>
 <h2 align="center">Build Tools</h2>
 <p>We are using the following tools to build ICU:</p>
 <ul>
   <li>GNU make version 3.76.1 and up.</li>
   <li>GNU cc version 2.95.2 and up. (egcs merged back into gcc.)</li>
   <li>autoconf version 2.13 and up.</li>
   <li>autoconf needs m4 version 1.4 and up.</li>
   <li>Platform-specific compilers as listed with the supported platforms.<br>
     (For a complete list of supported platforms, see the <a href="http://www10.software.ibm.com/developerworks/opensource/icu/project/tech_faq.html">ICU
     Technical FAQ</a>.)</li>
   <li>On Windows: Microsoft Visual C++ 6.0 with the latest Service Packs.</li>
 </ul>
</body>
</html>