From 6a5a325587fe67594abfa2a8c1120b50558b27e8 Mon Sep 17 00:00:00 2001
From: Alan Liu
Date: Thu, 19 Feb 2004 00:32:14 +0000
Subject: [PATCH] ICU-3296 update and correct class docs for DecimalFormat, esp. re: padding

X-SVN-Rev: 14542
---
 icu4j/src/com/ibm/icu/text/DecimalFormat.java | 486 +++++++++---------
 1 file changed, 235 insertions(+), 251 deletions(-)

diff --git a/icu4j/src/com/ibm/icu/text/DecimalFormat.java b/icu4j/src/com/ibm/icu/text/DecimalFormat.java
index c9110afca2..264557074a 100755
--- a/icu4j/src/com/ibm/icu/text/DecimalFormat.java
+++ b/icu4j/src/com/ibm/icu/text/DecimalFormat.java
@@ -5,8 +5,8 @@
  *******************************************************************************
  *
  * $Source: /xsrl/Nsvn/icu/icu4j/src/com/ibm/icu/text/DecimalFormat.java,v $
- * $Date: 2004/02/12 01:01:38 $
- * $Revision: 1.42 $
+ * $Date: 2004/02/19 00:32:14 $
+ * $Revision: 1.43 $
  *
  *****************************************************************************************
  */
@@ -25,12 +25,13 @@ import java.io.ObjectInputStream;
 /**
  * DecimalFormat is a concrete subclass of
- * NumberFormat that formats decimal numbers. It has a variety of
+ * {@link NumberFormat} that formats decimal numbers. It has a variety of
  * features designed to make it possible to parse and format numbers in any
  * locale, including support for Western, Arabic, or Indic digits. It also
- * supports different flavors of numbers, including integers (123), fixed-point
- * numbers (123.4), scientific notation (1.23E4), percentages (12%), and
- * currency amounts ($123). All of these flavors can be easily localized.
+ * supports different flavors of numbers, including integers ("123"),
+ * fixed-point numbers ("123.4"), scientific notation ("1.23E4"), percentages
+ * ("12%"), and currency amounts ("$123"). All of these flavors can be easily
+ * localized.
  *
  *

This is an enhanced version of DecimalFormat that * is based on the standard version in the JDK. New or changed functionality @@ -38,11 +39,11 @@ import java.io.ObjectInputStream; * NEW or * CHANGED. * - *

To obtain a NumberFormat for a specific locale (including the + *

To obtain a {@link NumberFormat} for a specific locale (including the * default locale) call one of NumberFormat's factory methods such - * as getInstance(). Do not call the DecimalFormat + * as {@link NumberFormat#getInstance}. Do not call the DecimalFormat * constructors directly, unless you know what you are doing, since the - * NumberFormat factory methods may return subclasses other than + * {@link NumberFormat} factory methods may return subclasses other than * DecimalFormat. If you need to customize the format object, do * something like this: * @@ -50,17 +51,9 @@ import java.io.ObjectInputStream; * NumberFormat f = NumberFormat.getInstance(loc); * if (f instanceof DecimalFormat) { * ((DecimalFormat) f).setDecimalSeparatorAlwaysShown(true); - * } - * + * } * - *

Synchronization

- *

- * Decimal formats are not synchronized. It is recommended that you create - * separate format instances for each thread. If multiple threads access a format - * concurrently, it must be synchronized externally. - *

- * - *

Example + *

Example Usage * *

  * // Print out a number using the localized number, currency,
@@ -88,124 +81,146 @@ import java.io.ObjectInputStream;
  *             // Assume format is a DecimalFormat
  *             System.out.print(": " + ((DecimalFormat) format).toPattern()
  *                              + " -> " + form.format(myNumber));
- *         } catch (IllegalArgumentException e) {}
+ *         } catch (Exception e) {}
  *         try {
  *             System.out.println(" -> " + format.parse(form.format(myNumber)));
  *         } catch (ParseException e) {}
  *     }
- * }
- * 
- * - *

Notes + * } * *

Patterns

* *

A DecimalFormat consists of a pattern and a set of * symbols. The pattern may be set directly using - * applyPattern(), or indirectly using the API methods. The - * symbols are stored in a DecimalFormatSymbols object. When using - * the NumberFormat factory methods, the pattern and symbols are - * read from localized ResourceBundles provided by ICU. + * {@link #applyPattern}, or indirectly using other API methods which + * manipulate aspects of the pattern, such as the minimum number of integer + * digits. The symbols are stored in a {@link DecimalFormatSymbols} + * object. When using the {@link NumberFormat} factory methods, the + * pattern and symbols are read from ICU's locale data. * - * - * - *
- * *

Special Pattern Characters

* *

Many characters in a pattern are taken literally; they are matched during * parsing and output unchanged during formatting. Special characters, on the * other hand, stand for other characters, strings, or classes of characters. - * They must be quoted, unless noted otherwise, if they are to appear in the - * prefix or suffix as literals. + * For example, the '#' character is replaced by a localized digit. Often the + * replacement character is the same as the pattern character; in the U.S. locale, + * the ',' grouping character is replaced by ','. However, the replacement is + * still happening, and if the symbols are modified, the grouping character + * changes. Some special characters affect the behavior of the formatter by + * their presence; for example, if the percent character is seen, then the + * value is multiplied by 100 before being displayed. + * + *

To insert a special character in a pattern as a literal, that is, without + * any special meaning, the character must be quoted. There are some exceptions to + * this which are noted below. * *

The characters listed here are used in non-localized patterns. Localized * patterns use the corresponding characters taken from this formatter's - * DecimalFormatSymbols object instead, and these characters lose + * {@link DecimalFormatSymbols} object instead, and these characters lose * their special status. Two exceptions are the currency sign and quote, which * are not localized. * *

* - * - * - * - * - * - * - * - * - * - * - * - * - * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
Symbol - * Location - * Localized? - * Meaning - *
0 - * Number - * Yes - * Digit - *
# - * Number - * Yes - * Digit, zero shows as absent - *
. - * Number - * Yes - * Decimal separator or monetary decimal separator - *
- - * Number - * Yes - * Minus sign - *
, - * Number - * Yes - * Grouping separator - *
E - * Number - * Yes - * Separates mantissa and exponent in scientific notation. - * Need not be quoted in prefix or suffix. - *
; - * Subpattern boundary - * Yes - * Separates positive and negative subpatterns - *
% - * Prefix or suffix - * Yes - * Multiply by 100 and show as percentage - *
\u2030 - * Prefix or suffix - * Yes - * Multiply by 1000 and show as per mille - *
¤ (\u00A4) - * Prefix or suffix - * No - * Currency sign, replaced by currency symbol. If - * doubled, replaced by international currency symbol. - * If present in a pattern, the monetary decimal separator - * is used instead of the decimal separator. - *
' - * Prefix or suffix - * No - * Used to quote special characters in a prefix or suffix, - * for example, "'#'#" formats 123 to - * "#123". To create a single quote - * itself, use two in a row: "# o''clock". + *
Symbol + * Location + * Localized? + * Meaning + *
0 + * Number + * Yes + * Digit + *
1-9 + * Number + * Yes + * NEW + * '1' through '9' indicate rounding. + *
# + * Number + * Yes + * Digit, zero shows as absent + *
. + * Number + * Yes + * Decimal separator or monetary decimal separator + *
- + * Number + * Yes + * Minus sign + *
, + * Number + * Yes + * Grouping separator + *
E + * Number + * Yes + * Separates mantissa and exponent in scientific notation. + * Need not be quoted in prefix or suffix. + *
+ + * Exponent + * Yes + * NEW + * Prefix positive exponents with localized plus sign. + * Need not be quoted in prefix or suffix. + *
; + * Subpattern boundary + * Yes + * Separates positive and negative subpatterns + *
% + * Prefix or suffix + * Yes + * Multiply by 100 and show as percentage + *
\u2030 + * Prefix or suffix + * Yes + * Multiply by 1000 and show as per mille + *
¤ (\u00A4) + * Prefix or suffix + * No + * Currency sign, replaced by currency symbol. If + * doubled, replaced by international currency symbol. + * If present in a pattern, the monetary decimal separator + * is used instead of the decimal separator. + *
' + * Prefix or suffix + * No + * Used to quote special characters in a prefix or suffix, + * for example, "'#'#" formats 123 to + * "#123". To create a single quote + * itself, use two in a row: "# o''clock". + *
* + * Prefix or suffix boundary + * Yes + * NEW + * Pad escape, precedes pad character *
*
* *

A DecimalFormat pattern contains a positive and negative
 * subpattern, for example, "#,##0.00;(#,##0.00)". Each subpattern has a
- * prefix, numeric part, and suffix. If there is no explicit negative
- * subpattern, the localized minus sign, typically '-', is prefixed to the
- * positive form. That is, "0.00" alone is equivalent to "0.00;-0.00". If there
+ * prefix, a numeric part, and a suffix. If there is no explicit negative
+ * subpattern, the negative subpattern is the localized minus sign prefixed to the
+ * positive subpattern. That is, "0.00" alone is equivalent to "0.00;-0.00". If there
 * is an explicit negative subpattern, it serves only to specify the negative
 * prefix and suffix; the number of digits, minimal digits, and other
- * characteristics are all the same as the positive pattern. That means that
+ * characteristics are ignored in the negative subpattern. That means that
 * "#,##0.0#;(#)" has precisely the same result as "#,##0.0#;(#,##0.0#)".
 *
 *
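The subpattern rules in the paragraph above can be checked with a short sketch. It uses the JDK's `java.text.DecimalFormat` as a stand-in, on the assumption that the ICU class (which these docs describe as based on the JDK version) behaves the same for these basic patterns. The class name `NegativeSubpatternSketch` and its helper `format` are hypothetical names chosen for illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Assumption: java.text.DecimalFormat stands in for com.ibm.icu.text.DecimalFormat.
class NegativeSubpatternSketch {
    // Formats a value with the given pattern using fixed U.S. symbols,
    // so the result does not depend on the default locale.
    static String format(String pattern, double value) {
        DecimalFormat df = new DecimalFormat(pattern, new DecimalFormatSymbols(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        // No explicit negative subpattern: the minus sign is prefixed.
        System.out.println(format("0.00", -1.5));
        System.out.println(format("0.00;-0.00", -1.5));
        // The negative subpattern only contributes its prefix and suffix.
        System.out.println(format("#,##0.0#;(#)", -1234.5));
        System.out.println(format("#,##0.0#;(#,##0.0#)", -1234.5));
    }
}
```

Each pair of calls should print the same string, which is the equivalence the paragraph describes.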

The prefixes, suffixes, and various symbols used for infinity, digits, @@ -213,64 +228,112 @@ import java.io.ObjectInputStream; * values, and they will appear properly during formatting. However, care must * be taken that the symbols and strings do not conflict, or parsing will be * unreliable. For example, either the positive and negative prefixes or the - * suffixes must be distinct for DecimalFormat.parse() to be able + * suffixes must be distinct for {@link #parse} to be able * to distinguish positive from negative values. Another example is that the * decimal separator and thousands separator should be distinct characters, or * parsing will be impossible. * - *

The grouping separator is commonly used for thousands, but in some - * countries it separates ten-thousands. The grouping size is a constant number - * of digits between the grouping characters, such as 3 for 100,000,000 or 4 for - * 1,0000,0000. - * If you supply a pattern with multiple grouping characters, the interval - * between the last one and the end of the integer determines the primary - * grouping size, and the interval between the last two determines - * the secondary grouping size (see below); all others are ignored. - * So "#,##,###,####" == "###,###,####" == "##,#,###,####". - * - *

Some locales have two different grouping intervals: One used for the - * least significant integer digits (the primary grouping size), and - * one used for all others (the secondary grouping size). For example, - * if the primary grouping interval is 3, and the secondary is 2, then - * this corresponds to the pattern "#,##,##0", and the number 123456789 - * is formatted as "12,34,56,789". - * - *

DecimalFormat parses all Unicode characters that represent - * decimal digits, as defined by Character.digit(). In addition, - * DecimalFormat also recognizes as digits the ten consecutive - * characters starting with the localized zero digit defined in the - * DecimalFormatSymbols object. During formatting, the - * DecimalFormatSymbols-based digits are output. + *

The grouping separator is a character that separates clusters of
+ * integer digits to make large numbers more legible. It is commonly used for
+ * thousands, but in some locales it separates ten-thousands. The grouping
+ * size is the number of digits between the grouping separators, such as 3
+ * for "100,000,000" or 4 for "1 0000 0000". There are actually two different
+ * grouping sizes: One used for the least significant integer digits, the
+ * primary grouping size, and one used for all others, the
+ * secondary grouping size. In most locales these are the same, but
+ * sometimes they are different. For example, if the primary grouping interval
+ * is 3, and the secondary is 2, then this corresponds to the pattern
+ * "#,##,##0", and the number 123456789 is formatted as "12,34,56,789". If a
+ * pattern contains multiple grouping separators, the interval between the last
+ * one and the end of the integer defines the primary grouping size, and the
+ * interval between the last two defines the secondary grouping size. All others
+ * are ignored, so "#,##,###,####" == "###,###,####" == "##,#,###,####".
 *
 *

Illegal patterns, such as "#.#.#" or "#.###,###", will cause - * DecimalFormat to throw an IllegalArgumentException + * DecimalFormat to throw an {@link IllegalArgumentException} * with a message that describes the problem. * - *

If DecimalFormat.parse(String, ParsePosition) fails to parse + *

Pattern BNF

+ * + *
+ * pattern    := subpattern (';' subpattern)?
+ * subpattern := prefix? number suffix?
+ * number     := integer ('.' fraction)? exponent?
+ * prefix     := '\u0000'..'\uFFFD' - specialCharacters
+ * suffix     := '\u0000'..'\uFFFD' - specialCharacters
+ * integer    := '#'* '0'* '0'
+ * fraction   := '0'* '#'*
+ * exponent   := 'E' '+'? '0'* '0'
+ * padSpec    := '*' padChar
+ * padChar    := '\u0000'..'\uFFFD' - quote
+ *  
+ * Notation:
+ *   X*       0 or more instances of X
+ *   X?       0 or 1 instances of X
+ *   X..Y     any character from X up to Y, inclusive
+ *   S - T    characters in S, except those in T
+ * 
+ * The first subpattern is for positive numbers. The second (optional) + * subpattern is for negative numbers. + * + *

Not indicated in the BNF syntax above: + *

+ * + *

Parsing

+ * + *

DecimalFormat parses all Unicode characters that represent + * decimal digits, as defined by {@link UCharacter#digit}. In addition, + * DecimalFormat also recognizes as digits the ten consecutive + * characters starting with the localized zero digit defined in the + * {@link DecimalFormatSymbols} object. During formatting, the + * {@link DecimalFormatSymbols}-based digits are output. + * + *

If {@link #parse(String, ParsePosition)} fails to parse * a string, it returns null and leaves the parse position - * unchanged. The convenience method DecimalFormat.parse(String) - * indicates parse failure by throwing a ParseException. + * unchanged. The convenience method {@link #parse(String)} + * indicates parse failure by throwing a {@link java.text.ParseException}. * - *
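The two failure modes described above can be sketched as follows. This uses the JDK's `java.text.DecimalFormat` as a stand-in, on the assumption that the ICU class reports parse failure the same way, as the paragraph states. The class `ParseFailureSketch` and its helper methods are hypothetical names for illustration only.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.Locale;

// Assumption: java.text.DecimalFormat stands in for com.ibm.icu.text.DecimalFormat.
class ParseFailureSketch {
    static DecimalFormat newFormat() {
        return new DecimalFormat("#,##0.###", new DecimalFormatSymbols(Locale.US));
    }

    // parse(String, ParsePosition): returns null on failure.
    static Number parseAt(String text, ParsePosition pos) {
        return newFormat().parse(text, pos);
    }

    // parse(String): signals failure by throwing ParseException.
    static boolean parseThrows(String text) {
        try {
            newFormat().parse(text);
            return false;
        } catch (ParseException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ParsePosition pos = new ParsePosition(0);
        Number n = parseAt("abc", pos);
        // On failure the result is null and the index is left where it started.
        System.out.println(n + " at index " + pos.getIndex());
        System.out.println(parseThrows("abc"));
    }
}
```

The `ParsePosition` variant suits callers that scan a longer string and want to recover from a failed parse without the cost of an exception.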

Special Cases + *

Special Values * - *

NaN is formatted as a single character, typically + *

NaN is represented as a single character, typically * \uFFFD. This character is determined by the - * DecimalFormatSymbols object. This is the only value for which + * {@link DecimalFormatSymbols} object. This is the only value for which * the prefixes and suffixes are not used. * - *

Infinity is formatted as a single character, typically + *

Infinity is represented as a single character, typically * \u221E, with the positive or negative prefixes and suffixes * applied. The infinity character is determined by the - * DecimalFormatSymbols object. + * {@link DecimalFormatSymbols} object. * - *

- * Scientific Notation + *

Scientific Notation

* *

Numbers in scientific notation are expressed as the product of a mantissa
- * and a power of ten, for example, 1234 can be expressed as 1.234 x 10^3. The
- * mantissa is often in the range 1.0 <= x < 10.0, but it need not be.
- * DecimalFormat can be instructed to format and parse scientific
- * notation through the API or via a pattern. In a pattern, the exponent
+ * and a power of ten, for example, 1234 can be expressed as 1.234 x 10<sup>3</sup>. The
+ * mantissa is typically in the half-open interval [1.0, 10.0) or sometimes [0.0, 1.0),
+ * but it need not be. DecimalFormat supports arbitrary mantissas.
+ * DecimalFormat can be instructed to use scientific
+ * notation through the API or through the pattern. In a pattern, the exponent
 * character immediately followed by one or more digit characters indicates
 * scientific notation. Example: "0.###E0" formats the number 1234 as
 * "1.234E3".
@@ -305,16 +368,16 @@ import java.io.ObjectInputStream;
 *

  • Exponential patterns may not contain grouping separators. * * - *
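The "0.###E0" example from the scientific notation text can be reproduced with a minimal sketch. It relies on the JDK's `java.text.DecimalFormat` rather than the ICU class, assuming the two agree for this simple exponent pattern. The class name `ScientificNotationSketch` and its `format` helper are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Assumption: java.text.DecimalFormat stands in for com.ibm.icu.text.DecimalFormat.
class ScientificNotationSketch {
    // An 'E' followed by one or more '0' characters switches the
    // formatter into scientific notation.
    static String format(String pattern, double value) {
        DecimalFormat df = new DecimalFormat(pattern, new DecimalFormatSymbols(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format("0.###E0", 1234));    // one integer digit in the mantissa
        System.out.println(format("0.###E0", 0.00123)); // small values get a negative exponent
    }
}
```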

    + *

    * NEW - * Padding + * Padding

    * *

    DecimalFormat supports padding the result of - * format() to a specific width. Padding may be specified either + * {@link #format} to a specific width. Padding may be specified either * through the API or through the pattern syntax. In a pattern the pad escape * character, followed by a single pad character, causes padding to be parsed * and formatted. The pad escape character is '*' in unlocalized patterns, and - * can be localized using DecimalFormatSymbols.setPadEscape(). For + * can be localized using {@link DecimalFormatSymbols#setPadEscape}. For * example, "$*x#,##0.00" formats 123 to "$xx123.00", * and 1234 to "$1,234.00". * @@ -323,6 +386,8 @@ import java.io.ObjectInputStream; * including prefix and suffix, determines the format width. For example, in * the pattern "* #0 o''clock", the format width is 10. * + *

  • The width is counted in 16-bit code units (Java chars). + * *
  • Some parameters which usually do not matter have meaning when padding is * used, because the pattern width is significant with padding. In the pattern * "* ##,##,#,##0.##", the format width is 14. The initial characters "##,##," @@ -331,12 +396,16 @@ import java.io.ObjectInputStream; * *
  • Padding may be inserted at one of four locations: before the prefix, * after the prefix, before the suffix, or after the suffix. If padding is - * specified in any other location, DecimalFormat.applyPattern() - * throws an IllegalArgumentException. If there is no prefix, - * before the prefix and after the prefix are equivalent, likewise for the - * suffix. + * specified in any other location, {@link #applyPattern} throws an {@link + * IllegalArgumentException}. If there is no prefix, before the + * prefix and after the prefix are equivalent, likewise for the suffix. + * + *
  • When specified in a pattern, the 16-bit char immediately + * following the pad escape is the pad character. This may be any character, + * including a special pattern character. That is, the pad escape + * escapes the following character. If there is no character after + * the pad escape, then the pattern is illegal. * - *
  • The pad character may not be a quote. * * *

    @@ -355,9 +424,9 @@ import java.io.ObjectInputStream; * not affect parsing or change any numerical values. * *

  • A rounding mode determines how values are rounded; see the - * java.math.BigDecimal documentation for a description of the + * {@link java.math.BigDecimal} documentation for a description of the * modes. Rounding increments specified in patterns use the default mode, - * ROUND_HALF_EVEN. + * {@link java.math.BigDecimal#ROUND_HALF_EVEN}. * *
  • Some locales use rounding in their currency formats to reflect the * smallest currency denomination. @@ -366,96 +435,10 @@ import java.io.ObjectInputStream; * behave identically to digit '0'. * * - *

    Pattern Syntax - *

    - * pattern    := subpattern{';' subpattern}
    - * subpattern := {prefix}number{suffix}
    - * number     := integer{'.' fraction}{exponent}
    - * prefix     := '\u0000'..'\uFFFD' - specialCharacters
    - * suffix     := '\u0000'..'\uFFFD' - specialCharacters
    - * integer    := '#'* '0'* '0'
    - * fraction   := '0'* '#'*
    - * exponent   := 'E' {'+'} '0'* '0'
    - * padSpec    := '*' padChar
    - * padChar    := '\u0000'..'\uFFFD' - quote
    - *  
    - * Notation:
    - *   X*       0 or more instances of X
    - *   { X }    0 or 1 instances of X
    - *   X..Y     any character from X up to Y, inclusive
    - *   S - T    characters in S, except those in T
    - * 
    - * The first subpattern is for positive numbers. The second (optional) - * subpattern is for negative numbers. - * - *

    Not indicated in the BNF syntax above: - *

    • The grouping separator ',' can occur inside the integer portion between the - * most significant digit and the least significant digit. - * - *
    • NEW - * Two grouping intervals are recognized: That between the - * decimal point and the first grouping symbol, and that - * between the first and second grouping symbols. These - * intervals are identical in most locales, but in some - * locales they differ. For example, the pattern - * "#,##,###" formats the number 123456789 as - * "12,34,56,789".
    • - * - *
    • - * NEW - * The pad specifier padSpec may appear before the prefix, - * after the prefix, before the suffix, after the suffix, or not at all. - * - *
    • - * NEW - * In place of '0', the digits '1' through '9' may be used to - * indicate a rounding increment. - *
    - * - *

    Special Pattern Characters - * - *

    Here are the special characters used in the pattern, with notes on their - * usage. Special characters must be quoted, unless noted otherwise, if they - * are to appear in the prefix or suffix. This does not apply to those listed - * with location "prefix or suffix." Such characters should only be quoted in - * order to remove their special meaning. - * - *

    - * - * - * - * - * - * - * - * - * - * - * - * - * - *
    SymbolLocationMeaning
    0-9NumberDigit. - * NEW - * '1' through '9' indicate rounding
    #NumberDigit, zero shows as absent
    .NumberDecimal separator or monetary decimal separator
    ,NumberGrouping separator
    ENumber - * Separates mantissa and exponent in scientific notation. - * Need not be quoted in prefix or suffix.
    NEW - * +Exponent - * Prefix positive exponents with localized plus sign. - * Need not be quoted in prefix or suffix.
    ;Subpattern boundary - * Separates positive and negative subpatterns
    %Prefix or suffixMultiply by 100 and show as percentage
    \u2030Prefix or suffix - * Multiply by 1000 and show as per mille
    \u00A4Prefix or suffix - * Currency sign, replaced by currency symbol. If - * doubled, replaced by international currency symbol. - * If present in a pattern, the monetary decimal separator - * is used instead of the decimal separator.
    'Prefix or suffix - * Used to quote special characters in a prefix or suffix, - * for example, "'#'#" formats 123 to - * "#123". To create a single quote - * itself, use two in a row: "# o''clock".
    NEW - * *Prefix or suffix boundary - * Pad escape, precedes pad character
    - * + *

    Synchronization

    * + *

DecimalFormat objects are not synchronized. Multiple
+ * threads should not access one formatter concurrently.
  *
  * @see java.text.Format
  * @see NumberFormat
@@ -1336,18 +1319,18 @@ public class DecimalFormat extends NumberFormat {
                 char ch = text.charAt(position);
                 /* We recognize all digit ranges, not only the Latin digit range
-                 * '0'..'9'. We do so by using the Character.digit() method,
+                 * '0'..'9'. We do so by using the UCharacter.digit() method,
                  * which converts a valid Unicode digit to the range 0..9.
                  *
                  * The character 'ch' may be a digit. If so, place its value
                  * from 0 to 9 in 'digit'. First try using the locale digit,
                  * which may or MAY NOT be a standard Unicode digit range. If
                  * this fails, try using the standard Unicode digit ranges by
-                 * calling Character.digit(). If this also fails, digit will
+                 * calling UCharacter.digit(). If this also fails, digit will
                  * have a value outside the range 0..9.
                  */
                 digit = ch - zero;
-                if (digit < 0 || digit > 9) digit = Character.digit(ch, 10);
+                if (digit < 0 || digit > 9) digit = UCharacter.digit(ch, 10);
                 if (digit == 0) {
@@ -1427,7 +1410,7 @@ public class DecimalFormat extends NumberFormat {
                         code: digit = Character.digit(ch, 10);
                         [Richard/GCL]
                         */
-                        digit = Character.digit(text.charAt(pos), 10);
+                        digit = UCharacter.digit(text.charAt(pos), 10);
                     }
                     if (digit >= 0 && digit <= 9) {
                         exponentDigits.append((char)(digit + '0'));
@@ -2079,8 +2062,9 @@ public class DecimalFormat extends NumberFormat {
     /**
      * NEW
-     * Set the character used to pad to the format width. This has no effect
-     * unless padding is enabled.
+     * Set the character used to pad to the format width. If padding
+     * is not enabled, then this will take effect if padding is later
+     * enabled.
      * @param padChar the pad character
      * @see #setFormatWidth
      * @see #getFormatWidth