ICU-20418 Adding concise number skeletons in ICU4C

Shane Carr 2019-12-11 20:45:54 -08:00 committed by Shane F. Carr
parent df8841aa6f
commit fe98d870b2
6 changed files with 604 additions and 107 deletions


@ -9,16 +9,24 @@ Number Skeletons
Number skeletons are a locale-agnostic way to configure a NumberFormatter in Number skeletons are a locale-agnostic way to configure a NumberFormatter in
ICU. Number skeletons work in MessageFormat. ICU. Number skeletons work in MessageFormat.
Number skeletons consist of *space-separated tokens* that correspond to Number skeletons consist of case-sensitive tokens that correspond to settings
settings in ICU NumberFormatter. For example, to format a currency in compact in ICU NumberFormatter. For example, to format a currency in compact notation
notation, you could use this skeleton: with the sign always shown, you could use this skeleton:
compact-short currency/GBP sign-always compact-short currency/GBP
***Since ICU 67***, you can also use more concise syntax:
+! K currency/GBP
To use a skeleton in MessageFormat, use the "number" type and prefix the To use a skeleton in MessageFormat, use the "number" type and prefix the
skeleton with `::` skeleton with `::`
{0, number, ::compact-short currency/GBP} {0, number, :: +! K currency/GBP}
The ICU `toSkeleton()` API outputs the long-form skeletons, but all parts of
ICU that read user-specified number skeletons accept both long-form and
concise skeletons.
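As a rough illustration of the relationship between the two forms, the sketch below expands the fixed concise tokens to their long-form stems. `expandConciseTokens` is a hypothetical helper written for this example, not an ICU API, and it deliberately leaves pattern-based concise tokens (such as `E0` or `000`) and all long-form tokens unchanged.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ICU): rewrites the fixed concise tokens
// to long-form stems. Any token that is not in the table is copied as-is.
std::string expandConciseTokens(const std::string& skeleton) {
    static const std::map<std::string, std::string> kLongForm = {
        {"K", "compact-short"},
        {"KK", "compact-long"},
        {"%", "percent"},
        {"%x100", "percent scale/100"},
        {",_", "group-off"},
        {",?", "group-min2"},
        {",!", "group-on-aligned"},
        {"+!", "sign-always"},
        {"+_", "sign-never"},
        {"()", "sign-accounting"},
        {"()!", "sign-accounting-always"},
        {"+?", "sign-except-zero"},
        {"()?", "sign-accounting-except-zero"},
    };
    std::istringstream in(skeleton);
    std::string token;
    std::string out;
    // operator>> splits on whitespace, so each iteration sees one token.
    while (in >> token) {
        std::map<std::string, std::string>::const_iterator it = kLongForm.find(token);
        if (!out.empty()) {
            out += ' ';
        }
        out += (it != kLongForm.end()) ? it->second : token;
    }
    return out;
}
```

With this sketch, `expandConciseTokens("+! K currency/GBP")` yields `sign-always compact-short currency/GBP`.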
## Syntax ## Syntax
@ -27,6 +35,9 @@ occurs before the first "/" character in a token, and the options are each of
the subsequent "/"-delimited strings. For example, "compact-short" and the subsequent "/"-delimited strings. For example, "compact-short" and
"currency" are stems, and "GBP" is an option. "currency" are stems, and "GBP" is an option.
Tokens are space-separated, with exceptions for concise skeletons listed at
the end of this document.
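The stem/option structure described above can be sketched in a few lines. This is a minimal illustration that assumes plain space-separated tokens; it is not ICU's parser and ignores the concise-skeleton exceptions. The names `Token` and `splitSkeleton` are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
struct Token {
    std::string stem;                  // text before the first '/'
    std::vector<std::string> options;  // each subsequent '/'-delimited string
};

std::vector<Token> splitSkeleton(const std::string& skeleton) {
    std::vector<Token> tokens;
    std::istringstream in(skeleton);
    std::string raw;
    while (in >> raw) {  // tokens are separated by whitespace
        Token token;
        std::size_t slash = raw.find('/');
        token.stem = raw.substr(0, slash);
        while (slash != std::string::npos) {
            std::size_t start = slash + 1;
            slash = raw.find('/', start);
            // substr clamps the count, so npos here means "to end of string".
            token.options.push_back(raw.substr(start, slash - start));
        }
        tokens.push_back(token);
    }
    return tokens;
}
```

For `compact-short currency/GBP`, this produces two tokens: the stem `compact-short` with no options, and the stem `currency` with the single option `GBP`.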
Stems might also be dynamic strings (not a fixed list); these are called Stems might also be dynamic strings (not a fixed list); these are called
*blueprint stems*. For example, to format a number with 2-3 significant *blueprint stems*. For example, to format a number with 2-3 significant
digits, you could use the following stem: digits, you could use the following stem:
@ -39,28 +50,28 @@ Options](#skeleton-stems-and-options).
## Examples ## Examples
| Skeleton | Input | en-US Output | Comments | | Long Skeleton | Concise Skeleton | Input | en-US Output | Comments |
|---|---|---|---| |---|---|---|---|---|
| `percent` | 25 | 25% | | `percent` | `%` | 25 | 25% |
| `.00` | 25 | 25.00 | Equivalent to Precision::fixedFraction(2) | | `.00` | `.00` | 25 | 25.00 | Equivalent to Precision::fixedFraction(2) |
| `percent .00` | 25 | 25.00% | | `percent .00` | `% .00` | 25 | 25.00% |
| `scale/100` | 0.3 | 30 | Multiply by 100 before formatting | | `scale/100` | `scale/100` | 0.3 | 30 | Multiply by 100 before formatting |
| `percent scale/100` | 0.3 | 30% | | `percent scale/100` | `%x100` | 0.3 | 30% |
| `measure-unit/length-meter` | 5 | 5 m | UnitWidth defaults to Short | | `measure-unit/length-meter` | `unit/meter` | 5 | 5 m | UnitWidth defaults to Short |
| `measure-unit/length-meter` <br/> `unit-width-full-name` | 5 | 5 meters | | `measure-unit/length-meter` <br/> `unit-width-full-name` | `unit/meter` <br/> `unit-width-full-name` | 5 | 5 meters |
| `currency/CAD` | 10 | CA$10.00 | | `currency/CAD` | `currency/CAD` | 10 | CA$10.00 |
| `currency/CAD` <br/> `unit-width-narrow` | 10 | $10.00 | Use the narrow symbol variant | | `currency/CAD` <br/> `unit-width-narrow` | `currency/CAD` <br/> `unit-width-narrow` | 10 | $10.00 | Use the narrow symbol variant |
| `compact-short` | 5000 | 5K | | `compact-short` | `K` | 5000 | 5K |
| `compact-long` | 5000 | 5 thousand | | `compact-long` | `KK` | 5000 | 5 thousand |
| `compact-short` <br/> `currency/CAD` | 5000 | CA$5K | | `compact-short` <br/> `currency/CAD` | `K currency/CAD` | 5000 | CA$5K |
| - | 5000 | 5,000 | | - | - | 5000 | 5,000 |
| `group-min2` | 5000 | 5000 | Require 2 digits in group for separator | | `group-min2` | `,?` | 5000 | 5000 | Require 2 digits in group for separator |
| `group-min2` | 15000 | 15,000 | | `group-min2` | `,?` | 15000 | 15,000 |
| `sign-always` | 60 | +60 | Show sign on all numbers | | `sign-always` | `+!` | 60 | +60 | Show sign on all numbers |
| `sign-always` | 0 | +0 | | `sign-always` | `+!` | 0 | +0 |
| `sign-except-zero` | 60 | +60 | Show sign on all numbers except 0 | | `sign-except-zero` | `+?` | 60 | +60 | Show sign on all numbers except 0 |
| `sign-except-zero` | 0 | 0 | | `sign-except-zero` | `+?` | 0 | 0 |
| `sign-accounting` <br/> `currency/CAD` | -40 | (CA$40.00) | | `sign-accounting` <br/> `currency/CAD` | `() currency/CAD` | -40 | (CA$40.00) |
## Skeleton Stems and Options ## Skeleton Stems and Options
@ -69,16 +80,19 @@ below.
### Notation ### Notation
Use one of the following stems to select your notation style: Use one of the following stems to select compact or simple notation:
- `compact-short` - `compact-short` or `K` (concise)
- `compact-long` - `compact-long` or `KK` (concise)
- `scientific` - `notation-simple` (or omit since this is default)
- `engineering`
- `notation-simple`
The skeletons `scientific` and `engineering` take the following optional There are two ways to select scientific or engineering notation: using long-form
options: syntax or concise syntax.
#### Scientific and Engineering Notation: Long Form
Start with the stem `scientific` or `engineering`. Those stems take the
following optional options:
- `/sign-xxx` sets the sign display option for the exponent; see [Sign](#sign). - `/sign-xxx` sets the sign display option for the exponent; see [Sign](#sign).
- `/+ee` sets exponent digits to "at least 2"; use `/+eee` for at least 3 digits, etc. - `/+ee` sets exponent digits to "at least 2"; use `/+eee` for at least 3 digits, etc.
@ -90,16 +104,34 @@ For example, all of the following skeletons are valid:
- `scientific/+ee` - `scientific/+ee`
- `scientific/+ee/sign-always` - `scientific/+ee/sign-always`
#### Scientific and Engineering Notation: Concise Form
The following are examples of concise form:
| Concise Skeleton | Equivalent Long-Form Skeleton |
|---|---|
| `E0` | `scientific` |
| `E00` | `scientific/+ee` |
| `EE+!0` | `engineering/sign-always` |
| `E+?00` | `scientific/sign-except-zero/+ee` |
More precisely:
1. Start with `E` for scientific or `EE` for engineering.
2. Allow either `+!` or `+?` as a concise sign display option for the exponent.
3. Expect one or more `0`s. If more than one, set the minimum number of exponent digits.
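The concise scientific rules can be sketched as a small standalone parser. It follows the `parseScientificStem` logic added in this commit, which accepts `+!` and `+?` as the sign markers, but it uses `std::string` and hypothetical types (`SignDisplay`, `ScientificStem`, `parseConciseScientific`) instead of ICU's internal classes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins for ICU's sign display and notation settings.
enum class SignDisplay { Auto, Always, ExceptZero };

struct ScientificStem {
    bool engineering;
    SignDisplay exponentSign;
    int minExponentDigits;
};

// Returns true and fills `out` on success; returns false on a syntax error.
bool parseConciseScientific(const std::string& s, ScientificStem& out) {
    std::size_t i = 0;
    if (s.empty() || s[i] != 'E') return false;
    ++i;
    ScientificStem result = {false, SignDisplay::Auto, 0};
    // Step 1: a second 'E' selects engineering notation.
    if (i < s.size() && s[i] == 'E') {
        result.engineering = true;
        ++i;
    }
    // Step 2: optional sign marker, either "+!" or "+?".
    if (i < s.size() && s[i] == '+') {
        ++i;
        if (i >= s.size()) return false;
        if (s[i] == '!') {
            result.exponentSign = SignDisplay::Always;
        } else if (s[i] == '?') {
            result.exponentSign = SignDisplay::ExceptZero;
        } else {
            return false;
        }
        ++i;
    }
    // Step 3: one or more '0' characters, and nothing else.
    if (i >= s.size()) return false;
    for (; i < s.size(); ++i) {
        if (s[i] != '0') return false;
        ++result.minExponentDigits;
    }
    out = result;
    return true;
}
```

Under these rules `E0`, `E00`, and `EE+?00` parse successfully, while `E`, `E+0`, and `EEE0` are rejected, matching the valid and invalid token lists in the skeleton tests of this commit.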
### Unit ### Unit
The supported types of units are percent, currency, and measurement units. The supported types of units are percent, currency, and measurement units.
The following skeleton tokens are accepted: The following skeleton tokens are accepted:
- `percent` - `percent` or `%` (concise)
- Special: `%x100` to scale the number by 100 and then format with percent
- `permille` - `permille`
- `base-unit` - `base-unit`
- `currency/XXX` - `currency/XXX`
- `measure-unit/aaaa-bbbb` - `measure-unit/aaaa-bbbb` or `unit/bbb` (concise)
The `percent`, `permille`, and `base-unit` stems do not take any options. The `percent`, `permille`, and `base-unit` stems do not take any options.
@ -110,13 +142,19 @@ The `measure-unit` stem takes one required option: the unit identifier of the
unit to be formatted. The full unit identifier is required: both the type and unit to be formatted. The full unit identifier is required: both the type and
the subtype (for example, `length-meter`). the subtype (for example, `length-meter`).
The `unit` stem is an alternative to `measure-unit` that accepts a core unit
identifier with the subtype but not the type (for example, `meter` instead of
`length-meter`). It also supports variations allowed by UTS 35, including the per unit with the `-per-` infix (for example, `unit/furlong-per-second`).
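To illustrate the `-per-` infix, the sketch below splits a core unit identifier into a numerator and a denominator. It is a simplified assumption about how such an identifier is structured and is not the implementation of `MeasureUnit::parseCoreUnitIdentifier`; `splitPerUnit` is a hypothetical name, and the sketch does not validate unit names.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns {numerator, denominator}. The denominator is
// empty when the identifier contains no "-per-" infix.
std::pair<std::string, std::string> splitPerUnit(const std::string& identifier) {
    const std::string infix = "-per-";
    std::size_t pos = identifier.find(infix);
    if (pos == std::string::npos) {
        return std::make_pair(identifier, std::string());
    }
    return std::make_pair(identifier.substr(0, pos),
                          identifier.substr(pos + infix.size()));
}
```

For example, `furlong-per-second` splits into `furlong` and `second`, and `meter` is returned unchanged with an empty denominator.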
### Per Unit ### Per Unit
To specify a unit to put in the denominator, use the following skeleton token: To specify a unit to put in the denominator, use the following skeleton token.
As with the `measure-unit` stem, pass the unit identifier as the option:
- `per-measure-unit/aaaa-bbbb` - `per-measure-unit/aaaa-bbbb`
As with the `measure-unit` stem, pass the unit identifier as the option. Note that if the `unit` stem is used, the denominator can be placed in the same
token as the numerator.
### Unit Width ### Unit Width
@ -219,20 +257,23 @@ Modes](http://userguide.icu-project.org/formatparse/numbers/rounding-modes).
The following examples show how to specify integer width (minimum or maximum The following examples show how to specify integer width (minimum or maximum
integer digits): integer digits):
| Token | Explanation | Equivalent C++ Code | | Long Form | Concise Form | Explanation | Equivalent C++ Code |
|---|---|---| |---|---|---|---|
| `integer-width/+000` | At least 3 <br/> integer digits | `IntegerWidth::zeroFillTo(3)` | | `integer-width/+000` | `000` | At least 3 <br/> integer digits | `IntegerWidth::zeroFillTo(3)` |
| `integer-width/##0` | Between 1 and 3 <br/> integer digits | `IntegerWidth::zeroFillTo(1)` <br/> `.truncateAt(3)` | | `integer-width/##0` | - | Between 1 and 3 <br/> integer digits | `IntegerWidth::zeroFillTo(1)` <br/> `.truncateAt(3)` |
| `integer-width/00` | Exactly 2 <br/> integer digits | `IntegerWidth::zeroFillTo(2)` <br/> `.truncateAt(2)` | | `integer-width/00` | - | Exactly 2 <br/> integer digits | `IntegerWidth::zeroFillTo(2)` <br/> `.truncateAt(2)` |
| `integer-width/+` | Zero or more <br/> integer digits | `IntegerWidth::zeroFillTo(0)` | | `integer-width/+` | - | Zero or more <br/> integer digits | `IntegerWidth::zeroFillTo(0)` |
The option start with either a single `+` symbols, signaling no limit on the The long-form option starts with either a single `+` symbol, signaling no limit
number of integer digits (no *truncateAt*), or zero or more `#` symbols. It on the number of integer digits (no *truncateAt*), or zero or more `#` symbols.
should then be followed by zero or more `0` symbols, indicating the minimum It should then be followed by zero or more `0` symbols, indicating the minimum
integer digits (the argument to *zeroFillTo*). If there is no `+` symbol, the integer digits (the argument to *zeroFillTo*). If there is no `+` symbol, the
maximum integer digits (the argument to *truncateAt*) is the number of `#` maximum integer digits (the argument to *truncateAt*) is the number of `#`
symbols plus the number of `0` symbols. symbols plus the number of `0` symbols.
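The long-form option grammar can be sketched as follows. This is a standalone reading of the prose rules, not ICU's parser; `IntegerWidthSpec` and `parseIntegerWidthOption` are hypothetical names, and `-1` is used here as an assumed marker for "no maximum".

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for illustration.
struct IntegerWidthSpec {
    int minInt;  // would be the argument to zeroFillTo
    int maxInt;  // would be the argument to truncateAt; -1 means no limit
};

// Parses the option text that follows "integer-width/".
// Returns false if the option does not match the grammar.
bool parseIntegerWidthOption(const std::string& option, IntegerWidthSpec& out) {
    std::size_t i = 0;
    int hashes = 0;
    bool unlimited = false;
    if (i < option.size() && option[i] == '+') {
        // A single leading '+' means no maximum (no truncateAt).
        unlimited = true;
        ++i;
    } else {
        // Otherwise, zero or more '#' symbols.
        while (i < option.size() && option[i] == '#') {
            ++hashes;
            ++i;
        }
    }
    // Then zero or more '0' symbols giving the minimum integer digits.
    int zeros = 0;
    while (i < option.size() && option[i] == '0') {
        ++zeros;
        ++i;
    }
    if (i != option.size()) return false;  // trailing characters are invalid
    out.minInt = zeros;
    out.maxInt = unlimited ? -1 : hashes + zeros;
    return true;
}
```

Under this reading, `+000` gives a minimum of 3 with no maximum, `##0` gives a minimum of 1 and a maximum of 3, and `+0#` is rejected.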
The concise skeleton is simply one or more `0` characters. This supports
minimum integer digits but not maximum integer digits.
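A minimal sketch of the concise form, assuming the token consists only of `0` characters: the count of zeros is the minimum number of integer digits. `parseConciseIntegerWidth` is a hypothetical name, and the `-1` return value is an assumed way to signal a syntax error.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the minimum integer digits encoded by a
// concise token such as "000", or -1 if the token is not all zeros.
int parseConciseIntegerWidth(const std::string& token) {
    if (token.empty()) return -1;
    for (std::size_t i = 0; i < token.size(); ++i) {
        if (token[i] != '0') return -1;
    }
    return static_cast<int>(token.size());
}
```

Here `000` maps to a minimum of 3 integer digits, while `001` and `00+` are rejected.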
### Scale ### Scale
To specify the scale, use the following stem and option: To specify the scale, use the following stem and option:
@ -258,11 +299,11 @@ is able to be parsed by both engines.
The grouping strategy can be specified by the following stems: The grouping strategy can be specified by the following stems:
- `group-off` - `group-off` or `,_` (concise)
- `group-min2` - `group-min2` or `,?` (concise)
- `group-auto` - `group-auto` (or omit since this is the default)
- `group-on-aligned` - `group-on-aligned` or `,!` (concise)
- `group-thousands` - `group-thousands` (no concise equivalent)
For more details, see For more details, see
[UNumberGroupingStrategy](http://icu-project.org/apiref/icu4c/unumberformatter_8h.html). [UNumberGroupingStrategy](http://icu-project.org/apiref/icu4c/unumberformatter_8h.html).
@ -280,13 +321,13 @@ A custom NDecimalFormatSymbols instance is not supported at this time.
The following stems specify sign display: The following stems specify sign display:
- `sign-auto` - `sign-auto` (or omit since this is the default)
- `sign-always` - `sign-always` or `+!` (concise)
- `sign-never` - `sign-never` or `+_` (concise)
- `sign-accounting` - `sign-accounting` or `()` (concise)
- `sign-accounting-always` - `sign-accounting-always` or `()!` (concise)
- `sign-except-zero` - `sign-except-zero` or `+?` (concise)
- `sign-accounting-except-zero` - `sign-accounting-except-zero` or `()?` (concise)
For more details, see For more details, see
[UNumberSignDisplay](http://icu-project.org/apiref/icu4c/unumberformatter_8h.html). [UNumberSignDisplay](http://icu-project.org/apiref/icu4c/unumberformatter_8h.html).


@ -21,6 +21,7 @@
#include "uinvchar.h" #include "uinvchar.h"
#include "charstr.h" #include "charstr.h"
#include "string_segment.h" #include "string_segment.h"
#include "unicode/errorcode.h"
using namespace icu; using namespace icu;
using namespace icu::number; using namespace icu::number;
@ -93,12 +94,29 @@ void U_CALLCONV initNumberSkeletons(UErrorCode& status) {
b.add(u"precision-increment", STEM_PRECISION_INCREMENT, status); b.add(u"precision-increment", STEM_PRECISION_INCREMENT, status);
b.add(u"measure-unit", STEM_MEASURE_UNIT, status); b.add(u"measure-unit", STEM_MEASURE_UNIT, status);
b.add(u"per-measure-unit", STEM_PER_MEASURE_UNIT, status); b.add(u"per-measure-unit", STEM_PER_MEASURE_UNIT, status);
b.add(u"unit", STEM_UNIT, status);
b.add(u"currency", STEM_CURRENCY, status); b.add(u"currency", STEM_CURRENCY, status);
b.add(u"integer-width", STEM_INTEGER_WIDTH, status); b.add(u"integer-width", STEM_INTEGER_WIDTH, status);
b.add(u"numbering-system", STEM_NUMBERING_SYSTEM, status); b.add(u"numbering-system", STEM_NUMBERING_SYSTEM, status);
b.add(u"scale", STEM_SCALE, status); b.add(u"scale", STEM_SCALE, status);
if (U_FAILURE(status)) { return; } if (U_FAILURE(status)) { return; }
// Section 3 (concise tokens):
b.add(u"K", STEM_COMPACT_SHORT, status);
b.add(u"KK", STEM_COMPACT_LONG, status);
b.add(u"%", STEM_PERCENT, status);
b.add(u"%x100", STEM_PERCENT_100, status);
b.add(u",_", STEM_GROUP_OFF, status);
b.add(u",?", STEM_GROUP_MIN2, status);
b.add(u",!", STEM_GROUP_ON_ALIGNED, status);
b.add(u"+!", STEM_SIGN_ALWAYS, status);
b.add(u"+_", STEM_SIGN_NEVER, status);
b.add(u"()", STEM_SIGN_ACCOUNTING, status);
b.add(u"()!", STEM_SIGN_ACCOUNTING_ALWAYS, status);
b.add(u"+?", STEM_SIGN_EXCEPT_ZERO, status);
b.add(u"()?", STEM_SIGN_ACCOUNTING_EXCEPT_ZERO, status);
if (U_FAILURE(status)) { return; }
// Build the CharsTrie // Build the CharsTrie
// TODO: Use SLOW or FAST here? // TODO: Use SLOW or FAST here?
UnicodeString result; UnicodeString result;
@ -529,6 +547,7 @@ MacroProps skeleton::parseSkeleton(
case STATE_INCREMENT_PRECISION: case STATE_INCREMENT_PRECISION:
case STATE_MEASURE_UNIT: case STATE_MEASURE_UNIT:
case STATE_PER_MEASURE_UNIT: case STATE_PER_MEASURE_UNIT:
case STATE_IDENTIFIER_UNIT:
case STATE_CURRENCY_UNIT: case STATE_CURRENCY_UNIT:
case STATE_INTEGER_WIDTH: case STATE_INTEGER_WIDTH:
case STATE_NUMBERING_SYSTEM: case STATE_NUMBERING_SYSTEM:
@ -564,6 +583,14 @@ skeleton::parseStem(const StringSegment& segment, const UCharsTrie& stemTrie, Se
CHECK_NULL(seen, precision, status); CHECK_NULL(seen, precision, status);
blueprint_helpers::parseDigitsStem(segment, macros, status); blueprint_helpers::parseDigitsStem(segment, macros, status);
return STATE_NULL; return STATE_NULL;
case u'E':
CHECK_NULL(seen, notation, status);
blueprint_helpers::parseScientificStem(segment, macros, status);
return STATE_NULL;
case u'0':
CHECK_NULL(seen, integerWidth, status);
blueprint_helpers::parseIntegerStem(segment, macros, status);
return STATE_NULL;
default: default:
break; break;
} }
@ -604,6 +631,13 @@ skeleton::parseStem(const StringSegment& segment, const UCharsTrie& stemTrie, Se
macros.unit = stem_to_object::unit(stem); macros.unit = stem_to_object::unit(stem);
return STATE_NULL; return STATE_NULL;
case STEM_PERCENT_100:
CHECK_NULL(seen, scale, status);
CHECK_NULL(seen, unit, status);
macros.scale = Scale::powerOfTen(2);
macros.unit = NoUnit::percent();
return STATE_NULL;
case STEM_PRECISION_INTEGER: case STEM_PRECISION_INTEGER:
case STEM_PRECISION_UNLIMITED: case STEM_PRECISION_UNLIMITED:
case STEM_PRECISION_CURRENCY_STANDARD: case STEM_PRECISION_CURRENCY_STANDARD:
@ -683,6 +717,11 @@ skeleton::parseStem(const StringSegment& segment, const UCharsTrie& stemTrie, Se
CHECK_NULL(seen, perUnit, status); CHECK_NULL(seen, perUnit, status);
return STATE_PER_MEASURE_UNIT; return STATE_PER_MEASURE_UNIT;
case STEM_UNIT:
CHECK_NULL(seen, unit, status);
CHECK_NULL(seen, perUnit, status);
return STATE_IDENTIFIER_UNIT;
case STEM_CURRENCY: case STEM_CURRENCY:
CHECK_NULL(seen, unit, status); CHECK_NULL(seen, unit, status);
return STATE_CURRENCY_UNIT; return STATE_CURRENCY_UNIT;
@ -719,6 +758,9 @@ ParseState skeleton::parseOption(ParseState stem, const StringSegment& segment,
case STATE_PER_MEASURE_UNIT: case STATE_PER_MEASURE_UNIT:
blueprint_helpers::parseMeasurePerUnitOption(segment, macros, status); blueprint_helpers::parseMeasurePerUnitOption(segment, macros, status);
return STATE_NULL; return STATE_NULL;
case STATE_IDENTIFIER_UNIT:
blueprint_helpers::parseIdentifierUnitOption(segment, macros, status);
return STATE_NULL;
case STATE_INCREMENT_PRECISION: case STATE_INCREMENT_PRECISION:
blueprint_helpers::parseIncrementOption(segment, macros, status); blueprint_helpers::parseIncrementOption(segment, macros, status);
return STATE_NULL; return STATE_NULL;
@ -981,7 +1023,7 @@ void blueprint_helpers::generateMeasureUnitOption(const MeasureUnit& measureUnit
void blueprint_helpers::parseMeasurePerUnitOption(const StringSegment& segment, MacroProps& macros, void blueprint_helpers::parseMeasurePerUnitOption(const StringSegment& segment, MacroProps& macros,
UErrorCode& status) { UErrorCode& status) {
// A little bit of a hack: safe the current unit (numerator), call the main measure unit // A little bit of a hack: save the current unit (numerator), call the main measure unit
// parsing code, put back the numerator unit, and put the new unit into per-unit. // parsing code, put back the numerator unit, and put the new unit into per-unit.
MeasureUnit numerator = macros.unit; MeasureUnit numerator = macros.unit;
parseMeasureUnitOption(segment, macros, status); parseMeasureUnitOption(segment, macros, status);
@ -990,6 +1032,22 @@ void blueprint_helpers::parseMeasurePerUnitOption(const StringSegment& segment,
macros.unit = numerator; macros.unit = numerator;
} }
void blueprint_helpers::parseIdentifierUnitOption(const StringSegment& segment, MacroProps& macros,
UErrorCode& status) {
// Need to do char <-> UChar conversion...
U_ASSERT(U_SUCCESS(status));
CharString buffer;
SKELETON_UCHAR_TO_CHAR(buffer, segment.toTempUnicodeString(), 0, segment.length(), status);
ErrorCode internalStatus;
MeasureUnit::parseCoreUnitIdentifier(buffer.toStringPiece(), &macros.unit, &macros.perUnit, internalStatus);
if (internalStatus.isFailure()) {
// throw new SkeletonSyntaxException("Invalid core unit identifier", segment, e);
status = U_NUMBER_SKELETON_SYNTAX_ERROR;
return;
}
}
void blueprint_helpers::parseFractionStem(const StringSegment& segment, MacroProps& macros, void blueprint_helpers::parseFractionStem(const StringSegment& segment, MacroProps& macros,
UErrorCode& status) { UErrorCode& status) {
U_ASSERT(segment.charAt(0) == u'.'); U_ASSERT(segment.charAt(0) == u'.');
@ -1027,7 +1085,11 @@ void blueprint_helpers::parseFractionStem(const StringSegment& segment, MacroPro
} }
// Use the public APIs to enforce bounds checking // Use the public APIs to enforce bounds checking
if (maxFrac == -1) { if (maxFrac == -1) {
macros.precision = Precision::minFraction(minFrac); if (minFrac == 0) {
macros.precision = Precision::unlimited();
} else {
macros.precision = Precision::minFraction(minFrac);
}
} else { } else {
macros.precision = Precision::minMaxFraction(minFrac, maxFrac); macros.precision = Precision::minMaxFraction(minFrac, maxFrac);
} }
@ -1051,9 +1113,9 @@ blueprint_helpers::generateFractionStem(int32_t minFrac, int32_t maxFrac, Unicod
void void
blueprint_helpers::parseDigitsStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) { blueprint_helpers::parseDigitsStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) {
U_ASSERT(segment.charAt(0) == u'@'); U_ASSERT(segment.charAt(0) == u'@');
int offset = 0; int32_t offset = 0;
int minSig = 0; int32_t minSig = 0;
int maxSig; int32_t maxSig;
for (; offset < segment.length(); offset++) { for (; offset < segment.length(); offset++) {
if (segment.charAt(offset) == u'@') { if (segment.charAt(offset) == u'@') {
minSig++; minSig++;
@ -1101,6 +1163,75 @@ blueprint_helpers::generateDigitsStem(int32_t minSig, int32_t maxSig, UnicodeStr
} }
} }
void blueprint_helpers::parseScientificStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) {
U_ASSERT(segment.charAt(0) == u'E');
{
int32_t offset = 1;
if (segment.length() == offset) {
goto fail;
}
bool isEngineering = false;
if (segment.charAt(offset) == u'E') {
isEngineering = true;
offset++;
if (segment.length() == offset) {
goto fail;
}
}
UNumberSignDisplay signDisplay = UNUM_SIGN_AUTO;
if (segment.charAt(offset) == u'+') {
offset++;
if (segment.length() == offset) {
goto fail;
}
if (segment.charAt(offset) == u'!') {
signDisplay = UNUM_SIGN_ALWAYS;
} else if (segment.charAt(offset) == u'?') {
signDisplay = UNUM_SIGN_EXCEPT_ZERO;
} else {
goto fail;
}
offset++;
if (segment.length() == offset) {
goto fail;
}
}
int32_t minDigits = 0;
for (; offset < segment.length(); offset++) {
if (segment.charAt(offset) != u'0') {
goto fail;
}
minDigits++;
}
macros.notation = (isEngineering ? Notation::engineering() : Notation::scientific())
.withExponentSignDisplay(signDisplay)
.withMinExponentDigits(minDigits);
return;
}
fail: void();
// throw new SkeletonSyntaxException("Invalid scientific stem", segment);
status = U_NUMBER_SKELETON_SYNTAX_ERROR;
return;
}
void blueprint_helpers::parseIntegerStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) {
U_ASSERT(segment.charAt(0) == u'0');
int32_t offset = 1;
for (; offset < segment.length(); offset++) {
if (segment.charAt(offset) != u'0') {
offset--;
break;
}
}
if (offset < segment.length()) {
// throw new SkeletonSyntaxException("Invalid integer stem", segment);
status = U_NUMBER_SKELETON_SYNTAX_ERROR;
return;
}
macros.integerWidth = IntegerWidth::zeroFillTo(offset);
return;
}
bool blueprint_helpers::parseFracSigOption(const StringSegment& segment, MacroProps& macros, bool blueprint_helpers::parseFracSigOption(const StringSegment& segment, MacroProps& macros,
UErrorCode& status) { UErrorCode& status) {
if (segment.charAt(0) != u'@') { if (segment.charAt(0) != u'@') {


@ -46,6 +46,7 @@ enum ParseState {
STATE_INCREMENT_PRECISION, STATE_INCREMENT_PRECISION,
STATE_MEASURE_UNIT, STATE_MEASURE_UNIT,
STATE_PER_MEASURE_UNIT, STATE_PER_MEASURE_UNIT,
STATE_IDENTIFIER_UNIT,
STATE_CURRENCY_UNIT, STATE_CURRENCY_UNIT,
STATE_INTEGER_WIDTH, STATE_INTEGER_WIDTH,
STATE_NUMBERING_SYSTEM, STATE_NUMBERING_SYSTEM,
@ -71,6 +72,7 @@ enum StemEnum {
STEM_BASE_UNIT, STEM_BASE_UNIT,
STEM_PERCENT, STEM_PERCENT,
STEM_PERMILLE, STEM_PERMILLE,
STEM_PERCENT_100, // concise-only
STEM_PRECISION_INTEGER, STEM_PRECISION_INTEGER,
STEM_PRECISION_UNLIMITED, STEM_PRECISION_UNLIMITED,
STEM_PRECISION_CURRENCY_STANDARD, STEM_PRECISION_CURRENCY_STANDARD,
@ -109,6 +111,7 @@ enum StemEnum {
STEM_PRECISION_INCREMENT, STEM_PRECISION_INCREMENT,
STEM_MEASURE_UNIT, STEM_MEASURE_UNIT,
STEM_PER_MEASURE_UNIT, STEM_PER_MEASURE_UNIT,
STEM_UNIT,
STEM_CURRENCY, STEM_CURRENCY,
STEM_INTEGER_WIDTH, STEM_INTEGER_WIDTH,
STEM_NUMBERING_SYSTEM, STEM_NUMBERING_SYSTEM,
@ -226,6 +229,8 @@ void generateMeasureUnitOption(const MeasureUnit& measureUnit, UnicodeString& sb
void parseMeasurePerUnitOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status); void parseMeasurePerUnitOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
void parseIdentifierUnitOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
void parseFractionStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status); void parseFractionStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
void generateFractionStem(int32_t minFrac, int32_t maxFrac, UnicodeString& sb, UErrorCode& status); void generateFractionStem(int32_t minFrac, int32_t maxFrac, UnicodeString& sb, UErrorCode& status);
@ -234,6 +239,14 @@ void parseDigitsStem(const StringSegment& segment, MacroProps& macros, UErrorCod
void generateDigitsStem(int32_t minSig, int32_t maxSig, UnicodeString& sb, UErrorCode& status); void generateDigitsStem(int32_t minSig, int32_t maxSig, UnicodeString& sb, UErrorCode& status);
void parseScientificStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
// Note: no generateScientificStem since this syntax was added later in ICU 67
void parseIntegerStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
// Note: no generateIntegerStem since this syntax was added later in ICU 67
/** @return Whether we successfully found and parsed a frac-sig option. */ /** @return Whether we successfully found and parsed a frac-sig option. */
bool parseFracSigOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status); bool parseFracSigOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status);


@ -114,16 +114,38 @@ class NumberFormatterApiTest : public IntlTestWithFieldPosition {
DecimalFormatSymbols SWISS_SYMBOLS; DecimalFormatSymbols SWISS_SYMBOLS;
DecimalFormatSymbols MYANMAR_SYMBOLS; DecimalFormatSymbols MYANMAR_SYMBOLS;
void assertFormatDescending(const char16_t* message, const char16_t* skeleton, /**
const UnlocalizedNumberFormatter& f, Locale locale, ...); * skeleton is the full length skeleton, which must round-trip.
*
* conciseSkeleton should be the shortest available skeleton.
* The concise skeleton can be read but not printed.
*/
void assertFormatDescending(
const char16_t* message,
const char16_t* skeleton,
const char16_t* conciseSkeleton,
const UnlocalizedNumberFormatter& f,
Locale locale,
...);
void assertFormatDescendingBig(const char16_t* message, const char16_t* skeleton, /** See notes above regarding skeleton vs conciseSkeleton */
const UnlocalizedNumberFormatter& f, Locale locale, ...); void assertFormatDescendingBig(
const char16_t* message,
const char16_t* skeleton,
const char16_t* conciseSkeleton,
const UnlocalizedNumberFormatter& f,
Locale locale,
...);
FormattedNumber /** See notes above regarding skeleton vs conciseSkeleton */
assertFormatSingle(const char16_t* message, const char16_t* skeleton, FormattedNumber assertFormatSingle(
const UnlocalizedNumberFormatter& f, Locale locale, double input, const char16_t* message,
const UnicodeString& expected); const char16_t* skeleton,
const char16_t* conciseSkeleton,
const UnlocalizedNumberFormatter& f,
Locale locale,
double input,
const UnicodeString& expected);
void assertUndefinedSkeleton(const UnlocalizedNumberFormatter& f); void assertUndefinedSkeleton(const UnlocalizedNumberFormatter& f);

File diff suppressed because it is too large


@ -70,6 +70,7 @@ void NumberSkeletonTest::validTokens() {
u"measure-unit/length-meter", u"measure-unit/length-meter",
u"measure-unit/area-square-meter", u"measure-unit/area-square-meter",
u"measure-unit/energy-joule per-measure-unit/length-meter", u"measure-unit/energy-joule per-measure-unit/length-meter",
u"unit/square-meter-per-square-meter",
u"currency/XXX", u"currency/XXX",
u"currency/ZZZ", u"currency/ZZZ",
u"currency/usd", u"currency/usd",
@ -105,7 +106,20 @@ void NumberSkeletonTest::validTokens() {
u"numbering-system/latn", u"numbering-system/latn",
u"precision-integer/@##", u"precision-integer/@##",
u"precision-integer rounding-mode-ceiling", u"precision-integer rounding-mode-ceiling",
u"precision-currency-cash rounding-mode-ceiling"}; u"precision-currency-cash rounding-mode-ceiling",
u"0",
u"00",
u"000",
u"E0",
u"E00",
u"E000",
u"EE0",
u"EE00",
u"EE+?0",
u"EE+?00",
u"EE+!0",
u"EE+!00",
};
for (auto& cas : cases) { for (auto& cas : cases) {
UnicodeString skeletonString(cas); UnicodeString skeletonString(cas);
@ -151,7 +165,20 @@ void NumberSkeletonTest::invalidTokens() {
u"integer-width/+0#", u"integer-width/+0#",
u"integer-width/+#", u"integer-width/+#",
u"integer-width/+#0", u"integer-width/+#0",
u"scientific/foo"}; u"scientific/foo",
u"E",
u"E1",
u"E+",
u"E+?",
u"E+!",
u"E+0",
u"EE",
u"EE+",
u"EEE",
u"EEE0",
u"001",
u"00+",
};
expectedErrorSkeleton(cases, UPRV_LENGTHOF(cases)); expectedErrorSkeleton(cases, UPRV_LENGTHOF(cases));
} }