ICU-20418 Adding concise number skeletons in ICU4C

2019-12-11 20:45:54 -08:00 · 2019-12-11 20:45:54 -08:00 · fe98d870b2
commit fe98d870b2
parent df8841aa6f
6 changed files with 604 additions and 107 deletions
--- a/docs/userguide/format_parse/numbers/skeletons.md
+++ b/docs/userguide/format_parse/numbers/skeletons.md
@ -9,16 +9,24 @@ Number Skeletons
 Number skeletons are a locale-agnostic way to configure a NumberFormatter in
 ICU.  Number skeletons work in MessageFormat.

-Number skeletons consist of *space-separated tokens* that correspond to
-settings in ICU NumberFormatter.  For example, to format a currency in compact
-notation, you could use this skeleton:
+Number skeletons consist of case-sensitive tokens that correspond to settings
+in ICU NumberFormatter.  For example, to format a currency in compact notation
+with the sign always shown, you could use this skeleton:

-    compact-short currency/GBP
+    sign-always compact-short currency/GBP
+
+***Since ICU 67***, you can also use more concise syntax:
+
+    +! K currency/GBP

 To use a skeleton in MessageFormat, use the "number" type and prefix the
 skeleton with `::`

-    {0, number, ::compact-short currency/GBP}
+    {0, number, :: +! K currency/GBP}
+
+The ICU `toSkeleton()` API outputs the long-form skeletons, but all parts of
+ICU that read user-specified number skeletons accept both long-form and
+concise skeletons.

 ## Syntax

@ -27,6 +35,9 @@ occurs before the first "/" character in a token, and the options are each of
 the subsequent "/"-delimited strings.  For example, "compact-short" and
 "currency" are stems, and "GBP" is an option.

+Tokens are space-separated, with exceptions for concise skeletons listed at
+the end of this document.
+
 Stems might also be dynamic strings (not a fixed list); these are called
 *blueprint stems*.  For example, to format a number with 2-3 significant
 digits, you could use the following stem:
@ -39,28 +50,28 @@ Options](#skeleton-stems-and-options).

 ## Examples

-| Skeleton | Input | en-US Output | Comments |
-|---|---|---|---|
-| `percent` | 25 | 25% |
-| `.00` | 25 | 25.00 | Equivalent to Precision::fixedFraction(2) |
-| `percent .00` | 25 | 25.00% |
-| `scale/100` | 0.3 | 30 | Multiply by 100 before formatting |
-| `percent scale/100` | 0.3 | 30% |
-| `measure-unit/length-meter` | 5 | 5 m | UnitWidth defaults to Short |
-| `measure-unit/length-meter` <br/> `unit-width-full-name` | 5 | 5 meters |
-| `currency/CAD` | 10 | CA$10.00 |
-| `currency/CAD` <br/> `unit-width-narrow` | 10 | $10.00 | Use the narrow symbol variant |
-| `compact-short` | 5000 | 5K |
-| `compact-long` | 5000 | 5 thousand |
-| `compact-short` <br/> `currency/CAD` | 5000 | CA$5K |
-| - | 5000 | 5,000 |
-| `group-min2` | 5000 | 5000 | Require 2 digits in group for separator |
-| `group-min2` | 15000 | 15,000 |
-| `sign-always` | 60 | +60 | Show sign on all numbers |
-| `sign-always` | 0 | +0 |
-| `sign-except-zero` | 60 | +60 | Show sign on all numbers except 0 |
-| `sign-except-zero` | 0 | 0 |
-| `sign-accounting` <br/> `currency/CAD` | -40 | (CA$40.00) |
+| Long Skeleton | Concise Skeleton | Input | en-US Output | Comments |
+|---|---|---|---|---|
+| `percent` | `%` | 25 | 25% |
+| `.00` | `.00` | 25 | 25.00 | Equivalent to Precision::fixedFraction(2) |
+| `percent .00` | `% .00` | 25 | 25.00% |
+| `scale/100` | `scale/100` | 0.3 | 30 | Multiply by 100 before formatting |
+| `percent scale/100` | `%x100` | 0.3 | 30% |
+| `measure-unit/length-meter` | `unit/meter` | 5 | 5 m | UnitWidth defaults to Short |
+| `measure-unit/length-meter` <br/> `unit-width-full-name` | `unit/meter` <br/> `unit-width-full-name` | 5 | 5 meters |
+| `currency/CAD` | `currency/CAD` | 10 | CA$10.00 |
+| `currency/CAD` <br/> `unit-width-narrow` | `currency/CAD` <br/> `unit-width-narrow` | 10 | $10.00 | Use the narrow symbol variant |
+| `compact-short` | `K` | 5000 | 5K |
+| `compact-long` | `KK` | 5000 | 5 thousand |
+| `compact-short` <br/> `currency/CAD` | `K currency/CAD` | 5000 | CA$5K |
+| - | - | 5000 | 5,000 |
+| `group-min2` | `,?` | 5000 | 5000 | Require 2 digits in group for separator |
+| `group-min2` | `,?` | 15000 | 15,000 |
+| `sign-always` | `+!` | 60 | +60 | Show sign on all numbers |
+| `sign-always` | `+!` | 0 | +0 |
+| `sign-except-zero` | `+?` | 60 | +60 | Show sign on all numbers except 0 |
+| `sign-except-zero` | `+?` | 0 | 0 |
+| `sign-accounting` <br/> `currency/CAD` | `() currency/CAD` | -40 | (CA$40.00) |

 ## Skeleton Stems and Options

@ -69,16 +80,19 @@ below.

 ### Notation

-Use one of the following stems to select your notation style:
+Use one of the following stems to select compact or simple notation:

- `compact-short`
- `compact-long`
- `scientific`
- `engineering`
- `notation-simple`
+- `compact-short` or `K` (concise)
+- `compact-long` or `KK` (concise)
+- `notation-simple` (or omit since this is the default)

-The skeletons `scientific` and `engineering` take the following optional
-options:
+There are two ways to select scientific or engineering notation: using long-form
+syntax or concise syntax.
+
+#### Scientific and Engineering Notation: Long Form
+
+Start with the stem `scientific` or `engineering`.  Both stems accept the
+following options, all of which are optional:

 - `/sign-xxx` sets the sign display option for the exponent; see [Sign](#sign).
 - `/+ee` sets exponent digits to "at least 2"; use `/+eee` for at least 3 digits, etc.
@ -90,16 +104,34 @@ For example, all of the following skeletons are valid:
 - `scientific/+ee`
 - `scientific/+ee/sign-always`

+#### Scientific and Engineering Notation: Concise Form
+
+The following are examples of concise form:
+
+| Concise Skeleton | Equivalent Long-Form Skeleton |
+|---|---|
+| `E0` | `scientific` |
+| `E00` | `scientific/+ee` |
+| `EE+!0` | `engineering/sign-always` |
+| `E+?00` | `scientific/sign-except-zero/+ee` |
+
+More precisely:
+
+1. Start with `E` for scientific or `EE` for engineering.
+2. Allow either `+!` or `+?` as a concise sign display option.
+3. Expect one or more `0`s.  If more than one, set minimum exponent digits.
+
 ### Unit

 The supported types of units are percent, currency, and measurement units.
 The following skeleton tokens are accepted:

- `percent`
+- `percent` or `%` (concise)
+- Special: `%x100` to scale the number by 100 and then format with percent
 - `permille`
 - `base-unit`
 - `currency/XXX`
- `measure-unit/aaaa-bbbb`
+- `measure-unit/aaaa-bbbb` or `unit/bbb` (concise)

 The `percent`, `permille`, and `base-unit` stems do not take any options.

@ -110,13 +142,19 @@ The `measure-unit` stem takes one required option: the unit identifier of the
 unit to be formatted.  The full unit identifier is required: both the type and
 the subtype (for example, `length-meter`).

+The `unit` stem is an alternative to `measure-unit` that accepts a core unit
+identifier with the subtype but not the type (for example, `meter` instead of
+`length-meter`).  It also supports variations allowed by UTS 35, including a per-unit denominator written with the `-per-` infix (for example, `unit/furlong-per-second`).
+
 ### Per Unit

-To specify a unit to put in the denominator, use the following skeleton token:
+To specify a unit to put in the denominator, use the following skeleton token.
+As with the `measure-unit` stem, pass the unit identifier as the option:

 - `per-measure-unit/aaaa-bbbb`

-As with the `measure-unit` stem, pass the unit identifier as the option.
+Note that if the `unit` stem is used, the denominator can be placed in the same
+token as the numerator.

 ### Unit Width

@ -219,20 +257,23 @@ Modes](http://userguide.icu-project.org/formatparse/numbers/rounding-modes).
 The following examples show how to specify integer width (minimum or maximum
 integer digits):

-| Token | Explanation | Equivalent C++ Code |
-|---|---|---|
-| `integer-width/+000` | At least 3 <br/> integer digits | `IntegerWidth::zeroFillTo(3)` |
-| `integer-width/##0` | Between 1 and 3 <br/> integer digits | `IntegerWidth::zeroFillTo(1)` <br/> `.truncateAt(3)`
-| `integer-width/00` | Exactly 2 <br/> integer digits | `IntegerWidth::zeroFillTo(2)` <br/> `.truncateAt(2)` |
-| `integer-width/+` | Zero or more <br/> integer digits | `IntegerWidth::zeroFillTo(0) `
+| Long Form | Concise Form | Explanation | Equivalent C++ Code |
+|---|---|---|---|
+| `integer-width/+000` | `000` | At least 3 <br/> integer digits | `IntegerWidth::zeroFillTo(3)` |
+| `integer-width/##0` | - | Between 1 and 3 <br/> integer digits | `IntegerWidth::zeroFillTo(1)` <br/> `.truncateAt(3)` |
+| `integer-width/00` | - | Exactly 2 <br/> integer digits | `IntegerWidth::zeroFillTo(2)` <br/> `.truncateAt(2)` |
+| `integer-width/+` | - | Zero or more <br/> integer digits | `IntegerWidth::zeroFillTo(0)` |

-The option start with either a single `+` symbols, signaling no limit on the
-number of integer digits (no *truncateAt*), or zero or more `#` symbols.  It
-should then be followed by zero or more `0` symbols, indicating the minimum
+The long-form option starts with either a single `+` symbol, signaling no limit
+on the number of integer digits (no *truncateAt*), or zero or more `#` symbols.
+It should then be followed by zero or more `0` symbols, indicating the minimum
 integer digits (the argument to *zeroFillTo*).  If there is no `+` symbol, the
 maximum integer digits (the argument to *truncateAt*) is the number of `#`
 symbols plus the number of `0` symbols.

+The concise skeleton is simply one or more `0` characters. This supports
+minimum integer digits but not maximum integer digits.
+
 ### Scale

 To specify the scale, use the following stem and option:
@ -258,11 +299,11 @@ is able to be parsed by both engines.

 The grouping strategy can be specified by the following stems:

- `group-off`
- `group-min2`
- `group-auto`
- `group-on-aligned`
- `group-thousands`
+- `group-off` or `,_` (concise)
+- `group-min2` or `,?` (concise)
+- `group-auto` (or omit since this is the default)
+- `group-on-aligned` or `,!` (concise)
+- `group-thousands` (no concise equivalent)

 For more details, see
 [UNumberGroupingStrategy](http://icu-project.org/apiref/icu4c/unumberformatter_8h.html).
@ -280,13 +321,13 @@ A custom NDecimalFormatSymbols instance is not supported at this time.

 The following stems specify sign display:

- `sign-auto`
- `sign-always`
- `sign-never`
- `sign-accounting`
- `sign-accounting-always`
- `sign-except-zero`
- `sign-accounting-except-zero`
+- `sign-auto` (or omit since this is the default)
+- `sign-always` or `+!` (concise)
+- `sign-never` or `+_` (concise)
+- `sign-accounting` or `()` (concise)
+- `sign-accounting-always` or `()!` (concise)
+- `sign-except-zero` or `+?` (concise)
+- `sign-accounting-except-zero` or `()?` (concise)

 For more details, see
 [UNumberSignDisplay](http://icu-project.org/apiref/icu4c/unumberformatter_8h.html).
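
A minimal sketch of how the fixed concise tokens documented above relate to their long-form stems, assuming a plain whitespace tokenizer. `expandConciseSkeleton` is a hypothetical helper and not an ICU API; the stem table in `number_skeletons.cpp` maps both spellings to the same stem enum rather than rewriting strings. Dynamic concise stems such as `E0`, `000`, and `unit/meter` are passed through unchanged here.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of ICU): rewrites each fixed concise token
// into the long-form token(s) listed in the documentation. Tokens that are
// not in the table are copied through as-is.
std::string expandConciseSkeleton(const std::string& skeleton) {
    static const std::unordered_map<std::string, std::string> kLongForm = {
        {"K", "compact-short"},
        {"KK", "compact-long"},
        {"%", "percent"},
        {"%x100", "percent scale/100"},
        {",_", "group-off"},
        {",?", "group-min2"},
        {",!", "group-on-aligned"},
        {"+!", "sign-always"},
        {"+_", "sign-never"},
        {"()", "sign-accounting"},
        {"()!", "sign-accounting-always"},
        {"+?", "sign-except-zero"},
        {"()?", "sign-accounting-except-zero"},
    };

    std::istringstream tokens(skeleton);
    std::string token;
    std::string result;
    while (tokens >> token) {  // split on whitespace
        if (!result.empty()) {
            result += ' ';
        }
        auto it = kLongForm.find(token);
        result += (it != kLongForm.end()) ? it->second : token;
    }
    return result;
}
```

With this table, `+! K currency/GBP` expands to `sign-always compact-short currency/GBP`, the pair of skeletons shown at the top of the documentation.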
--- a/icu4c/source/i18n/number_skeletons.cpp
+++ b/icu4c/source/i18n/number_skeletons.cpp
@ -21,6 +21,7 @@
 #include "uinvchar.h"
 #include "charstr.h"
 #include "string_segment.h"
+#include "unicode/errorcode.h"

 using namespace icu;
 using namespace icu::number;
@ -93,12 +94,29 @@ void U_CALLCONV initNumberSkeletons(UErrorCode& status) {
    b.add(u"precision-increment", STEM_PRECISION_INCREMENT, status);
    b.add(u"measure-unit", STEM_MEASURE_UNIT, status);
    b.add(u"per-measure-unit", STEM_PER_MEASURE_UNIT, status);
+    b.add(u"unit", STEM_UNIT, status);
    b.add(u"currency", STEM_CURRENCY, status);
    b.add(u"integer-width", STEM_INTEGER_WIDTH, status);
    b.add(u"numbering-system", STEM_NUMBERING_SYSTEM, status);
    b.add(u"scale", STEM_SCALE, status);
    if (U_FAILURE(status)) { return; }

+    // Section 3 (concise tokens):
+    b.add(u"K", STEM_COMPACT_SHORT, status);
+    b.add(u"KK", STEM_COMPACT_LONG, status);
+    b.add(u"%", STEM_PERCENT, status);
+    b.add(u"%x100", STEM_PERCENT_100, status);
+    b.add(u",_", STEM_GROUP_OFF, status);
+    b.add(u",?", STEM_GROUP_MIN2, status);
+    b.add(u",!", STEM_GROUP_ON_ALIGNED, status);
+    b.add(u"+!", STEM_SIGN_ALWAYS, status);
+    b.add(u"+_", STEM_SIGN_NEVER, status);
+    b.add(u"()", STEM_SIGN_ACCOUNTING, status);
+    b.add(u"()!", STEM_SIGN_ACCOUNTING_ALWAYS, status);
+    b.add(u"+?", STEM_SIGN_EXCEPT_ZERO, status);
+    b.add(u"()?", STEM_SIGN_ACCOUNTING_EXCEPT_ZERO, status);
+    if (U_FAILURE(status)) { return; }
+
    // Build the CharsTrie
    // TODO: Use SLOW or FAST here?
    UnicodeString result;
@ -529,6 +547,7 @@ MacroProps skeleton::parseSkeleton(
                case STATE_INCREMENT_PRECISION:
                case STATE_MEASURE_UNIT:
                case STATE_PER_MEASURE_UNIT:
+                case STATE_IDENTIFIER_UNIT:
                case STATE_CURRENCY_UNIT:
                case STATE_INTEGER_WIDTH:
                case STATE_NUMBERING_SYSTEM:
@ -564,6 +583,14 @@ skeleton::parseStem(const StringSegment& segment, const UCharsTrie& stemTrie, Se
        CHECK_NULL(seen, precision, status);
            blueprint_helpers::parseDigitsStem(segment, macros, status);
            return STATE_NULL;
+        case u'E':
+        CHECK_NULL(seen, notation, status);
+            blueprint_helpers::parseScientificStem(segment, macros, status);
+            return STATE_NULL;
+        case u'0':
+        CHECK_NULL(seen, integerWidth, status);
+            blueprint_helpers::parseIntegerStem(segment, macros, status);
+            return STATE_NULL;
        default:
            break;
    }
@ -604,6 +631,13 @@ skeleton::parseStem(const StringSegment& segment, const UCharsTrie& stemTrie, Se
            macros.unit = stem_to_object::unit(stem);
            return STATE_NULL;

+        case STEM_PERCENT_100:
+        CHECK_NULL(seen, scale, status);
+        CHECK_NULL(seen, unit, status);
+            macros.scale = Scale::powerOfTen(2);
+            macros.unit = NoUnit::percent();
+            return STATE_NULL;
+
        case STEM_PRECISION_INTEGER:
        case STEM_PRECISION_UNLIMITED:
        case STEM_PRECISION_CURRENCY_STANDARD:
@ -683,6 +717,11 @@ skeleton::parseStem(const StringSegment& segment, const UCharsTrie& stemTrie, Se
        CHECK_NULL(seen, perUnit, status);
            return STATE_PER_MEASURE_UNIT;

+        case STEM_UNIT:
+        CHECK_NULL(seen, unit, status);
+        CHECK_NULL(seen, perUnit, status);
+            return STATE_IDENTIFIER_UNIT;
+
        case STEM_CURRENCY:
        CHECK_NULL(seen, unit, status);
            return STATE_CURRENCY_UNIT;
@ -719,6 +758,9 @@ ParseState skeleton::parseOption(ParseState stem, const StringSegment& segment,
        case STATE_PER_MEASURE_UNIT:
            blueprint_helpers::parseMeasurePerUnitOption(segment, macros, status);
            return STATE_NULL;
+        case STATE_IDENTIFIER_UNIT:
+            blueprint_helpers::parseIdentifierUnitOption(segment, macros, status);
+            return STATE_NULL;
        case STATE_INCREMENT_PRECISION:
            blueprint_helpers::parseIncrementOption(segment, macros, status);
            return STATE_NULL;
@ -981,7 +1023,7 @@ void blueprint_helpers::generateMeasureUnitOption(const MeasureUnit& measureUnit

 void blueprint_helpers::parseMeasurePerUnitOption(const StringSegment& segment, MacroProps& macros,
                                                  UErrorCode& status) {
-    // A little bit of a hack: safe the current unit (numerator), call the main measure unit
+    // A little bit of a hack: save the current unit (numerator), call the main measure unit
    // parsing code, put back the numerator unit, and put the new unit into per-unit.
    MeasureUnit numerator = macros.unit;
    parseMeasureUnitOption(segment, macros, status);
@ -990,6 +1032,22 @@ void blueprint_helpers::parseMeasurePerUnitOption(const StringSegment& segment,
    macros.unit = numerator;
 }

+void blueprint_helpers::parseIdentifierUnitOption(const StringSegment& segment, MacroProps& macros,
+                                                  UErrorCode& status) {
+    // Need to do char <-> UChar conversion...
+    U_ASSERT(U_SUCCESS(status));
+    CharString buffer;
+    SKELETON_UCHAR_TO_CHAR(buffer, segment.toTempUnicodeString(), 0, segment.length(), status);
+
+    ErrorCode internalStatus;
+    MeasureUnit::parseCoreUnitIdentifier(buffer.toStringPiece(), &macros.unit, &macros.perUnit, internalStatus);
+    if (internalStatus.isFailure()) {
+        // throw new SkeletonSyntaxException("Invalid core unit identifier", segment, e);
+        status = U_NUMBER_SKELETON_SYNTAX_ERROR;
+        return;
+    }
+}
+
 void blueprint_helpers::parseFractionStem(const StringSegment& segment, MacroProps& macros,
                                          UErrorCode& status) {
    U_ASSERT(segment.charAt(0) == u'.');
@ -1027,7 +1085,11 @@ void blueprint_helpers::parseFractionStem(const StringSegment& segment, MacroPro
    }
    // Use the public APIs to enforce bounds checking
    if (maxFrac == -1) {
-        macros.precision = Precision::minFraction(minFrac);
+        if (minFrac == 0) {
+            macros.precision = Precision::unlimited();
+        } else {
+            macros.precision = Precision::minFraction(minFrac);
+        }
    } else {
        macros.precision = Precision::minMaxFraction(minFrac, maxFrac);
    }
@ -1051,9 +1113,9 @@ blueprint_helpers::generateFractionStem(int32_t minFrac, int32_t maxFrac, Unicod
 void
 blueprint_helpers::parseDigitsStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) {
    U_ASSERT(segment.charAt(0) == u'@');
-    int offset = 0;
-    int minSig = 0;
-    int maxSig;
+    int32_t offset = 0;
+    int32_t minSig = 0;
+    int32_t maxSig;
    for (; offset < segment.length(); offset++) {
        if (segment.charAt(offset) == u'@') {
            minSig++;
@ -1101,6 +1163,75 @@ blueprint_helpers::generateDigitsStem(int32_t minSig, int32_t maxSig, UnicodeStr
    }
 }

+void blueprint_helpers::parseScientificStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) {
+    U_ASSERT(segment.charAt(0) == u'E');
+    {
+        int32_t offset = 1;
+        if (segment.length() == offset) {
+            goto fail;
+        }
+        bool isEngineering = false;
+        if (segment.charAt(offset) == u'E') {
+            isEngineering = true;
+            offset++;
+            if (segment.length() == offset) {
+                goto fail;
+            }
+        }
+        UNumberSignDisplay signDisplay = UNUM_SIGN_AUTO;
+        if (segment.charAt(offset) == u'+') {
+            offset++;
+            if (segment.length() == offset) {
+                goto fail;
+            }
+            if (segment.charAt(offset) == u'!') {
+                signDisplay = UNUM_SIGN_ALWAYS;
+            } else if (segment.charAt(offset) == u'?') {
+                signDisplay = UNUM_SIGN_EXCEPT_ZERO;
+            } else {
+                goto fail;
+            }
+            offset++;
+            if (segment.length() == offset) {
+                goto fail;
+            }
+        }
+        int32_t minDigits = 0;
+        for (; offset < segment.length(); offset++) {
+            if (segment.charAt(offset) != u'0') {
+                goto fail;
+            }
+            minDigits++;
+        }
+        macros.notation = (isEngineering ? Notation::engineering() : Notation::scientific())
+            .withExponentSignDisplay(signDisplay)
+            .withMinExponentDigits(minDigits);
+        return;
+    }
+    fail: void();
+    // throw new SkeletonSyntaxException("Invalid scientific stem", segment);
+    status = U_NUMBER_SKELETON_SYNTAX_ERROR;
+    return;
+}
+
+void blueprint_helpers::parseIntegerStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status) {
+    U_ASSERT(segment.charAt(0) == u'0');
+    int32_t offset = 1;
+    for (; offset < segment.length(); offset++) {
+        if (segment.charAt(offset) != u'0') {
+            offset--;
+            break;
+        }
+    }
+    if (offset < segment.length()) {
+        // throw new SkeletonSyntaxException("Invalid integer stem", segment);
+        status = U_NUMBER_SKELETON_SYNTAX_ERROR;
+        return;
+    }
+    macros.integerWidth = IntegerWidth::zeroFillTo(offset);
+    return;
+}
+
 bool blueprint_helpers::parseFracSigOption(const StringSegment& segment, MacroProps& macros,
                                           UErrorCode& status) {
    if (segment.charAt(0) != u'@') {
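
The grammar accepted by `parseScientificStem` above can be restated without `goto` as a standalone sketch. This version works on `std::string` and has no ICU dependency; `ExponentSign`, `ScientificStem`, and `parseConciseScientific` are hypothetical names used only for illustration, and a `false` return stands in for `U_NUMBER_SKELETON_SYNTAX_ERROR`.

```cpp
#include <cstddef>
#include <string>

// Illustrative stand-ins for UNumberSignDisplay and the Notation settings.
enum class ExponentSign { Auto, Always, ExceptZero };

struct ScientificStem {
    bool engineering;
    ExponentSign sign;
    int minExponentDigits;
};

// Parses a concise stem such as "E0", "EE00", or "E+?00".
// On success fills `out` and returns true; on a syntax error returns false
// and leaves `out` untouched.
bool parseConciseScientific(const std::string& s, ScientificStem& out) {
    std::size_t i = 0;
    if (i >= s.size() || s[i] != 'E') return false;
    ++i;

    // A second 'E' selects engineering notation.
    bool engineering = false;
    if (i < s.size() && s[i] == 'E') {
        engineering = true;
        ++i;
    }

    // Optional sign display: "+!" (always) or "+?" (except zero).
    ExponentSign sign = ExponentSign::Auto;
    if (i < s.size() && s[i] == '+') {
        ++i;
        if (i >= s.size()) return false;
        if (s[i] == '!') {
            sign = ExponentSign::Always;
        } else if (s[i] == '?') {
            sign = ExponentSign::ExceptZero;
        } else {
            return false;
        }
        ++i;
    }

    // One or more '0' characters must follow, and nothing else.
    if (i >= s.size()) return false;
    int minDigits = 0;
    for (; i < s.size(); ++i) {
        if (s[i] != '0') return false;
        ++minDigits;
    }

    out = ScientificStem{engineering, sign, minDigits};
    return true;
}
```

The early returns correspond to the `goto fail` branches in the ICU function, so tokens like `E`, `E+0`, and `EEE0` are rejected while `EE+!00` is accepted.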
--- a/icu4c/source/i18n/number_skeletons.h
+++ b/icu4c/source/i18n/number_skeletons.h
@ -46,6 +46,7 @@ enum ParseState {
    STATE_INCREMENT_PRECISION,
    STATE_MEASURE_UNIT,
    STATE_PER_MEASURE_UNIT,
+    STATE_IDENTIFIER_UNIT,
    STATE_CURRENCY_UNIT,
    STATE_INTEGER_WIDTH,
    STATE_NUMBERING_SYSTEM,
@ -71,6 +72,7 @@ enum StemEnum {
    STEM_BASE_UNIT,
    STEM_PERCENT,
    STEM_PERMILLE,
+    STEM_PERCENT_100, // concise-only
    STEM_PRECISION_INTEGER,
    STEM_PRECISION_UNLIMITED,
    STEM_PRECISION_CURRENCY_STANDARD,
@ -109,6 +111,7 @@ enum StemEnum {
    STEM_PRECISION_INCREMENT,
    STEM_MEASURE_UNIT,
    STEM_PER_MEASURE_UNIT,
+    STEM_UNIT,
    STEM_CURRENCY,
    STEM_INTEGER_WIDTH,
    STEM_NUMBERING_SYSTEM,
@ -226,6 +229,8 @@ void generateMeasureUnitOption(const MeasureUnit& measureUnit, UnicodeString& sb

 void parseMeasurePerUnitOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status);

+void parseIdentifierUnitOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
+
 void parseFractionStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status);

 void generateFractionStem(int32_t minFrac, int32_t maxFrac, UnicodeString& sb, UErrorCode& status);
@ -234,6 +239,14 @@ void parseDigitsStem(const StringSegment& segment, MacroProps& macros, UErrorCod

 void generateDigitsStem(int32_t minSig, int32_t maxSig, UnicodeString& sb, UErrorCode& status);

+void parseScientificStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
+
+// Note: no generateScientificStem since this syntax was added later in ICU 67
+
+void parseIntegerStem(const StringSegment& segment, MacroProps& macros, UErrorCode& status);
+
+// Note: no generateIntegerStem since this syntax was added later in ICU 67
+
 /** @return Whether we successfully found and parsed a frac-sig option. */
 bool parseFracSigOption(const StringSegment& segment, MacroProps& macros, UErrorCode& status);

--- a/icu4c/source/test/intltest/numbertest.h
+++ b/icu4c/source/test/intltest/numbertest.h
@ -114,16 +114,38 @@ class NumberFormatterApiTest : public IntlTestWithFieldPosition {
    DecimalFormatSymbols SWISS_SYMBOLS;
    DecimalFormatSymbols MYANMAR_SYMBOLS;

-    void assertFormatDescending(const char16_t* message, const char16_t* skeleton,
-                                const UnlocalizedNumberFormatter& f, Locale locale, ...);
+    /**
+     * skeleton is the full length skeleton, which must round-trip.
+     *
+     * conciseSkeleton should be the shortest available skeleton.
+     * The concise skeleton can be read but not printed.
+     */
+    void assertFormatDescending(
+      const char16_t* message,
+      const char16_t* skeleton,
+      const char16_t* conciseSkeleton,
+      const UnlocalizedNumberFormatter& f,
+      Locale locale,
+      ...);

-    void assertFormatDescendingBig(const char16_t* message, const char16_t* skeleton,
-                                   const UnlocalizedNumberFormatter& f, Locale locale, ...);
+    /** See notes above regarding skeleton vs conciseSkeleton */
+    void assertFormatDescendingBig(
+      const char16_t* message,
+      const char16_t* skeleton,
+      const char16_t* conciseSkeleton,
+      const UnlocalizedNumberFormatter& f,
+      Locale locale,
+      ...);

-    FormattedNumber
-    assertFormatSingle(const char16_t* message, const char16_t* skeleton,
-                       const UnlocalizedNumberFormatter& f, Locale locale, double input,
-                       const UnicodeString& expected);
+    /** See notes above regarding skeleton vs conciseSkeleton */
+    FormattedNumber assertFormatSingle(
+      const char16_t* message,
+      const char16_t* skeleton,
+      const char16_t* conciseSkeleton,
+      const UnlocalizedNumberFormatter& f,
+      Locale locale,
+      double input,
+      const UnicodeString& expected);

    void assertUndefinedSkeleton(const UnlocalizedNumberFormatter& f);

--- a/icu4c/source/test/intltest/numbertest_api.cpp
+++ b/icu4c/source/test/intltest/numbertest_api.cpp
--- a/icu4c/source/test/intltest/numbertest_skeletons.cpp
+++ b/icu4c/source/test/intltest/numbertest_skeletons.cpp
@ -70,6 +70,7 @@ void NumberSkeletonTest::validTokens() {
            u"measure-unit/length-meter",
            u"measure-unit/area-square-meter",
            u"measure-unit/energy-joule per-measure-unit/length-meter",
+            u"unit/square-meter-per-square-meter",
            u"currency/XXX",
            u"currency/ZZZ",
            u"currency/usd",
@ -105,7 +106,20 @@ void NumberSkeletonTest::validTokens() {
            u"numbering-system/latn",
            u"precision-integer/@##",
            u"precision-integer rounding-mode-ceiling",
-            u"precision-currency-cash rounding-mode-ceiling"};
+            u"precision-currency-cash rounding-mode-ceiling",
+            u"0",
+            u"00",
+            u"000",
+            u"E0",
+            u"E00",
+            u"E000",
+            u"EE0",
+            u"EE00",
+            u"EE+?0",
+            u"EE+?00",
+            u"EE+!0",
+            u"EE+!00",
+    };

    for (auto& cas : cases) {
        UnicodeString skeletonString(cas);
@ -151,7 +165,20 @@ void NumberSkeletonTest::invalidTokens() {
            u"integer-width/+0#",
            u"integer-width/+#",
            u"integer-width/+#0",
-            u"scientific/foo"};
+            u"scientific/foo",
+            u"E",
+            u"E1",
+            u"E+",
+            u"E+?",
+            u"E+!",
+            u"E+0",
+            u"EE",
+            u"EE+",
+            u"EEE",
+            u"EEE0",
+            u"001",
+            u"00+",
+    };

    expectedErrorSkeleton(cases, UPRV_LENGTHOF(cases));
 }
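
The concise integer-width cases in the test lists above (`0`, `00`, `000` valid; `001`, `00+` invalid) reduce to a single rule: the token must consist only of `0` characters, and its length is the minimum integer digit count. A standalone sketch of that rule, assuming `std::string` input; `parseConciseIntegerWidth` is a hypothetical name, and the `-1` return stands in for `U_NUMBER_SKELETON_SYNTAX_ERROR`.

```cpp
#include <string>

// Returns the minimum integer digits encoded by a concise integer-width
// stem ("0", "00", "000", ...), or -1 if the token is empty or contains
// any character other than '0'.
int parseConciseIntegerWidth(const std::string& token) {
    if (token.empty()) {
        return -1;
    }
    for (char c : token) {
        if (c != '0') {
            return -1;
        }
    }
    return static_cast<int>(token.size());
}
```

A successful result corresponds to the argument that `parseIntegerStem` passes to `IntegerWidth::zeroFillTo`.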