Two issues here:
- fix 2 build issue in i18n when compiling with clang++ -fsanitize=undefined
the following two symbols were not exported (and they should be):
typeinfo for icu::CollationCacheEntry
typeinfo for icu::numparse::impl::CodePointMatcher
- remove undefined behavior warning in NumberFormatTestTuple.. minor, but very annoying
when repeated many times during every test run. Tends to mask real errors.
> numberformattesttuple.cpp:319:5: runtime error: member access within null pointer of type 'NumberFormatTestTuple'
If you call the API getDefaultHourCycle on an empty DateTimePatternGenerator
instance (ie: no locale) then it calls UPRV_UNREACHABLE which calls abort().
We should return an error code instead of aborting.
Compiled regular expression patterns make use of several shared common
UnicodeSets. This change simplifies the creation and use of these
static UnicodeSets.
- Pointer fields to the static sets are removed from the compiled patterns,
and the static variables are accessed directly. The deleted pointers
were a hold-over from earlier code that did not use shared statics.
- The UnicodeSet pattern literals are changed from hex constants to
u"string literals".
- The size of fRuleSets (from regexst.h) is changed from a hard-coded 10
to the number of UnicodeSets actually required. Doing this required
a change to regexcst.pl to export the required size. Changing and
rerunning this perl code resulted in massive but benign changes to
the generated file regexcst.h, the result of perl having changed its
order of enumeration of hashes since the file was last regenerated.
- UnicodeSets are frozen when possible. Should result in faster matching.
DateTimePatternGenerator needs to consider the hour-cycle preferred by
Locale. This means that we need to to override the hour-cycle when a
locale contains "hc" keyword. This patch is adding such functionality.
In addition, "DateTimePatternGenerator::adjustFieldTypes" should adjust
hour field to properly follow tr35
spec(https://www.unicode.org/reports/tr35/tr35-dates.html#dfst-hour).
- Merge the look-ahead results slots used when multiple rules share a common accepting state.
- Sequentially number the look-ahead result slot. Will eventually allow replacing the runtime map with an array.
- Inhibit chaining out of look-ahead rules. This could never actually happen; when a hard break
rule matches, the engine is stopped immediately, but the state table was being constructed
as if it could happen. Reduces table size for line break rules.
- Remove incorrect handling of fAccepting and fLookAhead fields of a state table row
when removing duplicate states. Look-ahead slot number was being mis-interpreted as a state number.
See PR #915, see changes.txt
- Unicode 13 beta data as of 2019-nov-21
- uprops.icu format version 7.7 with more bits for Script/Script_Extensions
- more bits in spoof checker ScriptSet
- root line break rules adjusted for UAX 14 changes, from Andy
- line break tailorings not yet in sync with root