Commit Graph

22479 Commits

Author SHA1 Message Date
Andy Heninger
54a60fe6f4 ICU-11548 Improve regex static UnicodeSets handling
Compiled regular expression patterns make use of several shared common
UnicodeSets. This change simplifies the creation and use of these
static UnicodeSets.

- Pointer fields to the static sets are removed from the compiled patterns,
  and the static variables are accessed directly. The deleted pointers
  were a hold-over from earlier code that did not use shared statics.

- The UnicodeSet pattern literals are changed from hex constants to
  u"string literals".

- The size of fRuleSets (from regexst.h) is changed from a hard-coded 10
  to the number of UnicodeSets actually required. Doing this required
  a change to regexcst.pl to export the required size. Changing and
  rerunning this perl code resulted in massive but benign changes to
  the generated file regexcst.h, the result of perl having changed its
  order of enumeration of hashes since the file was last regenerated.

- UnicodeSets are frozen when possible. This should result in faster matching.
2020-01-30 15:13:07 -08:00
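As an illustration of the switch from hex constants to u"string literals" described in 54a60fe6f4, the following is a minimal sketch. The pattern `[a-z]` and the names `kPatternHex`, `kPatternLiteral`, and `samePattern` are hypothetical and are not taken from the ICU sources; the sketch only shows that both spellings produce the same UTF-16 code units.

```cpp
#include <string>

// Hypothetical example pattern; not one of ICU's actual static set patterns.
// Old style: "[a-z]" spelled out as UTF-16 code unit hex constants.
static const char16_t kPatternHex[] = {0x5B, 0x61, 0x2D, 0x7A, 0x5D, 0};

// New style: the same pattern written as a u"string literal".
static const char16_t kPatternLiteral[] = u"[a-z]";

// Returns true when both spellings yield identical UTF-16 code units.
bool samePattern() {
    return std::u16string(kPatternHex) == std::u16string(kPatternLiteral);
}
```

The literal form is easier to read and review, while the stored data is unchanged.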
Egor Pugin
76f190024d ICU-20938 Add --skip-dll-export option to genccode to prevent exporting statically linked ICU data from executables. 2020-01-23 12:00:29 -08:00
Keita Suzuki
a4a5c603ac ICU-20767 Potential negative index access in one of the sample codes 2020-01-22 13:13:27 -08:00
Shane Carr
8c717b514e ICU-20665 Removing number-dependence from ICU4C FormattedStringBuilder fields.
See #727
2020-01-17 11:22:02 +01:00
Frank Yung-Fong Tang
21df05234d ICU-20673 Allow built-in translit ID w/o data.
See #958
2020-01-16 21:28:01 -08:00
Shane Carr
0ad2f9590b ICU-20418 Fix indentation of CHECK_NULL in number_skeletons.cpp 2020-01-14 11:52:27 +01:00
Shane Carr
fe98d870b2 ICU-20418 Adding concise number skeletons in ICU4C 2020-01-14 11:52:27 +01:00
Shane Carr
df8841aa6f ICU-20418 Adding *internal* parse method for core unit identifiers.
Also see ICU-20286
2020-01-14 11:52:27 +01:00
Shane Carr
b24538eb05 ICU-20921 Adding find and compare to StringPiece 2020-01-14 11:52:27 +01:00
Joshua Root
a3078fb8c8 ICU-20875 Include <cstddef> for max_align_t
The definition of max_align_t is not guaranteed to be available unless
the appropriate header is included. Since use of <stddef.h> from C++ is
deprecated, that's <cstddef>, and max_align_t is thus defined under the
std namespace rather than in the global namespace.
2020-01-09 15:42:52 -08:00
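A minimal sketch of the point made in a3078fb8c8: with `<cstddef>` included, the type is referred to as `std::max_align_t`. The names `kMaxAlign` and `AlignedBuffer` are hypothetical and serve only as an illustration.

```cpp
#include <cstddef>  // declares std::max_align_t and std::size_t

// Strictest alignment required by any scalar type, taken from namespace std.
constexpr std::size_t kMaxAlign = alignof(std::max_align_t);

// Hypothetical buffer aligned strictly enough to hold any scalar object.
struct alignas(std::max_align_t) AlignedBuffer {
    unsigned char bytes[64];
};
```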
Caio Lima
09d409f5f4 ICU-20442 Adding support for hour-cycle on DateTimePatternGenerator
DateTimePatternGenerator needs to consider the hour-cycle preferred by
the Locale. This means that we need to override the hour-cycle when a
locale contains the "hc" keyword. This patch adds that functionality.
In addition, "DateTimePatternGenerator::adjustFieldTypes" should adjust
the hour field to properly follow the tr35 spec
(https://www.unicode.org/reports/tr35/tr35-dates.html#dfst-hour).
2020-01-09 16:45:56 +01:00
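The hour-cycle override described in 09d409f5f4 can be sketched roughly as follows. This is not the DateTimePatternGenerator implementation; `hourCharForCycle` and `overrideHourField` are hypothetical helpers. The mapping of "hc" values to pattern characters (h11 to K, h12 to h, h23 to H, h24 to k) follows the tr35 hour field table. The sketch deliberately ignores related fields such as the day period.

```cpp
#include <string>

// Maps an "hc" keyword value to its tr35 hour pattern character.
// Returns '\0' for values this sketch does not recognize.
char hourCharForCycle(const std::string& hc) {
    if (hc == "h11") return 'K';
    if (hc == "h12") return 'h';
    if (hc == "h23") return 'H';
    if (hc == "h24") return 'k';
    return '\0';
}

// Rewrites every unquoted hour field character in `pattern` to the one
// implied by `hc`. Text inside single quotes is treated as a literal.
std::string overrideHourField(const std::string& pattern, const std::string& hc) {
    const char target = hourCharForCycle(hc);
    if (target == '\0') return pattern;  // unknown cycle: leave pattern as-is
    std::string out = pattern;
    bool inQuote = false;
    for (char& c : out) {
        if (c == '\'') { inQuote = !inQuote; continue; }
        if (!inQuote && (c == 'h' || c == 'H' || c == 'k' || c == 'K')) {
            c = target;
        }
    }
    return out;
}
```

For example, `overrideHourField("HH:mm", "h12")` yields `hh:mm`. A real implementation would also need to add or remove the day period field.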
Smaarn
996da8faac ICU-20871 Fixed: no rule was defined to create the $(OUTDIR) directory if it didn't exist.
This would cause failures during cross compilation cases such as:

make[6]: Leaving directory '/spksrc/spk/bazarr/work-qoriq-6.1/icu/source/data'
make[5]: *** No rule to make target 'out', needed by 'out/icudt64b.dat'.  Stop.
2020-01-08 15:42:35 +01:00
Hugh McMaster
5aae52d3ef ICU-20924 Use pkg-config to generate the path to pkgdata.inc 2020-01-07 14:19:02 -08:00
Frank Tang
11ad8d69fb ICU-20934 Fix TZ test error
Somehow these tests now fail on trunk.
Per https://mm.icann.org/pipermail/tz-announce/2019-July/000056.html
     Brazil has canceled DST and will stay on standard time indefinitely.
2020-01-03 20:52:11 -08:00
Frank Tang
4a8483be91 ICU-20900 Fix createCanonical
See #922
2020-01-03 15:00:04 -08:00
Markus Scherer
60b567d6ab ICU-20917 LocaleMatcher: prefer a more-default locale 2020-01-02 18:00:52 -08:00
Frank Tang
79fac50101 ICU-20310 omit "-true" in toLanguageTag
See #952
2019-12-30 15:39:59 -08:00
Markus Scherer
cb1d4f5903 ICU-20916 UBSan & ErrorProne fixes 2019-12-20 14:56:31 -08:00
Markus Scherer
ad638c274e ICU-20916 LocaleMatcher distinguish between equivalent locales
- equivalent but originally unequal
- locale distance shifted left for additional fraction bits with micro distance
- Java more verbose matcher debug output
See #949
2019-12-20 09:36:57 -08:00
Shane Carr
46ec4fd523 ICU-12863 Add list style APIs to C and C++
See #894
2019-12-17 13:07:36 -08:00
Andy Heninger
faa2f9f9e1 ICU-20303 Break Iterator, improve handling of look-ahead rules.
- Merge the look-ahead results slots used when multiple rules share a common accepting state.
- Sequentially number the look-ahead result slot. Will eventually allow replacing the runtime map with an array.
- Inhibit chaining out of look-ahead rules. This could never actually happen; when a hard break
  rule matches, the engine is stopped immediately, but the state table was being constructed
  as if it could happen. Reduces table size for line break rules.
- Remove incorrect handling of fAccepting and fLookAhead fields of a state table row
  when removing duplicate states. Look-ahead slot number was being mis-interpreted as a state number.
2019-12-13 13:17:21 -08:00
Shane Carr
7917df1e80 ICU-20883 Move UFormattedDateInterval to end of argument list. 2019-12-12 13:48:28 -08:00
Frank Tang
923ec1ad30 ICU-20436 Add getDefaultHourCycle to DateTimePatternGenerator
See #901
2019-12-12 00:13:37 -08:00
Rosen Penev
8fda72f6d8 ICU-20877 i18n: Don't use C++11 math
It's not available with some libc implementations. Specifically,
BIONIC and uClibc-ng. uprv_ variants are available.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2019-12-11 20:55:23 -08:00
Joshua Root
c6fd07cdec ICU-20904 Don't use char16_t with C++98/03
When C code includes the ICU headers, the UChar type is defined to be
uint16_t. But when C++ code includes the headers, UChar is char16_t
even when U_SHOW_CPLUSPLUS_API has been set to 0. Apart from arguably
being an inconsistency in the API, this means that C++98 or C++03 code
can't use the C API even though C99 code can.

So, change unicode/umachine.h to check not just whether __cplusplus is
defined but the value of U_CPLUSPLUS_VERSION when deciding how to
typedef UChar.
2019-12-11 18:41:27 -08:00
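The idea in c6fd07cdec, choosing the typedef by language version rather than by the mere presence of `__cplusplus`, can be sketched as below. `SKETCH_CPLUSPLUS_VERSION` and `SketchUChar` are hypothetical stand-ins; how ICU actually computes `U_CPLUSPLUS_VERSION` in unicode/umachine.h is not shown in the commit message and is an assumption here.

```cpp
#include <stdint.h>  // uint16_t; this header also works for C and C++98/03

// Hypothetical stand-in for a version macro such as U_CPLUSPLUS_VERSION.
#if defined(__cplusplus) && __cplusplus >= 201103L
#   define SKETCH_CPLUSPLUS_VERSION 11
#elif defined(__cplusplus)
#   define SKETCH_CPLUSPLUS_VERSION 1
#else
#   define SKETCH_CPLUSPLUS_VERSION 0
#endif

// Use char16_t only when the compiler is known to provide it (C++11 or later);
// otherwise fall back to uint16_t, as C code already does.
#if SKETCH_CPLUSPLUS_VERSION >= 11
typedef char16_t SketchUChar;
#else
typedef uint16_t SketchUChar;
#endif
```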
Shane F. Carr
39eb0f4fbf ICU-20919 Merge maint/maint-66 (release-66-preview) to master 2019-12-11 15:25:36 -08:00
Caio Lima
7c147e4e85 ICU-20741 Changing SimpleDateTimeFormat::subFormat to only include 1 field at the same position when there is a data fallback 2019-12-10 21:53:47 -08:00
Steven R. Loomis
ffbc8cf85f ICU-20857 update API Change Report for ICU 66preview
- uses tools updated in [ICU-20910]
2019-12-03 17:11:37 -08:00
Peter Edberg
e2afc5486d ICU-20857 BRS66 update urename.h 2019-12-03 08:53:23 -08:00
Jeff Genovy
f3e2f4f02e ICU-20857 Update Readme for ICU 66 Preview. 2019-12-02 15:13:15 -08:00
Jeff Genovy
afaff40164 ICU-20907 Disable optimization on Windows when building for ARM64 with Visual Studio versions below 16.4. 2019-11-27 15:35:58 -08:00
Andy Heninger
197e0239ab ICU-20893 Line break tailorings updated to Unicode 13. 2019-11-26 15:25:06 -08:00
Shane Carr
017c8b762e ICU-20890 Change locale_dependencies.py into LOCALE_DEPS.json files
- Refactors Python to make I/O operations more abstract
- Adds stable sample data for Python test
2019-11-22 20:23:30 -08:00
Peter Edberg
04c8616f93 ICU-20857 integrate CLDR release-36-1-preview to maint-66 2019-11-22 19:01:36 -08:00
Caio Lima
873e2db780 ICU-20741 Adding tests for C/C++ API into DateFormatTests 2019-11-22 15:43:27 -08:00
Markus Scherer
a7e378d587 ICU-20893 Unicode 13 beta
See PR #915, see changes.txt
- Unicode 13 beta data as of 2019-nov-21
- uprops.icu format version 7.7 with more bits for Script/Script_Extensions
- more bits in spoof checker ScriptSet
- root line break rules adjusted for UAX 14 changes, from Andy
- line break tailorings not yet in sync with root
2019-11-21 17:35:53 -08:00
Peter Edberg
ceb84b5dde ICU-20844 remove restriction on minInt=minFrac=0, ensure doFastFormatInt32
and NumberFormatterImpl::writeNumber produce at least 1 result digit (#917)
2019-11-13 16:15:02 -08:00
Frank Tang
afbd1b91d9 ICU-20705 Add udtitvfmt_formatCalendarToResult
See #896
2019-11-12 09:34:52 -08:00
Mihai Nita
17d23d71c0 ICU-20739 Force seconds if the skeleton has fractional seconds 2019-11-08 16:03:40 -08:00
Shane Carr
cfb298f035 ICU-20709 Use SIGNUM_COUNT for number of entries in Signum enum. 2019-11-05 14:43:34 -08:00
Shane Carr
00946cef43 ICU-20709 Moving rounder call before number properties.
- Changes EXCEPT_ZERO notation to hide sign on numbers that round to zero.
- Adds additional tests for this behavior.
2019-11-05 14:43:34 -08:00
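The effect of rounding before inspecting the sign, as described in 00946cef43, can be sketched as follows. This is not the ICU number formatter; `formatSignExceptZero` is a hypothetical helper fixed at two fraction digits. It only shows why the sign must be determined after rounding if numbers that round to zero are to hide their sign.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Formats `value` with two fraction digits, showing an explicit sign on
// every number except those that are zero after rounding.
std::string formatSignExceptZero(double value) {
    // Round first, so that the sign decision sees the rounded quantity.
    const double rounded = std::round(value * 100.0) / 100.0;
    char digits[64];
    std::snprintf(digits, sizeof(digits), "%.2f", std::fabs(rounded));
    if (rounded > 0) return std::string("+") + digits;
    if (rounded < 0) return std::string("-") + digits;
    return digits;  // zero, including negative zero: no sign
}
```

With this ordering, -0.004 is formatted as `0.00` rather than `-0.00`.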
Shane Carr
e7b540d1af ICU-20709 Refactoring number formatter to apply pattern after compact notation. 2019-11-05 14:43:34 -08:00
Shane Carr
369e67221c ICU-20709 Adding fourth signum type. Converting Java to use enum. 2019-11-05 14:43:34 -08:00
Frank Tang
fab4c3c719 ICU-20884 initialized buffer uloc_getKeywordValue 2019-11-05 13:51:35 -08:00
Frank Yung-Fong Tang
3735b6b8c0 ICU-20872 remove extra ; after function {}
See #888
2019-11-05 11:43:02 -08:00
Andy Heninger
1206f07a52 ICU-20863 Regex Named Capture map, add a missing nullptr check. 2019-10-28 21:10:41 -07:00
Andy Heninger
e94657e614 ICU-20863 Regex Named Capture map, add a missing nullptr check. 2019-10-28 16:53:18 -07:00
Frank Tang
84f6735fde ICU-20478 Sort variant in (for|to)LanguageTag of icu::Locale and ULocale
See #836
2019-10-28 14:57:10 -07:00
Frank Yung-Fong Tang
176674f9f1 ICU-20872 remove extra ; after function {}
See #888
2019-10-23 11:30:29 -07:00
Andy Heninger
03937347fb ICU-20863 Regex, lazy creation and reduced size of map from capture group names to numbers. 2019-10-22 17:23:26 -07:00
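The lazy creation in 03937347fb, together with the nullptr check added in the later ICU-20863 commits, can be sketched like this. `SketchPattern` and its members are hypothetical and use `std::map` for brevity; the actual ICU regex code uses its own data structures. Returning 0 for an unknown name is a choice made for this sketch only.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical compiled pattern holding a name-to-group-number map that is
// only allocated once a named capture group is actually added.
class SketchPattern {
public:
    void addNamedGroup(const std::string& name, int groupNumber) {
        if (!fNamedCaptureMap) {
            // Lazy creation: patterns without named groups never pay for it.
            fNamedCaptureMap.reset(new std::map<std::string, int>());
        }
        (*fNamedCaptureMap)[name] = groupNumber;
    }

    // Returns the group number for `name`, or 0 if there is no such name.
    int groupNumberFromName(const std::string& name) const {
        if (!fNamedCaptureMap) return 0;  // the map may never have been created
        auto it = fNamedCaptureMap->find(name);
        return it == fNamedCaptureMap->end() ? 0 : it->second;
    }

    bool hasMap() const { return fNamedCaptureMap != nullptr; }

private:
    std::unique_ptr<std::map<std::string, int>> fNamedCaptureMap;
};
```

Once the map is created on demand, every reader has to tolerate its absence, which is the kind of check the follow-up commits add.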