Commit Graph

31152 Commits

Author SHA1 Message Date
Frank Tang
8ca80c4b6d ICU-21158 Fix doc of UDISPCTX_NO_SUBSTITUTE
See #1200
2020-07-31 18:39:46 -07:00
Frank Tang
7ddc231195 ICU-20734 Improve fuzzer_driver
See #1204
2020-07-31 15:30:03 -07:00
Frank Tang
41d1d57af0 ICU-21122 Fix flaky TestAdoptCalendarLeak 2020-07-29 20:39:55 -07:00
Frank Tang
d7ec310436 ICU-20684 Fix uninitialized in isMatchAtCPBoundary
Downstream bug https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15505
Fix Fuzzer-detected Use-of-uninitialized-value in isMatchAtCPBoundary

To test to show the bug in the new test case, configure and build with
CFLAGS="-fsanitize=memory" CXXFLAGS="-fsanitize=memory" ./runConfigureICU \
  --enable-debug --disable-release  Linux  --disable-layoutex

Test with
cintltst /tsutil/custrtst
2020-07-29 14:21:53 -07:00
Andy Heninger
895aff3bff ICU-21178 Add check for corrupt rbbitst.txt data.
In the test data from rbbitst.txt, two or more adjacent boundary markers with
no intervening test data were accepted, with no indication of a problem.

This situation occurred, as described in bug ICU-21178, with a bad import of
some test cases from CLDR. PR #1194 corrected the problem with the test data
in ICU4C. This PR adds code to flag this situation in the test data, and
also propagates the data fix to ICU4J's copy of rbbitst.txt.
2020-07-24 15:16:12 -07:00
Frank Tang
0d4b1c1cb9 ICU-21160 Fix the length return by preflight
See #1178
2020-07-21 18:05:20 -07:00
Andy Heninger
003b431540 ICU-13590 RBBI, improve handling of concurrent look-ahead rules.
Change the mapping from rule number to boundary position to use a simple array
instead of a linear search lookup map.

Look-ahead rules have a preceding context, a boundary position, and following context.
In the implementation, when the preceding context matches, the potential boundary
position is saved. Then, if the following context proves to match, the saved boundary is
returned as an actual boundary.

Look-ahead rules are numbered, and the implementation maintains a map from
rule number to the tentative saved boundary position.

In an earlier improvement to the rule builder, the rule numbering was changed to be a
contiguous sequence, from the original sparse numbering. In anticipation of
changing the mapping from number to position to use a simple array.
2020-07-21 14:39:15 -07:00
Ramon
2de2585f1b ICU-13339 Do not parse decimal point for integers 2020-07-20 23:52:59 -05:00
Hugo van der Merwe
e734111ee5 ICU-21192 MeasureUnit Identifier spec compliance: s/p/pow/
Specification:
https://www.unicode.org/reports/tr35/tr35-general.html#Unit_Identifiers
2020-07-16 01:58:32 +02:00
David Beaumont
dfc8b8b746 ICU-20697 Delete now unused files and documentation for the old ICU LDML tooling. 2020-07-14 20:27:28 +02:00
Michael Block
f917c43cf1 ICU-21178 Adding the trailing space back into two RBBI test cases. 2020-07-07 16:05:05 -07:00
Makoto Kato
c9037ca8d3 ICU-11992 uprv_tzname doesn't return valid time zone on Android 2020-07-06 10:11:20 -07:00
John Wilcock
6fe86f3934 ICU-21173 Add support for more currency variants. ICU4C equivalent of…
See #1184
2020-07-03 04:51:15 +02:00
Hugo van der Merwe
3fca290880 ICU-21174 Add a const version of MaybeStackVector::getAlias().
(Also makes a tiny tweak to appendAll() documentation.)
2020-07-02 01:56:08 +02:00
Markus Scherer
4d428cb8f3 ICU-21176 spoof checker: remove whitelist/blacklist metaphors from API docs 2020-07-01 15:21:05 -07:00
John Wilcock
9219c6ae03 ICU-13733 Added test for mismatching currency format for strict-mode parsing
See #1169
2020-06-30 02:22:57 +02:00
Diego Barrios Romero
de0306daaa ICU-21170 Fix function prototypes 2020-06-25 15:31:11 -07:00
Jeff Genovy
8727c56501 ICU-21177 Update README.md badges to point to new Azure Pipelines URL. 2020-06-25 13:22:10 -07:00
Jeff Genovy
6445e68dcc ICU-21177 Skip running the Exhaustive tests on changes to docs and other files. 2020-06-25 10:59:36 -07:00
Łukasz Wojniłowicz
ed56301abd ICU-20545 Ensure that path ends with detected file separator
CharString, when asked, appends U_FILE_SEP_CHAR at the end of the string
it holds, if it won't find U_FILE_SEP_CHAR or U_FILE_ALT_SEP_CHAR there.
The problem starts if the dir variable uses
U_FILE_ALT_SEP_CHAR which is not equal to U_FILE_SEP_CHAR. Then the
resulting path could look like this
../data\
instead of this
../data/

This patch uses U_FILE_SEP_CHAR unless it detects that the dir variable
doesn't use it, and uses U_FILE_ALT_SEP_CHAR instead.
2020-06-24 11:38:41 -07:00
Markus Scherer
ef12882fdb ICU-21144 LocaleMatcher setMaxDistance(), isMatch() 2020-06-23 13:56:49 -07:00
David Beaumont
17f889bd0e ICU-21142 Hopefully fixing the install script (bash only) 2020-06-22 19:46:32 +02:00
Andy Heninger
99dc49a0c0 ICU-20869 Fix compiler warning in FixedDecimal::getFractionalDigits().
Fix a clang compiler warning and a potential undefined behavior arising
from casting an out-of-range double to an int. See the Jira ticket for a
more detailed description of the problem.

This PR is to fix the immediate problem. Longer term, the function
may be replaced entirely - see issue ICU-21147.
2020-06-19 11:31:42 -07:00
David Beaumont
3b17a492be ICU-21142 Improving Maven handling of CLDR API Jar file for new tools. 2020-06-19 14:03:36 +02:00
Łukasz Wojniłowicz
cd5b025ef8 ICU-20545 Detect file separator char from dir
If udata_create won't find U_FILE_SEP_CHAR at the end of a dir variable,
then it appends it. The problem starts if the dir variable uses
U_FILE_ALT_SEP_CHAR which is not equal to U_FILE_SEP_CHAR. Then the
resulting path could look like this
../data\mappings/cns-11643-1992.ucm
instead of this
../data/mappings/cns-11643-1992.ucm

This patch uses U_FILE_SEP_CHAR unless it detects that the dir variable
doesn't use it, and uses U_FILE_ALT_SEP_CHAR instead.
2020-06-18 10:54:25 -07:00
Hugo van der Merwe
55127d6778 ICU-21165 Add LdmlConverter UNITS output, update SUPPLEMENTAL_DATA.
- Produce new supplementalData.txt and units.txt with:

      ant -f build-icu-data.xml -DoutDir=/tmp/new_dir \
          -DcldrVersion=37 -DoutputTypes=UNITS,SUPPLEMENTAL_DATA
2020-06-18 09:57:34 +02:00
Hugo van der Merwe
6a1df9e16c ICU-21169 Add SingleUnitImpl::getSimpleUnitID().
Also:
- Use BytesTrie not UCharsTrie.
- Add a nullptr check for a uprv_malloc.
2020-06-18 09:27:03 +02:00
Frank Tang
982c4799bf ICU-21161 Mark uloc_getDisplayScriptInContext static
Remove from urename.h
2020-06-17 23:26:33 -07:00
Hugo van der Merwe
8e2651828a ICU-21137 Adjust VSCode IDE settings and README.
ICU Pull Request #1159.

- Set intltest's Current Working Directory correctly to enable finding
  resources.
- Adds c_cpp_properties.json, primarily for the includePath settings.
- Load average takes a while to respond, specify -j24 to always limit
  parallel jobs to a maximum of 24.
- make's "-l" parameter is system load average, not CPU percentage.
  A load average of 90 makes my laptop unusable, changing to -l20.
- Make running all tests the unit-testing default.
- Document the adjustments that can be made in the README.
- Skip these json files when checking for copyright notices. Pure json
  does not permit comments, so c_cpp_properties.json cannot have
  comments.
- defines += U_DISABLE_RENAMING=1 to simplify reference following.

Rebased from 00a5d6dd5c.
2020-06-18 03:15:42 +02:00
Andy Heninger
1eef362329 ICU-13565 Break Iteration, remove the dictionary bit from the implementation.
For identifying text that needs to be handled by a word dictionary for Break Iteration,
change from using a bit in the character category to sorting all dictionary categories
together, and recording the boundary between the non-dictionary and dictionary ranges.

This is internal to the implementaion. It does not affect behavior.
It does increase the number of character categories that can be handled using a
compact 8 bit Trie, from 127 to 255.
2020-06-17 12:00:14 -07:00
Hugo van der Merwe
85aee40cc3 ICU-21078 Improve instructions and gitignore files for cldr-to-icu.
This also adds .idea/ to the top-level .gitignore, next to .vs/ and
.vscode/.
2020-06-16 20:09:46 +02:00
David Beaumont
03bb079d3f ICU-21149 Adding a helper to allow simpler debugging of mappers. 2020-06-16 00:53:48 +02:00
Frank Tang
e7bd5b1cef ICU-21109 minimum grouping digits in DecimalFormat
See #1152
2020-06-11 14:32:52 -07:00
Fredrik Roubert
0735ea8c6f ICU-21143 Applying non-zero offset to null pointer is undefined behaviour.
The result of pointer end + 1 will not be used if end is nullptr so it
doesn't really matter that the result of this operation is undefined,
but it's therefore also unnecessary to perform the operation at all.

Changing this removes this unnecessary operation and by doing so gives
the undefined behaviour sanitizer one thing less to worry about.
2020-06-04 15:13:36 +02:00
David Beaumont
56bb01ba84 ICU-21142 Moving CLDR jar file to common location 2020-06-04 12:26:57 +02:00
David Beaumont
a29369b586 ICU-21140 Make UTF-8 explicit for all file access. 2020-06-03 11:09:02 -07:00
David Beaumont
87bbd3d067 ICU-21135 Fix pseudo locales to filter only non-root paths and avoid aliases.
See #1157
2020-06-03 17:32:23 +02:00
daniel-ju
bb7b8481bd ICU-21140 Fix cldr-to-icu tooling to work on Windows 2020-06-02 22:29:31 +02:00
Andy Heninger
f0ad454691 ICU-13565 RBBI, make all state table row data be unsigned. 2020-06-01 20:05:17 -07:00
Jeff Genovy
723037953b ICU-21119 Enable verbose output from ICU data build when building DEBUG on Windows 2020-05-29 16:02:56 -07:00
Shane F. Carr
3ff6627ce6 ICU-21134 Copy additional data when toNumberFormatter is used
See #1156
2020-05-28 22:33:58 -05:00
Frank Tang
ec7e29f2b6 ICU-13786 Fix addLikelySubtags/minimizeSubtags
See #1140
2020-05-27 18:36:36 -07:00
Frank Tang
c5ebb80a73 ICU-13565 Reduce size of BreakIterator brk files
See #1100
2020-05-27 14:26:10 -07:00
David Beaumont
566e0f8686 ICU-21084 Migrating ICU tools to use PathMatcher 2020-05-26 23:38:23 +02:00
Steven R. Loomis
4231ca5be0 ICU-21098 fix ticket URLs for logKnownIssue tickets.
- Still allows "1234" or "cldrbug:1234" format ticket IDs
- However, docs recommend "ICU-1234" or "CLDR-1234" format
in the future.
- Other ticket IDs could be used, but won't be linkified.
2020-05-20 15:58:51 -07:00
Markus Scherer
eaee0b175e ICU-21029 LocaleMatcher: add option to turn off default locale 2020-05-20 15:16:28 -07:00
Stephan Szabo
b6eb747550 ICU-10879 Split out OBJECTS from Makefiles into separate files 2020-05-20 11:37:05 -07:00
Jeff Genovy
9b3746e0b6 ICU-21121 Enable tests for MacOSX build bot. 2020-05-14 11:37:46 -07:00
Frank Tang
943b090160 ICU-21090 Fix private class 2020-05-12 23:27:04 -07:00
Shane F. Carr
715d254a02 ICU-21081 Make U_ASSERT C++14 compatible 2020-05-08 19:03:43 -07:00