91 Commits

Author SHA1 Message Date
Andy Heninger
f6fbd54e92 ICU-13549 CjkBreakEngine::divideUpDictionaryRange, problems with supplemental character handling.
X-SVN-Rev: 40949
2018-02-18 22:44:18 +00:00
Andy Heninger
4959b9b3a3 ICU-13569 rbbi table compression, work in progress.
X-SVN-Rev: 40873
2018-02-08 21:17:18 +00:00
Andy Heninger
4e1c4096a6 ICU-9954 Break Iteration, remove reverse rules, add boundary caching.
X-SVN-Rev: 40433
2017-09-19 18:17:22 +00:00
Andy Heninger
bc77976528 ICU-13318 RBBITest, remove obsolete tests, move remaining test data to rbbitst.txt
X-SVN-Rev: 40356
2017-08-26 00:44:28 +00:00
Andy Heninger
ce90dfb861 ICU-13274 RBBI test updates, moved from #9954
X-SVN-Rev: 40305
2017-08-01 23:26:14 +00:00
Andy Heninger
51e21b5242 ICU-13058 Add RBBI test of Unicode emoji-test.txt file, and partial update of break rules recent emoji changes.
X-SVN-Rev: 39909
2017-03-23 00:20:20 +00:00
Peter Edberg
04c115425d ICU-13010 Add Extend* to rule GB11′, update tests and add more emoji cluster tests
X-SVN-Rev: 39726
2017-03-02 21:04:09 +00:00
Steven R. Loomis
dea458fef7 ICU-12515 C filtered break
* sync rbbitst.txt with J
* fix an issue where isBoundary() didn't check the trie's presence

X-SVN-Rev: 39211
2016-09-13 19:58:55 +00:00
Steven R. Loomis
0c5b2b597d ICU-12455 BRS - BOM fix
X-SVN-Rev: 38915
2016-07-01 16:59:16 +00:00
Michael Ow
61607c2773 ICU-12564 Update copyright notice in trunk
X-SVN-Rev: 38848
2016-06-15 18:58:17 +00:00
Yoshito Umaoka
00ca13e126 ICU-12564 Reverted r38761 and r38762, because we want to prepend the Unicode copyright for existing source files, instead of replacing copyright comments.
X-SVN-Rev: 38776
2016-05-31 21:45:07 +00:00
Michael Ow
c9f199a30f ICU-12564 Update copyright notice in ICU4C
X-SVN-Rev: 38761
2016-05-26 22:32:17 +00:00
Andy Heninger
25a04f741a ICU-10698 Test word break of 'What is Unicode' in Japanese, resolve C vs. J differences.
X-SVN-Rev: 38699
2016-05-04 23:55:22 +00:00
Andy Heninger
2e088aff9c ICU-11723 Test dictionary breaking of 'アレルギー性結膜炎'
X-SVN-Rev: 38692
2016-05-03 22:44:32 +00:00
Andy Heninger
66537179d7 ICU-11996 CJKBreakEngine divideUpDictionaryRange, pick up test case from ICU4J.
X-SVN-Rev: 38678
2016-04-29 23:51:24 +00:00
Andy Heninger
0338b5470a ICU-11999 BreakIterator, UnhandledBreakEngine consuming too many characters. Updated test file from ICU4J.
X-SVN-Rev: 38670
2016-04-29 21:32:46 +00:00
Andy Heninger
7265eeae4c ICU-11556 rbbitst.txt test data file, add explicit locale.
X-SVN-Rev: 38644
2016-04-25 18:10:08 +00:00
Andy Heninger
ac9c717990 ICU-11556 Line Break rules update for L2/16-043R, don't break CA$; also LB rules refactored for reduced memory consumption.
X-SVN-Rev: 38643
2016-04-22 23:07:12 +00:00
Andy Heninger
9d9256f3b7 ICU-12081 Initial implementation Emoji break rules and a new RBBI monkey test.
X-SVN-Rev: 38387
2016-02-26 21:58:26 +00:00
Steven R. Loomis
98f5987b43 ICU-11248 use '@ss=' and not x-uli
remove an old test hack.

X-SVN-Rev: 37940
2015-09-10 07:00:30 +00:00
Steven R. Loomis
f87d28cfd2 ICU-11248 merge to trunk: FilteredBreakIteratorBuilder work
* passes rbbi extended tests
* uses <locale en@x-uli=true> in rbbitst.txt,
so added a "known issue" for this when en@ss=standard will suffice.

X-SVN-Rev: 37721
2015-08-05 00:03:18 +00:00
Peter Edberg
2ae320dbdf ICU-11673 Add new Japanese name for Georgia to cjdict
X-SVN-Rev: 37608
2015-06-23 02:09:49 +00:00
Peter Edberg
d88c68d067 ICU-11688 Add Thai words for "update" and "event" to dictionary
X-SVN-Rev: 37606
2015-06-23 00:44:09 +00:00
Peter Edberg
00038112bb ICU-11019 C: Add Thai words for "browser" and "post" to dictionary
X-SVN-Rev: 37126
2015-03-04 07:11:04 +00:00
Peter Edberg
43f62124cd ICU-9379 C: Update BreakIterator createInstance to handle linebreak variant files; update tests
X-SVN-Rev: 37059
2015-02-24 22:37:10 +00:00
Yoshito Umaoka
cbe7c4983b ICU-11466 Added a word break test case for Hangul, starting with Latin text. Such case did not work well with ICU4J 52, but works fine with other ICU versions.
X-SVN-Rev: 36915
2015-01-06 18:57:38 +00:00
Peter Edberg
d87c86274c ICU-10326 Add dictionary-based word/line break for Burmese/Myanmar
X-SVN-Rev: 36397
2014-09-08 22:16:21 +00:00
Peter Edberg
602bb30ae4 ICU-10872 Fix en_US_POSIX word break for colon (C)
X-SVN-Rev: 36381
2014-09-07 07:05:59 +00:00
Andy Heninger
f71b9053d2 ICU-8550 Dictionary Break Iterator, fixes to work with UTF-8 text.
X-SVN-Rev: 35724
2014-05-17 00:44:39 +00:00
Andy Heninger
ce39777eda ICU-4833 Update RBBI title case rules, replace obsolete rule syntax.
X-SVN-Rev: 35333
2014-03-04 19:58:04 +00:00
Peter Edberg
5877fc6542 ICU-10630 Add Thai words for "application" and "tag" to thaidict.txt
X-SVN-Rev: 35213
2014-02-24 19:34:40 +00:00
Peter Edberg
7ebe390ea6 ICU-10691 Add 23 city names (mostly from exemplar city data) to thaidict.txt
X-SVN-Rev: 35211
2014-02-24 16:39:00 +00:00
Peter Edberg
2a0fd5c83d ICU-10593 Add "บลูทูธ" ("Bluetooth") to Thai break dictionary
X-SVN-Rev: 34752
2013-12-12 03:46:15 +00:00
Peter Edberg
5e491553d2 ICU-10571 Add 11 region names in Japanese to cjdict
X-SVN-Rev: 34748
2013-12-12 02:55:47 +00:00
Peter Edberg
bf4126616b ICU-7647 Add/use LaoBreakEngine and laodict.txt; more useful messages in gendict
X-SVN-Rev: 34229
2013-09-06 23:43:13 +00:00
Peter Edberg
b6dcdfcd25 ICU-10176 No line break in $SY $HL; update tests accordingly
X-SVN-Rev: 34142
2013-08-30 05:51:27 +00:00
Peter Edberg
8f0c5ac557 ICU-10296 Add 2 words to Thai dictionary (มั้ย, มั๊ยล่ะ), add tests (C)
X-SVN-Rev: 34134
2013-08-29 23:30:48 +00:00
Peter Edberg
f8c52a1490 ICU-10300 Add "オーストラリア" to cjdict.txt, add related test (C)
X-SVN-Rev: 34130
2013-08-29 20:52:45 +00:00
Peter Edberg
2f02059dda ICU-10299 Fix CjkBreakEngine fSet to include 30FC,FF70; fix broken test data (ICU4C)
X-SVN-Rev: 34118
2013-08-29 05:13:36 +00:00
Markus Scherer
ff5564232d ICU-10168 Unicode 6.3 data files as of 2013-aug-27
X-SVN-Rev: 34098
2013-08-28 01:36:57 +00:00
Peter Edberg
bd7fe8790f ICU-10120 Add test for katakana + extend
X-SVN-Rev: 33685
2013-05-20 00:39:58 +00:00
Peter Edberg
20ae926194 ICU-10120 Copy fix #9983 (for word ubrk_previous infinite loop) to word_POSIX too
X-SVN-Rev: 33682
2013-05-19 06:36:13 +00:00
Markus Scherer
2982958b06 ICU-10128 update ICU to Unicode 6.3 beta (merge from branches/markus/uni63 at r33661)
X-SVN-Rev: 33662
2013-05-15 21:51:04 +00:00
Andy Heninger
0fb049e787 ICU-9983 Fix inconsistent forward and reverse RBBI word break rules when a dictionary character is followed by an extend char.
X-SVN-Rev: 33351
2013-03-01 00:26:35 +00:00
Andy Heninger
e77d52cb38 ICU-9077 Incorporate review comments.
X-SVN-Rev: 33300
2013-02-22 02:26:37 +00:00
Andy Heninger
915f2365a3 ICU-9077 Fix failing test rbbi/RBBITest/TestExtended
X-SVN-Rev: 33234
2013-02-15 08:14:10 +00:00
Andy Heninger
c68b5d9d38 ICU-9077 Enhancements to break iteration tests.
X-SVN-Rev: 33233
2013-02-15 07:17:59 +00:00
Andy Heninger
c437359b7f ICU-9077 RBBI test enhancements.
X-SVN-Rev: 33232
2013-02-15 02:13:58 +00:00
Andy Heninger
863fa94df1 ICU-9505 restore ja locale specific line breaking, which provides css normal behavior.
X-SVN-Rev: 32546
2012-10-08 04:11:15 +00:00
Maxime Serrano
c64c0299d7 ICU-9353 merge dbbi-tries work into the trunk
X-SVN-Rev: 32184
2012-08-16 23:01:49 +00:00