Commit Graph

158 Commits

Author SHA1 Message Date
Markus Scherer
524748c6bf ICU-20984 StringPiece & ByteSink overloads for char8_t* 2020-03-16 10:49:21 -07:00
Shane Carr
bb1f00efb8 ICU-20919 Merge branch 'maint/maint-66' into maint-66-merge
Conflicts:
	icu4j/main/shared/data/icudata.jar
2020-02-21 18:21:05 -08:00
Laurent Stacul
3b58179396 ICU-20972 Fix invalid conversion from const char8_t* to const char* (C++20) 2020-02-20 13:09:18 -08:00
Andy Heninger
54a60fe6f4 ICU-11548 Improve regex static UnicodeSets handling
Compiled regular expression patterns make use of several shared common
UnicodeSets. This change simplifies the creation and use of these
static UnicodeSets.

- Pointer fields to the static sets are removed from the compiled patterns,
  and the static variables are accessed directly. The deleted pointers
  were a hold-over from earlier code that did not use shared statics.

- The UnicodeSet pattern literals are changed from hex constants to
  u"string literals".

- The size of fRuleSets (from regexst.h) is changed from a hard-coded 10
  to the number of UnicodeSets actually required. Doing this required
  a change to regexcst.pl to export the required size. Changing and
  rerunning this perl code resulted in massive but benign changes to
  the generated file regexcst.h, the result of perl having changed its
  order of enumeration of hashes since the file was last regenerated.

- UnicodeSets are frozen when possible. Should result in faster matching.
2020-01-30 15:13:07 -08:00
Andy Heninger
03937347fb ICU-20863 Regex, lazy creation and reduced size of map from capture group names to numbers. 2019-10-22 17:23:26 -07:00
Andy Heninger
327087150f ICU-20618 Regex nested lookaround expressions, clean up active match region handling. 2019-08-19 13:31:34 -07:00
Fredrik Roubert
b4b2378931 ICU-20601 Wrap ICU test compound macros in do { } while.
This does the same for the ICU test code as was done for the
public ICU API in commit 480bec3ea6.
2019-08-15 22:01:42 +02:00
Fredrik Roubert
5d6d29b76a ICU-20601 Remove superfluous semicolons (-Wextra-semi-stmt).
These are the same changes for the C++ code as was done for the C code
by commit 17606e0345.
2019-08-15 12:30:21 +02:00
Jeff Genovy
e72290c45e ICU-13764 Add separate CI build that treats warnings as errors with clang.
This adds a separate CI build that enables -Werror for clang.

This also fixes all of the -Wall -Wextra warnings in the tests, and all the
-Wextra-semi warnings as well.
2019-07-30 22:10:02 -07:00
Andy Heninger
e559b30309 ICU-20359 Fix stack overflow in Regex Pattern Compile. 2019-03-07 10:31:30 -08:00
Jeff Genovy
33d7868d45 ICU-20351 Warning cleanup changes for ICU4C under MSVC. 2019-01-16 16:43:02 -08:00
Andy Heninger
0d32dd8f05 ICU-13632 regex out-of-bounds memory reference fix.
X-SVN-Rev: 41088
2018-03-09 18:39:14 +00:00
Andy Heninger
193aa17f08 ICU-13631 Regex Address Sanitizer fix.
X-SVN-Rev: 41086
2018-03-08 18:32:15 +00:00
Markus Scherer
03f431d30d ICU-13340 obsolete unicode/utf_old.h: add U_HIDE_OBSOLETE_UTF_OLD_H to optionally hide all of the .h contents; default: no behavior change; adjust code & tests to work either way
X-SVN-Rev: 40413
2017-09-14 06:24:35 +00:00
Andy Heninger
5f57938910 ICU-12884 regex timeout not working with {loop counts} in patterns.
X-SVN-Rev: 39693
2017-02-21 23:12:48 +00:00
Andy Heninger
242e02c388 ICU-12764 icu4c utf-8 source files, update Copyright notices.
X-SVN-Rev: 39583
2017-01-20 00:20:31 +00:00
Michael Ow
61607c2773 ICU-12564 Update copyright notice in trunk
X-SVN-Rev: 38848
2016-06-15 18:58:17 +00:00
Yoshito Umaoka
00ca13e126 ICU-12564 Reverted r38761 and r38762, because we want to prepend the Unicode copyright for existing source files, instead of replacing copyright comments.
X-SVN-Rev: 38776
2016-05-31 21:45:07 +00:00
Michael Ow
c9f199a30f ICU-12564 Update copyright notice in ICU4C
X-SVN-Rev: 38761
2016-05-26 22:32:17 +00:00
Fredrik Roubert
7f4b8d106b ICU-12012 Replace all sizeof p / sizeof *p with UPRV_LENGTHOF().
R=markus.icu@gmail.com

Review URL: https://codereview.appspot.com/285520043 .

X-SVN-Rev: 38337
2016-02-23 10:40:09 +00:00
Andy Heninger
8dba7301b7 ICU-11554 Fix regex bug with look-behind matching & UTF-8 input.
X-SVN-Rev: 38056
2015-10-09 20:01:46 +00:00
Andy Heninger
57ac300668 ICU-11480 added tests for regex with capture groups that do not participate in match.
X-SVN-Rev: 37816
2015-08-25 20:47:38 +00:00
Andy Heninger
d96fea9eb6 ICU-11123 promote RegexMatcher::find(UErrorCode &) to public API
X-SVN-Rev: 37073
2015-02-26 02:34:20 +00:00
Andy Heninger
ec3f77f878 ICU-5312 Regular Expressions Named Capture.
X-SVN-Rev: 37040
2015-02-18 23:56:19 +00:00
Andy Heninger
22c8c94d14 ICU-11469 Regular Expressions, remove old tech preview functions.
X-SVN-Rev: 36953
2015-01-14 00:03:29 +00:00
Andy Heninger
63758dca88 ICU-11371 Improved checking of regular expression pattern size limits.
X-SVN-Rev: 36801
2014-12-02 21:58:18 +00:00
Andy Heninger
f9c67eb71e ICU-11374 Regular Expression, improve checking of integer overflow.
X-SVN-Rev: 36790
2014-12-02 01:32:49 +00:00
Steven R. Loomis
7594250cc5 ICU-7653 changed uprv_lengthof to UPRV_LENGTHOF, also added apidoc
X-SVN-Rev: 36275
2014-08-28 22:13:45 +00:00
Tom Zhang
ee1f29b584 ICU-7653 move LENGTHOF(array) to common, internal header
X-SVN-Rev: 36265
2014-08-28 14:55:34 +00:00
Andy Heninger
f2dfa7422e ICU-10815 Fix for uregex_findNext() not setting U_REGEX_STOPPED_BY_CALLER
X-SVN-Rev: 36260
2014-08-28 01:19:29 +00:00
Andy Heninger
1ba1ec3b83 ICU-11049 regular expressions, use same logic in UText and (UChar *) code paths when checking limit of potential match start positions.
X-SVN-Rev: 36161
2014-08-14 17:44:05 +00:00
Andy Heninger
e03585d7cf ICU-11049 fix regex find() memory overrun.
X-SVN-Rev: 36124
2014-08-06 21:49:08 +00:00
Andy Heninger
a45f7faf63 ICU-10835 Improve performance of case insensitive find operations.
X-SVN-Rev: 35683
2014-05-02 22:51:53 +00:00
Michael Ow
3daf54af40 ICU-10740 Fix uconfig test errors
X-SVN-Rev: 35480
2014-03-15 06:08:42 +00:00
Andy Heninger
10dd7ed47b ICU-10463 Regular Expressions, rework debug conditionals to fix build failures on clang, and to somewhat simplify.
X-SVN-Rev: 34565
2013-10-14 22:11:21 +00:00
Andy Heninger
045919648e ICU-10459 Fix segfault in uregex_group() when match is in invalid state.
X-SVN-Rev: 34559
2013-10-11 20:59:39 +00:00
Michael Ow
4835d5705a ICU-10398 Ensure cintltst and intltest passes without data
X-SVN-Rev: 34398
2013-09-19 02:32:57 +00:00
Andy Heninger
20016a58db ICU-9719 Regular Expressions, add loop breaking to unbounded {min, max} loops.
X-SVN-Rev: 33848
2013-06-26 00:27:11 +00:00
George Rhoten
865333dd77 ICU-9457 Fix some compiler warnings.
X-SVN-Rev: 32111
2012-08-05 16:33:16 +00:00
Andy Heninger
b916fcb50b ICU-9283 fix for look-behind assertions w/ case insensitive matching.
X-SVN-Rev: 31782
2012-04-27 21:29:34 +00:00
Yoshito Umaoka
b36054b80b ICU-9139 Fix non-ASCII characters in regextst.cpp
X-SVN-Rev: 31540
2012-02-29 16:51:39 +00:00
Andy Heninger
c74df646b7 ICU-6947 implement UREGEX_LITERAL flag.
X-SVN-Rev: 31398
2012-02-15 01:30:55 +00:00
Andy Heninger
da5380d926 ICU-8123 Remove utext comparison functions.
X-SVN-Rev: 31277
2012-01-31 01:14:41 +00:00
Andy Heninger
b8315ecf6a ICU-8826 Regex case insensitive match fixes; also fixes #6074, hitEnd() sometimes fails.
X-SVN-Rev: 31233
2012-01-20 00:50:02 +00:00
Steven R. Loomis
20934ac953 ICU-8861 merge r30809 cleanup from #8755
X-SVN-Rev: 30814
2011-10-12 20:08:09 +00:00
Steven R. Loomis
737d129645 ICU-8784 commit iSeries porting fixes into trunk
X-SVN-Rev: 30667
2011-09-15 19:34:17 +00:00
Michael Ow
310c23c24e ICU-8578 Apply patch to fix some compiler warnings and related issues
X-SVN-Rev: 30205
2011-06-10 18:56:08 +00:00
Steven R. Loomis
61bdf314c6 ICU-8350 terminate the buffer.
X-SVN-Rev: 29989
2011-05-03 19:48:42 +00:00
Steven R. Loomis
64a323a8e2 ICU-8350 fix another asciism
X-SVN-Rev: 29987
2011-05-03 18:16:18 +00:00
Andy Heninger
2861a47a86 ICU-7029 Add test case for this ticket.
X-SVN-Rev: 29846
2011-04-21 23:19:40 +00:00
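
Note on ICU-20601 (commits b4b2378931 and 5d6d29b76a above): the messages mention wrapping compound test macros in do { } while and removing superfluous semicolons. The following is a minimal sketch of that general idiom, not code taken from the ICU sources. The names CHECK_WRAPPED, gFailures, and runChecks are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical failure counter for this sketch; not an ICU symbol.
static int gFailures = 0;

// A compound macro written as a bare block, e.g.
//     #define CHECK_BARE(cond) if (!(cond)) { ++gFailures; }
// misbehaves when the caller writes "CHECK_BARE(x);" between an if and
// an else: the trailing semicolon becomes an empty statement, which
// leaves the else without a matching if. That same stray semicolon is
// the kind of thing -Wextra-semi-stmt reports.
//
// Wrapping the body in do { } while (false) makes the macro expand to
// exactly one statement that requires the caller's semicolon.
#define CHECK_WRAPPED(cond) \
    do { \
        if (!(cond)) { ++gFailures; } \
    } while (false)

// Uses the wrapped macro in an unbraced if/else, the case that the
// bare form cannot handle. Returns the number of failed checks.
int runChecks(bool enabled) {
    gFailures = 0;
    if (enabled)
        CHECK_WRAPPED(1 + 1 == 3);   // condition is false: counts one failure
    else
        CHECK_WRAPPED(true);         // condition is true: counts nothing
    return gFailures;
}
```

Under these assumptions, runChecks(true) returns 1 and runChecks(false) returns 0. The wrapped macro can be used anywhere a single statement is allowed.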