targets to test/fuzzer/ directory. This will enable compilation and
smoke test of fuzzer targets as part of the ICU continuous build.
ICU-20652 Fixed exit-on-error behaviour of fuzzer targets execution.
Minor clean-ups and improvements
ICU-20652 Modifies fuzzer/Makefile.in to fix Windows build issue.
ICU-20627 Adds explicit enablement of fuzzer targets build to ICU4C
configuration and Makefile.in. File 'configure' was created from
'configure.ac' by executing 'autoreconf'.
autoreconf added some new entries into 'configure' about runstatedir. Not sure
why it did this; they are not related to the fuzzer.
- Use STATIC_NEW for mutex creation, to avoid order-of-destruction problems
by avoiding destruction altogether, while avoiding memory leak reports.
- Remove UConditionVar, replace with direct use of std::condition_variable
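The "avoid destruction altogether" idea can be sketched as follows. This is a minimal illustration, not ICU's STATIC_NEW macro: neverDestroyed, RegistryMutexTag, registryMutex and guardedIncrement are hypothetical names. The object is constructed with placement new inside static storage, so its destructor never runs and no heap allocation exists for a leak checker to report.

```cpp
#include <cassert>
#include <mutex>
#include <new>

// Hypothetical helper, not ICU's STATIC_NEW macro. It constructs one T per
// (T, Tag) pair inside static storage and never runs the destructor, so no
// other static object can observe a destroyed T during shutdown. Nothing is
// taken from the heap, so there is no allocation left unfreed at exit.
template <typename T, typename Tag>
T& neverDestroyed() {
    alignas(T) static unsigned char storage[sizeof(T)];
    // Initialization of a function-local static is thread-safe since C++11.
    static T* instance = new (storage) T();
    return *instance;
}

struct RegistryMutexTag {};  // illustrative tag type

std::mutex& registryMutex() {
    return neverDestroyed<std::mutex, RegistryMutexTag>();
}

// Example critical section guarded by the never-destroyed mutex.
int guardedIncrement(int& counter) {
    std::lock_guard<std::mutex> lock(registryMutex());
    return ++counter;
}
```

The Tag parameter only exists so that two call sites asking for the same type get distinct objects.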
This fixes a regression introduced by commit
b12a927c93 for issue ICU-13778.
The above commit improved the error checking in the
DateTimePatternGenerator class, adding checks for errors/failures
where there previously were none at all. This was done in order to
catch catastrophic errors like out-of-memory (OOM), and properly
report them to the caller, rather than ignoring/hiding these errors.
However, in doing so it exposed a case where the code was depending
on ignoring errors in order to fall back to the Gregorian calendar
when the default ICU locale is set to root.
This restores the previous behavior, by allowing the error of
U_MISSING_RESOURCE_ERROR to fall through and continue without
reporting back an error to the caller.
Note: This regression was technically introduced in ICU 63, and
also affects ICU 64.
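The pattern described above (report serious failures, but let a missing-resource error fall through to a default) can be sketched in isolation. This is not the DateTimePatternGenerator code: Status, loadCalendarPatterns and resolveCalendar are hypothetical stand-ins, and the loader simply pretends that the root locale has no calendar data.

```cpp
#include <cassert>
#include <string>

// Stand-ins for ICU's UErrorCode values; illustrative only.
enum class Status { Ok, MissingResource, OutOfMemory };

// Hypothetical loader: pretends that the root locale has no calendar data
// and that the locale "oom" triggers an allocation failure.
// On failure it leaves calendarOut untouched.
Status loadCalendarPatterns(const std::string& locale, std::string& calendarOut) {
    if (locale == "oom") return Status::OutOfMemory;
    if (locale.empty() || locale == "root") return Status::MissingResource;
    calendarOut = "calendar-for-" + locale;
    return Status::Ok;
}

// Starts from the Gregorian default. A missing resource is tolerated and
// the default is kept; every other failure is reported to the caller.
Status resolveCalendar(const std::string& locale, std::string& calendarOut) {
    calendarOut = "gregorian";
    const Status status = loadCalendarPatterns(locale, calendarOut);
    if (status == Status::MissingResource) {
        return Status::Ok;  // fall through with the default
    }
    return status;
}
```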
- StringSegment, ICU4C:
* Moved to top icu namespace
* Compilation unit renamed to string_segment.
- NumberStringBuilder, C and J:
* Moved to main icu namespace
* Compilation unit renamed to formatted_string_builder
* Renamed class to FormattedStringBuilder
- Moves nextPosition logic of NumberStringBuilder to helper class
- added PKGDATA_TRAILING_SPACE to all of the pkgdataMakefile.in files.
- NOTE: Users who create their own pkgdata.inc / icupkg.inc files may need
to recreate this PKGDATA_TRAILING_SPACE behavior.
- used the above variable, normally undefined, in mh-* files that need a trailing space
- Also, fixed use of system() in pkgdata.cpp per ICU-20538
This was causing pkgdata to return a zero status even on clang
failure, masking this issue.
(cherry picked from commit 83a0542b5b)
Builds res_index.txt based on directory glob minus aliases read from deprecates XML file.
In ICU 64, please use the ICU Data Build Tool instead of reslocal.mk for locale filtering.
Remove the dependencies from the ICU library code on static constructors
that were introduced by using std::mutex and condition variables. The
mutexes are lazily initialized by embedding them as local static variables
in getter functions, and relying on the C++ compiler/runtime to do thread
safe initialization of them.
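A minimal sketch of the getter pattern described above, assuming nothing about the actual ICU code: cacheMutex and nextId are hypothetical names. The mutex is a function-local static, so it is created on first use and no static constructor runs at library load time.

```cpp
#include <cassert>
#include <mutex>

// Illustrative getter: the mutex is created the first time the function is
// called. Since C++11 the compiler and runtime guarantee that this
// initialization happens exactly once, even with concurrent callers.
std::mutex& cacheMutex() {
    static std::mutex m;
    return m;
}

// Hypothetical user of the getter: hands out increasing ids under the lock.
int nextId() {
    static int next = 0;
    std::lock_guard<std::mutex> lock(cacheMutex());
    return ++next;
}
```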
- see also ICU-20062
- add a `-B` option to the two python invocations on Windows
- set PYTHONDONTWRITEBYTECODE in configure.ac and icudefs.mk.in
Co-authored-by: Fredrik Roubert <roubert@google.com>
- Fixes filterrb.cpp to check for wildcard when at a leaf.
- Adds additional verbose logging to genrb.
- Fixes filtration to add deps to dep_targets instead of dep_files.
- Separates dep_files to common_dep_files and specific_dep_files.
from assumed UTF-8 resulted in an extremely large percentage of Unicode
replacement characters in the data passed to the API under test.
ICU-20217 Uses fuzzer generated bytes to make random selection of locales, converters,
etc., replacing the random number generator. This way the fuzzer can control
the selections.
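One way to let the fuzzer drive such a selection is sketched below. It is an illustration under assumptions, not the code from the fuzzer targets: pickFromFuzzerData is a hypothetical helper that consumes one leading byte of the input and uses it as an index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: consumes one byte from the front of the fuzzer input
// and uses it to choose an entry, so the fuzzer (not a random number
// generator) decides which entry is exercised. With empty input the first
// entry is chosen. `choices` must not be empty.
const std::string& pickFromFuzzerData(const uint8_t*& data, size_t& size,
                                      const std::vector<std::string>& choices) {
    assert(!choices.empty());
    uint8_t selector = 0;
    if (size > 0) {
        selector = data[0];
        ++data;   // the remaining bytes go to the API under test
        --size;
    }
    return choices[selector % choices.size()];
}
```

Because the selector comes from the input itself, a crashing input reproduces the same selection every time it is replayed.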
ICU-20217 Minor follow-ups from code review.
Removes fuzzer target break_iterator_utf32_fuzzer, which does not do
anything useful that the regular break iterator fuzzer target does not already do.
ICU-20217 Fixes for-loop body.
ICU-20217 Uses an allocated buffer to pass head-truncated fuzzer data to the
API under test. The fuzzer may otherwise not detect buffer underflow.
ICU-20217 Typing fix.
ICU-20217 Fixing typing.
ICU-20217 Improve fuzzer targets, move truncated fuzzer data into a
new buffer so that buffer underflow does not go undetected.
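The buffer handling can be sketched as below; copyTail is a hypothetical helper, not the function used in the fuzzer targets. The reasoning assumed here: if the API under test were handed a pointer into the middle of the fuzzer's own buffer, a read before that pointer would still land in valid memory and a sanitizer would likely stay silent. A fresh allocation of exactly the remaining size removes that slack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical helper: copies data[offset, size) into a fresh heap
// allocation of exactly that length and reports the length in tailSize.
// An offset past the end yields an empty (zero-length) buffer.
std::unique_ptr<uint8_t[]> copyTail(const uint8_t* data, size_t size,
                                    size_t offset, size_t& tailSize) {
    if (offset > size) {
        offset = size;
    }
    tailSize = size - offset;
    std::unique_ptr<uint8_t[]> buffer(new uint8_t[tailSize]);
    if (tailSize > 0) {
        std::memcpy(buffer.get(), data + offset, tailSize);
    }
    return buffer;
}
```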
ICU-20217 Fixes buffer management of fuzzer-provided data.
ICU-20217 Factor in PR review comments.
- Changes Java DecimalFormat boolean get* methods to is*.
- Makes the new draft methods non-virtual.
- Removes obsolete template class in header file.
- Adds proper U_HIDE tags in unum.h and decimfmt.h
- Adds first "span" field category
- Re-implements DateIntervalFormat#fallbackFormat to use FieldPositionHandler
- New temporary wiring in SimpleFormatter
- Adds additional logic to NumberStringBuilder.
- Extends logic of number::impl::Field type.
- Adds tests for RBNF support.
- Adds tests from ftang's original PR.
- Use move assignment for fields->formatter (LocalizedNumberFormatter) instead of creating new heap object every time.
- Add test cases for DecimalFormat object in invalid state.
- Protect against self-assignment in assignment operator.
- Fix segmentation fault when attempting to compare valid and invalid DecimalFormat objects.
- Changes based on review feedback from Shane.
- Fix minor typos in the public header file.
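Two of the items above, the self-assignment guard and move assignment into an existing member, can be illustrated with a small stand-in class. Formatter and Format are hypothetical types for this sketch only; they are not DecimalFormat or LocalizedNumberFormatter.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a formatter held by value inside another object.
struct Formatter {
    std::string pattern;
};

class Format {
public:
    explicit Format(std::string pattern) : formatter_{std::move(pattern)} {}

    Format& operator=(const Format& other) {
        if (this == &other) {
            return *this;  // self-assignment: nothing to do
        }
        formatter_ = other.formatter_;
        return *this;
    }

    // Move-assigns a freshly built Formatter into the existing member
    // instead of allocating a new heap object on every call.
    void applyPattern(std::string pattern) {
        formatter_ = Formatter{std::move(pattern)};
    }

    const std::string& pattern() const { return formatter_.pattern; }

private:
    Formatter formatter_;
};
```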
The problem is that Docker receives zip files only as LFS links when
cloning ICU from GitHub. Converting the txt files into zip files, which
is the required corpus format for the fuzzer, will be done by the oss-fuzz
build script.
ICU-20217 Adds fuzzer seed corpus files to the list of files that don't have
copyright notice.
Moved the macro from platform.h to uassert.h.
Removed any "unreachable" code that previously occurred after the UPRV_UNREACHABLE macro is used.
Changes based on review from Andy.
Co-authored-by: Daniel Ju <daju@microsoft.com>
- Wires up FormattedNumber[Range] in applicable languages.
- Adds new header files and tests, with minor cleanup to old tests.
- Adds code to guarantee terminating NUL in FormattedNumber[Range].
- Cleanup of API docs for inherited methods in FormattedNumber[Range].
- Reads, parses, and applies the filter file syntax.
- Removes unused keys from the resource bundle.
- Adds sample filter txt file with test in intltest.
- Reads filters.json or .hjson from ICU_DATA_FILTER_FILE environment variable
- Adds DepTarget for dependency semantics, and warns for missing deps.
- Fixes certain tests that crash with sliced locale data.
- Includes support for 4 filter types.
The API documentation is perfectly clear about this, an empty string for
the value means that the keyword should be removed:
@param keywordValue value of the keyword to be set. If 0-length or
NULL, will result in the keyword being removed. No error is given if
that keyword does not exist.
Remove all POSIX and Win32 specific mutex, atomic and threading implementations
in favor of C++11 std library functions.
Move the related (internal) ICU types and functions into the icu namespace.
Adds some plumbing to allow MutablePatternModifier to set fields, and otherwise builds upon the infrastructure from the previous commit to add the MEASURE_UNIT field.
- Creates new Python package in icu4c/data/buildtool
- Creates BUILDRULES.py in icu4c/data and icu4c/test/testdata, unified between Unix/Windows
- Removes most data build orchestration rules from makedata.mak, testdata.mak, data/Makefile.in, and test/testdata/Makefile.in
- Removes pool.res files and builds them on the fly instead
- fastpath for UnicodeSet.add(new last range)
- fewer UnicodeSet memory allocations:
initial internal list array, exponential array growth,
allocate strings list/set only when first one is added
- faster CodePointTrie.getRange(): fewer calls to filter function
- revert UnicodeSet(intprop=value) from trie ranges to range starts + lookup
- cache per-int-prop range starts: fewer lookups
This eliminates the need for a scratch buffer in Locale::toLanguageTag()
and also the need for counting the bytes required in uloc_toLanguageTag().
ByteSink will now handle that correctly, which eliminates the bug where
the returned number of required bytes was too small.
Using temporary variables for the two values to be compared here makes
GCC compile the code just like we expect it to. (What it otherwise does
on some architectures remains a mystery.)
This will make the tests pass as expected also on IA-32 with GCC.
It'll also make it possible to revert the old workaround for SPARC
introduced by commit 5b0592af79.
Tested:
Linux gcc45 3.16.0-5-686-pae #1 SMP Debian 3.16.51-3+deb8u1 (2018-01-08) i686 GNU/Linux
Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
g++ (Debian 4.9.2-10+deb8u1) 4.9.2
Linux gcc202 4.16.0-1-sparc64-smp #1 SMP Debian 4.16.5-1 (2018-04-29) sparc64 GNU/Linux
clang version 4.0.1-10+sparc64 (tags/RELEASE_401/final)
g++ (Debian 8.2.0-7) 8.2.0
* ICU-20205 Add locale test for RelativeDateTimeFormatter.
* ICU-20205 Fix error in pt relative date data. Improve error handling in code.
* ICU-20205 Add instantiation test & regen data from ICU4C
* ICU-20205 Added DateFormatSymbols error check per jefgen's comments.
uloc_forLanguageTag has a few mapping tables to map grandfathered
language tags and deprecated language subtags to their preferred or
modern values.
Update them based on the latest version of the IANA
language subtag registry. [1]
Five grandfathered tags without a preferred value are still mapped to
what ICU has mapped them to for backward compatibility until the
wisdom of continuing to do so is reviewed.
In addition, map redundant language tags to their preferred values
regardless of whether they're followed by other subtags or not (e.g.
zh-yue vs zh-yue-u-co-pinyin).
Similarly, ja-latn-hepburn-heploc is mapped to ja-latn-alaic97 (the
variant subtag 'hepburn-heploc' with the prefix 'ja-latn' has the
preferred value 'alaic97').
Update the mapping for deprecated language subtags (e.g. 'jw' to
'jv' and a bunch of 3-letter language codes).
Add a new table for deprecated region subtags to map them to their
modern values. (e.g. 'DD' to 'DE').
Add a new test case for deprecated language and region mapping and
a few more cases for updated grandfathered and redundant tag mapping.
[1]
https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
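The table-driven mapping described above can be sketched with the pairs named in the text. This is an illustration only: SubtagMapping, kDeprecatedLanguages, kDeprecatedRegions, mapSubtag and mapTagPrefix are hypothetical names, the real tables are far larger, and the input is assumed to be already case-normalized.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct SubtagMapping {
    const char* deprecated;
    const char* preferred;
};

// Only the pairs mentioned in the text; illustrative, not the full tables.
static const SubtagMapping kDeprecatedLanguages[] = {{"jw", "jv"}};
static const SubtagMapping kDeprecatedRegions[] = {{"DD", "DE"}};

// Returns the preferred value for a deprecated subtag, or the input
// unchanged when the table has no entry for it.
template <size_t N>
std::string mapSubtag(const SubtagMapping (&table)[N], const std::string& subtag) {
    for (const SubtagMapping& entry : table) {
        if (subtag == entry.deprecated) {
            return entry.preferred;
        }
    }
    return subtag;
}

// Replaces `from` with `to` when the tag equals `from` or continues with
// another subtag after it, so trailing subtags are preserved.
std::string mapTagPrefix(const std::string& tag, const std::string& from,
                         const std::string& to) {
    if (tag.compare(0, from.size(), from) != 0) {
        return tag;  // does not start with `from`
    }
    if (tag.size() > from.size() && tag[from.size()] != '-') {
        return tag;  // `from` ends in the middle of a subtag
    }
    return to + tag.substr(from.size());
}
```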
* ICU-20119 Update and fixes for the following BRS tasks:
- Update urename.h
- Test uconfig.h variations
* ICU-20119 Updates copyright scanner script exclusions: don't scan ./git/*.
* ICU-20119 Changes in reply to comments for pull request #165.
* Fix MSVC C4251: Need to export explicit template instantiation for std::atomic<int32_t> when building DLLs
* Some more warning fixes for MSVC as well.
* Can't use static_cast in C file.