If a file with an input line larger than INT32_MAX (i.e. 2 GB) contains
an UTF8 character after that limit, escapesrc crashes on 64 bit systems
or does not remove incomplete files on 32 bit systems.
The issue is that an unchecked cast from size_t to int32_t can turn
negative, which results in negative offsets during array access.
This will eventually lead to an out of boundary read, which most likely
crashes the tool.
This patch sets a fixed limit on 1 GB to make sure that no side effects
occur if the line is exactly INT32_MAX or a few bytes less. It should
still be way more than anyone would really need.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If gencnval encounters an empty input file the function resolveAlias
triggers an out of boundary write due to uniqueAliasArr pointing to
a 0 byte reserved memory address.
This patch protects the code in question with a check for
knownAliasesCount being not 0.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
- Adds first "span" field category
- Re-implements DateIntervalFormat#fallbackFormat to use FieldPositionHandler
- New temporary wiring in SimpleFormatter
- Adds additional logic to NumberStringBuilder.
- Extends logic of number::impl::Field type.
- Adds tests for RBNF support.
- Adds tests from ftang's original PR.
- widen API from LocalePriorityList to Iterable
- merge getBestMatch(multiple locales) and getBestMatch(single locale) into one function
- process desired locales incrementally, create fewer objects
- reject poor matches early: use bestDistance-demotion for threshold
- add API for java.util.Locale, convert incrementally
- new feature: tracks indexes of supported and desired locales which eliminates conversion of result objects in wrappers around getBestMatch() as shown by the java.util.Locale API here
- simpler data structures, more serialization-friendly (easier to port to C++)
- e.g., use a BytesTrie each for likelySubtags & locale distance, instead of layers of TreeMap
- un-hardcode locale matcher data; use modern resource bundle functions
- split builder code & runtime code into separate classes
- move LSR to simple top-level value class, cache regionIndex in LSR
- simpler handling of private use languages and pseudolocales
- simplify RegionMapper
- LocaleDistance builder: move the node distance into the DistanceTable, remove DistanceNode
- support distance rules with region codes, not just with variables
- enforce & use distance rule constraints:
- no rule with *,supported or desired,*
- no rule with language * and script/region non-*
- distance trie collapse a (desired, supported)=(ANY, ANY) pair into a single *
- look up each desired language only once for all supported LSRs
- remove layers-of-Maps compaction (trie builder compacts)
- remove unused XML printing
- remove other unused code
- make XLocaleMatcherTest.testPerf() exercise locale distance lookup code
- CreateFileMappingW is marked for both desktop and UWP apps, so we can call that in both code paths.
- We can use the W version of the CreateFileMapping API instead of A version since we pass a NULL for the name anyways.
- We can call the same API CreateFile[A|W] from both the UWP and Win32 versions of the code, reducing one of the UWP forks.
- Add a work-around for older versions of the Windows 10 SDK UWP headers.
- Remove the code that was creating a custom security descriptor (but setting everything to NULL) and pass null to the API directly. This way we will get the default security descriptor instead of the NULL dacl.
- Change to use nullptr instead of NULL in C++ code.
- Use move assignment for fields->formatter (LocalizedNumberFormatter) instead of creating new heap object every time.
- Add test cases for DecimalFormat object in invalid state.
- Protect against self-assignment in assignment operator.
- Fix segmentation fault when attempting to compare valid and invalid DecimalFormat objects.
- Changes based on review feedback from Shane.
- Fix minor typos in the public header file.
The problem is that Docker receives zip files only as LFS links when
cloning ICU from GitHub. Converting the txt files into zip files, which
is the required corpus format for the fuzzer, will be done by the oss-fuzz
build script.
ICU-20217 Adds fuzzer seed corpus files to the list of files that don't have
copyright notice.
the buildtools directory it needs when running the ICU4C unit tests in an
out-of-source installation.
The change is a quick workaround for now for an issue that can have wide impact.