Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression:
cdict->dictContentSize * 6U
with type unsigned int (32 bits, unsigned) is evaluated using 32-bit
arithmetic, and then used in a context that expects an expression of
type U64 (64 bits, unsigned).
Exposed when loading a dictionary < LDM minMatch bytes in MT mode.
Test Plan:
```
CC=clang make -j zstreamtest MOREFLAGS="-O0 -fsanitize=address"
./zstreamtest -vv -i100000000 -t1 --newapi -s7065 -t3925297
```
TODO: Add an explicit test that loads a small dictionary in MT mode
`ZSTD_estimateCCtxSize()` provides estimates for one-shot compression, which
is guaranteed not to buffer inputs or outputs. So it ignores the sizes of the
buffers, assuming they'll be zero. However, the actual workspace allocation
logic always allocates those buffers, and when running under ASAN, the
workspace surrounds every allocation with 256 bytes of redzone. So the 0-sized
buffers end up consuming 512 bytes of space, which is accounted for in the
actual allocation path through the use of `ZSTD_cwksp_alloc_size()` but isn't
in the estimation path, since it ignores the buffers entirely.
This commit fixes this.
Resubmission of #2001. This switches the `sed` invocations to use `-E`,
extended regex syntax, which is better standardized across platforms.
I guess.
Same test plan:
```
make -C lib clean libzstd.pc
cat lib/libzstd.pc
echo # should fail
make -C lib clean libzstd.pc LIBDIR=/foo
make -C lib clean libzstd.pc INCLUDEDIR=/foo
make -C lib clean libzstd.pc LIBDIR=/usr/localfoo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/localfoo
make -C lib clean libzstd.pc LIBDIR=/usr/local/lib prefix=/foo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/include prefix=/foo
echo # should succeed
make -C lib clean libzstd.pc LIBDIR=/usr/local/foo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/foo
make -C lib clean libzstd.pc LIBDIR=/usr/local/
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/
make -C lib clean libzstd.pc LIBDIR=/usr/local
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local
make -C lib clean libzstd.pc LIBDIR=/tmp/foo prefix=/tmp
make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp
make -C lib clean libzstd.pc LIBDIR=/tmp/foo prefix=/tmp/foo
make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp/foo
echo # should also succeed
make -C lib clean libzstd.pc prefix=/foo LIBDIR=/foo/bar INCLUDEDIR=/foo/
cat lib/libzstd.pc
mkdir out
cd out
cmake ../build/cmake
make
cat lib/libzstd.pc
```
When the output buffer is `NULL` with size 0, but the frame content size
is non-zero, we will write to the NULL pointer because our bounds check
underflowed.
This was exposed by a recent PR that allowed an empty frame into the
single-pass shortcut in streaming mode.
* Fix the bug.
* Fix another NULL dereference in zstd-v1.
* Overflow checks in 32-bit mode.
* Add a dedicated test.
* Expose the bug in the dedicated simple_decompress fuzzer.
* Switch all mallocs in fuzzers to return NULL for size=0.
* Fix a new timeout in a fuzzer.
Neither clang nor gcc show a decompression speed regression on x86-64.
On x86-32 clang is slightly positive and gcc loses 2.5% of speed.
Credit to OSS-Fuzz.
This diff reorganizes the `lib/Makefile` to extract various settings that a
user would normally invoke together (supposing that they were aware of them)
if they were trying to build the smallest `libzstd` possible. It collects
these settings under a master setting `ZSTD_LIB_MIN_SIZE`.
Also document this new option.
`-Wall` implies `-Wformat-zero-length`, which will cause compilation to fail
under `-Werror` when an empty string is passed as the format string to a
`printf`-family function. This commit moves us back to prefixing the provided
format string, which successfully avoids that warning.
However, this removes the failure mode where that `RAWLOG` invocation would
fail to compile when no format string was provided at all (which was desirable
to avoid having code that would successfully compile normally but fail under
`-pedantic`, which *does* require that a non-zero number of args are provided).
So this commit also introduces a function which does nothing at all, but will
fail to compile if not provided with at least one argument, which is a string.
This successfully links the compilability of pedantic and non-pedantic builds.
Fixes:
Enable RLE blocks for superblock mode
Fix the limitation that the literals block must shrink. Instead, when we're within 200 bytes of the next header byte size, we will just use the next one up. That way we should (almost?) always have space for the table.
Remove the limitation that the first sub-block MUST have compressed literals and be compressed. Now one sub-block MUST be compressed (otherwise we fall back to raw block which is okay, since that is streamable). If no block has compressed literals that is okay, we will fix up the next Huffman table.
Handle the case where the last sub-block is uncompressed (maybe it is very small). Before it would skip superblock in this case, now we allow the last sub-block to be uncompressed. To do this we need to regenerate the correct repcodes.
Respect disableLiteralsCompression in superblock mode
Fix superblock mode to handle a block consisting of only compressed literals
Fix a off by 1 error in superblock mode that disabled it whenever there were last literals
Fix superblock mode with long literals/matches (> 0xFFFF)
Allow superblock mode to repeat Huffman tables
Respect ZSTD_minGain().
Tests:
Simple check for the condition in #2096.
When the simple_round_trip fuzzer enables superblock mode, it checks that the compressed size isn't expanded too much.
Remaining limitations:
O(targetCBlockSize^2) because we recompute statistics every sequence
Unable to split literals of length > targetCBlockSize into multiple sequences
Refuses to generate sub-blocks that don't shrink the compressed data, so we could end up with large sub-blocks. We should emit those sections as uncompressed blocks instead.
...
Fixes#2096