Rather than restrict our temp chain table to 2 ** chainLog entries, this
commit uses all available space to reach further back to gather longer
chains to pack into the DDSS chain table.
Rather than interleave all of the chain table entries, tying each entry's
position to the corresponding position in the input, this commit changes the
layout so that all the entries in a single chain are laid out next to each
other. The last entry in the hash table's bucket for this hash is now a packed
pointer of position + length of this chain.
This cannot be merged as written, since it allocates temporary memory inside
ZSTD_dedicatedDictSearch_lazy_loadDictionary().
Previously, if DDSS was enabled on a CCtx and a dictionary was inserted into
the CCtx, the CCtx MatchState would be filled as a DDSS struct, causing
segfaults etc. This changes the check to use whether the MatchState is marked
as using the DDSS (which is only ever set for CDict MatchStates), rather than
looking at the CCtxParams.
Entries in the hashTable chain cache aren't subject to the same aliasing that
the circular chain table is subject to. As such, we don't need to stop when we
cross the chain limit. We can delve deeper. :)
This caused us to double-search the first position and fail to search the
last position in the chain, slowing down search and making it less effective.
This makes it clear that not only is the feature allowed here, we're actually
using it, as opposed to the CCtxParam field, in which it's enabled, but we may
or may not be using it.
The unused function definitions are hidden behind a
`#ifndef ZSTD_NO_UNUSED_FUNCTIONS` check.
Initially hiding all functions which are unused and take up more than
2KB of stack space, because these will show up as warnings in the
Linux Kernel build system.
Previously, this construct would add `-O3` onto the end of the compiler flags
variable, **after** `MOREFLAGS`, which meant that it was impossible to over-
ride. This commit fixes this order and should otherwise be a no-op.
Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression:
cdict->dictContentSize * 6U
with type unsigned int (32 bits, unsigned) is evaluated using 32-bit
arithmetic, and then used in a context that expects an expression of
type U64 (64 bits, unsigned).
* Fix bug introduced in PR #2271
* Fix long-standing bug that is impossible to trigger inside of zstd
* Add a fuzzer that makes sure the normalized count always round trips
correctly
new version easier to vectorize
leads to smaller code and faster execution
notably at the last recombination stage
(basically, fixed cost per block).
Assembly inspected with godbolt
On my laptop, with `clang` and `-mavx2` :
2K block : 1280 MB/s -> 1550 MB/s
8K block : 1750 MB/s -> 1860 MB/s
Fuzzing build modes (FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION) doesn't
necessarily imply that assert() is enabled, according to the manual.
When the current do-nothing is expanded under -Wunused-variable (-Wall),
it results in unused variables in some of the FUZZING_BUILD_MODE...
blocks.
This patch extends the do-nothing to avoid the unused variable.
The bound check condition should always be met because we selected `set_basic` as
our encoding type. But that code is very far away, so assert it is true so if it is
ever false we can catch it, and add a bounds check.
Fixes#2213.
Allow compression to use dictionaries with missing symbols in their
entropy tables. We set the FSE repeat mode to check when there are
missing symbols, and set the FSE repeat mode to valid when all symbols
are present.
Note that when not all symbols are present, the heuristics which favor
dictionary tables for lower compression levels won't activate.
Tested by manually creating a dictionary with missing symbols of every
type, and validing that the compressor rejects it before this change,
and accepts it after this change. Also, I ran the `dictionary_loader`
fuzzer for >1 hour of CPU time without running into cases where
compression succeeds, but decompression fails.
Fixes#2174.
default rule is `lib-release`
`lib-release` wasn't working : it was just skipped.
Removing `lib-release` from the list of .PHONY targets fixes it.
Same for `lib-mt`.
this API is deprecated, for a loong time now,
all related symbols will be removed in a future version (likely v1.5.0)
and the header file `zbuff.h` doesn't compile from `include/` anyway,
because it needs to be positioned one directory below `zstd.h`.
Also removed `cover.h` from `cmake` installer,
as it should have never been part of this list to begin with.
Exposed when loading a dictionary < LDM minMatch bytes in MT mode.
Test Plan:
```
CC=clang make -j zstreamtest MOREFLAGS="-O0 -fsanitize=address"
./zstreamtest -vv -i100000000 -t1 --newapi -s7065 -t3925297
```
TODO: Add an explicit test that loads a small dictionary in MT mode
This commit pulls out the internals of `ZSTD_estimateCCtxSize_usingCCtxParams`
into a helper. It then migrates two other callsites to use that helper,
a small optimization for `ZSTD_estimateCStreamSize_usingCCtxParams`, which
folds the buffer sizing into the helper, and then `ZSTD_resetCCtx_internal`,
which is more invasive.
This attempts to guarantee that the estimates returned to users are always
correct.
`ZSTD_estimateCCtxSize()` provides estimates for one-shot compression, which
is guaranteed not to buffer inputs or outputs. So it ignores the sizes of the
buffers, assuming they'll be zero. However, the actual workspace allocation
logic always allocates those buffers, and when running under ASAN, the
workspace surrounds every allocation with 256 bytes of redzone. So the 0-sized
buffers end up consuming 512 bytes of space, which is accounted for in the
actual allocation path through the use of `ZSTD_cwksp_alloc_size()` but isn't
in the estimation path, since it ignores the buffers entirely.
This commit fixes this.
Resubmission of #2001. This switches the `sed` invocations to use `-E`,
extended regex syntax, which is better standardized across platforms.
I guess.
Same test plan:
```
make -C lib clean libzstd.pc
cat lib/libzstd.pc
echo # should fail
make -C lib clean libzstd.pc LIBDIR=/foo
make -C lib clean libzstd.pc INCLUDEDIR=/foo
make -C lib clean libzstd.pc LIBDIR=/usr/localfoo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/localfoo
make -C lib clean libzstd.pc LIBDIR=/usr/local/lib prefix=/foo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/include prefix=/foo
echo # should succeed
make -C lib clean libzstd.pc LIBDIR=/usr/local/foo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/foo
make -C lib clean libzstd.pc LIBDIR=/usr/local/
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/
make -C lib clean libzstd.pc LIBDIR=/usr/local
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local
make -C lib clean libzstd.pc LIBDIR=/tmp/foo prefix=/tmp
make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp
make -C lib clean libzstd.pc LIBDIR=/tmp/foo prefix=/tmp/foo
make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp/foo
echo # should also succeed
make -C lib clean libzstd.pc prefix=/foo LIBDIR=/foo/bar INCLUDEDIR=/foo/
cat lib/libzstd.pc
mkdir out
cd out
cmake ../build/cmake
make
cat lib/libzstd.pc
```
When the output buffer is `NULL` with size 0, but the frame content size
is non-zero, we will write to the NULL pointer because our bounds check
underflowed.
This was exposed by a recent PR that allowed an empty frame into the
single-pass shortcut in streaming mode.
* Fix the bug.
* Fix another NULL dereference in zstd-v1.
* Overflow checks in 32-bit mode.
* Add a dedicated test.
* Expose the bug in the dedicated simple_decompress fuzzer.
* Switch all mallocs in fuzzers to return NULL for size=0.
* Fix a new timeout in a fuzzer.
Neither clang nor gcc show a decompression speed regression on x86-64.
On x86-32 clang is slightly positive and gcc loses 2.5% of speed.
Credit to OSS-Fuzz.
This diff reorganizes the `lib/Makefile` to extract various settings that a
user would normally invoke together (supposing that they were aware of them)
if they were trying to build the smallest `libzstd` possible. It collects
these settings under a master setting `ZSTD_LIB_MIN_SIZE`.
Also document this new option.
`-Wall` implies `-Wformat-zero-length`, which will cause compilation to fail
under `-Werror` when an empty string is passed as the format string to a
`printf`-family function. This commit moves us back to prefixing the provided
format string, which successfully avoids that warning.
However, this removes the failure mode where that `RAWLOG` invocation would
fail to compile when no format string was provided at all (which was desirable
to avoid having code that would successfully compile normally but fail under
`-pedantic`, which *does* require that a non-zero number of args are provided).
So this commit also introduces a function which does nothing at all, but will
fail to compile if not provided with at least one argument, which is a string.
This successfully links the compilability of pedantic and non-pedantic builds.
Fixes:
Enable RLE blocks for superblock mode
Fix the limitation that the literals block must shrink. Instead, when we're within 200 bytes of the next header byte size, we will just use the next one up. That way we should (almost?) always have space for the table.
Remove the limitation that the first sub-block MUST have compressed literals and be compressed. Now one sub-block MUST be compressed (otherwise we fall back to raw block which is okay, since that is streamable). If no block has compressed literals that is okay, we will fix up the next Huffman table.
Handle the case where the last sub-block is uncompressed (maybe it is very small). Before it would skip superblock in this case, now we allow the last sub-block to be uncompressed. To do this we need to regenerate the correct repcodes.
Respect disableLiteralsCompression in superblock mode
Fix superblock mode to handle a block consisting of only compressed literals
Fix a off by 1 error in superblock mode that disabled it whenever there were last literals
Fix superblock mode with long literals/matches (> 0xFFFF)
Allow superblock mode to repeat Huffman tables
Respect ZSTD_minGain().
Tests:
Simple check for the condition in #2096.
When the simple_round_trip fuzzer enables superblock mode, it checks that the compressed size isn't expanded too much.
Remaining limitations:
O(targetCBlockSize^2) because we recompute statistics every sequence
Unable to split literals of length > targetCBlockSize into multiple sequences
Refuses to generate sub-blocks that don't shrink the compressed data, so we could end up with large sub-blocks. We should emit those sections as uncompressed blocks instead.
...
Fixes#2096