We've already checked that we are more than FASTLOOP_SAFE_DISTANCE
away from the end, so this branch can never be true, we will have
already jumped to the second decode loop.
Use LZ4_wildCopy16 for variable-length literals. For literal counts that
fit in the flag byte, copy directly. We can also omit oend checks for
roughly the same reason as the previous shortcut: We check once that both
match length and literal length fit in FASTLOOP_SAFE_DISTANCE, including
wildcopy distance.
Add an LZ4_wildCopy16, that will wildcopy, potentially smashing up
to 16 bytes, and use it for match copy. On x64, this avoids many
blocked loads due to store forwarding, similar to issue #411.
Copy the main loop, and change checks such that op is always less
than oend-SAFE_DISTANCE. Currently these are added for the literal
copy length check, and for the match copy length check.
Otherwise the first loop is exactly the same as the second. Follow on
diffs will optimize the first copy loop based on this new requirement.
I also tried instead making a separate inlineable function for the copy
loop (similar to existing partialDecode flags, etc), but I think the
changes might be significant enough to warrent doubling the code, instead
pulling out common functionality to separate functions.
This is the basic transformation that will allow several following optimisations.
Every 0xff byte in the compressed block corresponds to a length of 255 (not 256) in the input data. For long repeating sequences, using (length >> 8) may generate bad compressed blocks.
Dictionaries don't need to be > 4 bytes, they need to be >= 4 bytes. This test
was overly conservative.
Also removes the test in `LZ4_attach_dictionary()`.
Fixes a mismatch in behavior between loading into the context (via
`LZ4_loadDict()`) a very small (<= 4 bytes) non-contiguous dictionary, versus
attaching it with `LZ4_attach_dictionary()`.
Before this patch, this divergence could be reproduced by running
```
make -C tests fuzzer MOREFLAGS="-m32"
tests/fuzzer -v -s1239 -t3146
```
Making sure these two paths behave exactly identically is an easy way to test
the correctness of the attach path, so it's desirable that this remain an
unpolluted, high signal test.
following recommendations by @raggi.
The fix is slightly different, but achieves the same goal,
and is backed by a test tool which proves that it works
(generates the error before the patch, no longer after the patch).
which actively tries to make it write out of bound.
For this scenario to be possible,
it's necessary to set dstCapacity < LZ4F_compressBound()
When a compression operation fails,
the CCtx context is left in an undefined state,
therefore compression cannot resume.
As a consequence :
- round trip tests must be aborted, since there is nothing valid to decompress
- most users avoid this situation, by ensuring that dstCapacity >= LZ4F_compressBound()
For these reasons, this use case was poorly tested up to now.
when LZ4F_decompress() decodes an uncompressed block,
it provides an incorrect hint for next block
when frame checksum is enabled and block checksum is not.
Impact is low : the hint is just an hint,
the decoder works whatever the amount of input provided.
But the assumption that each call to LZ4F_decompress()
would generate just one complete block if input size hint was respected
was broken by this error.
so "funny" thing with cppcheck
is that no 2 versions give the same list of warnings.
On Mac, I'm using v1.81, which had all warnings fixed.
On Travis CI, it's v1.61, and it complains about a dozen more/different things.
On Linux, it's v1.72, and it finds a completely different list of a half dozen warnings.
Some of these seems to be bugs/limitations in cppcheck itself.
The TravisCI version v1.61 seems unable to understand %zu correctly, and seems to assume it means %u.