when src ptr is in very low memory area (< 64K),
the virtual reference to data in dictionary
might end up in a very high memory address.
Since it's not a "real" memory address,
just a virtual one, to calculate distance,
it doesn't matter : only distance matters.
The assert was to restrictive.
Fixed.
by enabling the fast decoder path.
Visual requires a different set of macro constants to detect x86 / x64.
On my laptop, decoding speed on x64 went up from 3.12 to 3.45 GB/s.
32-bit is less impressive, though still favorable,
with speed increasing from 2.55 to 2.60 GB/s.
So both cases are now enabled.
Suggested by Bartosz Taudul (@wolfpld).
Otherwise, the output from decoding LZ4-compressed input could be
platform dependent.
Also add a compile-time check to confirm the existing code's assumptions
that, if <stdint.h> isn't used, then sizeof(int) == 4.
Updates #792
PR#756 fixed the data corruption bug, but didn't clear `ip`. PR#760
fixed that off-by-one error, but missed the case where `ip == filledIp`,
which is harder for the fuzzers to find (it took 20 days not 1 day).
Verified this fixed the issue reported by OSS-Fuzz.
Credit to OSS-Fuzz.
When using an empty dictionary, we bail out of loading or attaching it in
ways that leave the working context in potentially slightly different states.
In particular, in some paths, we will cause the currentOffset to be non-zero,
while in others we would allow it to remain 0.
This difference in behavior is perfectly harmless, but in some situations, it
can produce slight differences in the compressed output. For sanity's sake,
we currently try to maintain a strict correspondence between the behavior of
the dict attachment and the dict loading paths. This patch restores them to
behaving identically.
This shouldn't have any negative side-effects, as far as I can tell. When
writing the dict attachment code, I tried to preserve zeroed currentOffsets
when possible, since they benchmarked as very slightly faster. However, the
case of attaching an empty dictionary is probably rare enought that it's
acceptable to minisculely degrade performance in that corner case.
When the match is very long and found quickly, we can do
matchLength * nbCompares iterations through the chain
swapping, which can really slow down compression.
It is important to continue to look backwards if the current pattern
reaches `lowPrefixPtr`. If the pattern detection doesn't go all the
way to the beginning of the pattern, or the end of the pattern it
slows down the search instead of speeding it up.
The slow unit in `round_trip_stream_fuzzer` used to take 12 seconds
to run with -O3, now it takes 0.2 seconds.
Credit to OSS-Fuzz
This diff fixes an issue in which we failed to clear the `dictCtx` in HC
compression. The `dictCtx` is not supposed to be used when an `extDict` is
present: matches found in the `dictCtx` do not account for the presence of an
`extDict` segment, and their offsets are therefore miscalculated when one is
present. This can lead to data corruption.
This diff clears the `dictCtx` whenever setting an `extDict`.
This issue was uncovered by @terrelln's fuzzing work.
It's now possible to select a custom LZ4_DISTANCE_MAX at compile time,
provided it's <= 65535.
However, in some cases (when compressing in byU16 mode),
the new distance wasn't respected,
as it used to implied that it was necessarily within range.
Added a distance check for this case.
Also : added a new TravisCI test which ensures that
custom LZ4_DISTANCE_MAX compiles correctly
and compresses correctly (relying on `assert()` to find outsized offsets).
When I did FAST_DEC_LOOP performance test, I found the
offset check is much more than v1.8.3
You will see the condition check difference via lzbench with dickens test case.
v1.8.3 34959
v.1.9.x 1055885
After investigate the code, we could see the difference.
v.1.8.3 SKIP the condition check if
if condition is true in:
https://github.com/lz4/lz4/blob/v1.8.3/lib/lz4.c#L1463
AND below condition is true
https://github.com/lz4/lz4/blob/v1.8.3/lib/lz4.c#L1478\
The offset check should be invoked.
v1.9.3
The offset check code will be invoked in every loop which lead to downgrade.
So the fix would be move this check to specific condition
to avoid useless condition check.
After this change, the call number is same as v1.8.3