Commit Graph

2470 Commits

Author SHA1 Message Date
W. Felix Handte
4c58006719 Only Bump Offset When Attaching Non-Null Dictionary
We do want to bump, even if the dictionary is empty, but we **don't** want to
bump if the dictionary is null.
2019-08-06 19:08:41 -04:00
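The rule stated in this commit message can be sketched as below. This is a minimal illustration, not the library's code: the `sketch_ctx` type, the `sketch_attach_dict` function, and the 64 KB bump value are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BUMP (64u * 1024u)  /* assumed bump value, for illustration only */

/* Hypothetical, simplified context; not the real LZ4 stream state. */
typedef struct sketch_ctx {
    unsigned currentOffset;
    size_t dictSize;
    const struct sketch_ctx* dictCtx;
} sketch_ctx;

/* Attach `dict` to `working`.
 * - NULL dictionary: leave currentOffset untouched.
 * - Non-NULL dictionary, even an empty one: bump a zero currentOffset.
 * - An empty dictionary is then treated as "no dictionary". */
static void sketch_attach_dict(sketch_ctx* working, const sketch_ctx* dict)
{
    assert(working != NULL);
    if (dict != NULL) {
        if (working->currentOffset == 0) working->currentOffset = SKETCH_BUMP;
        if (dict->dictSize == 0) dict = NULL;
    }
    working->dictCtx = dict;
}
```

With this shape, attaching an empty dictionary and attaching a non-empty one leave `currentOffset` in the same state, while attaching NULL changes nothing.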
W. Felix Handte
4f49d744e8 Add Attach Dict Debug Log 2019-08-06 18:54:03 -04:00
W. Felix Handte
918269a4e3 Make Attaching an Empty Dict Behave the Same as Using it Directly
When using an empty dictionary, we bail out of loading or attaching it in
ways that leave the working context in potentially slightly different states.
In particular, in some paths, we will cause the currentOffset to be non-zero,
while in others we would allow it to remain 0.

This difference in behavior is perfectly harmless, but in some situations, it
can produce slight differences in the compressed output. For sanity's sake,
we currently try to maintain a strict correspondence between the behavior of
the dict attachment and the dict loading paths. This patch restores them to
behaving identically.

This shouldn't have any negative side-effects, as far as I can tell. When
writing the dict attachment code, I tried to preserve zeroed currentOffsets
when possible, since they benchmarked as very slightly faster. However, the
case of attaching an empty dictionary is probably rare enough that it's
acceptable to minusculely degrade performance in that corner case.
2019-08-06 18:50:33 -04:00
Yann Collet
b5b9760c80
Merge pull request #772 from lz4/offset0
silence msan warning when offset==0
2019-08-06 19:17:16 +02:00
Yann Collet
e18fbd51c1 silence msan warning when offset==0 2019-08-06 15:35:49 +02:00
Yann Collet
0726bddabd
Merge pull request #771 from terrelln/rep-ext-fix
[lz4hc] Further improve pattern detection and chain swapping
2019-08-02 01:27:53 +02:00
Nick Terrell
064adb2e8d [lz4hc] Chain swap with acceleration 2019-07-31 10:17:26 -07:00
Nick Terrell
38c3945de3 [lz4hc] Only allow chain swapping forwards
When the match is very long and found quickly, we can do
matchLength * nbCompares iterations through the chain
swapping, which can really slow down compression.
2019-07-31 10:17:26 -07:00
Nick Terrell
be1738aa46 [lz4hc] Fix pattern detection end of dictionary
The pattern detection in extDict mode could put `matchIndex`
within the last 3 bytes of the dictionary. This would cause
a read out of bounds.
2019-07-31 10:17:21 -07:00
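The class of bug described in this commit message (a multi-byte read that starts within the last 3 bytes of a buffer) can be illustrated with a bounds check like the one below. This is a sketch under assumptions: the 4-byte read width mirrors LZ4's minimum match length, and the helper names `can_read_minmatch` and `read32_checked` are hypothetical, not functions from lz4hc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MINMATCH 4  /* assumed read width: 4 bytes */

/* Hypothetical helper: returns 1 if a MINMATCH-byte read starting at
 * `index` stays inside a buffer of `size` bytes, 0 otherwise. */
static int can_read_minmatch(size_t index, size_t size)
{
    return size >= MINMATCH && index <= size - MINMATCH;
}

/* Hypothetical helper: performs the 4-byte read only when it is in bounds.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
static int read32_checked(const uint8_t* buf, size_t size, size_t index, uint32_t* out)
{
    if (!can_read_minmatch(index, size)) return 0;
    memcpy(out, buf + index, sizeof(*out));
    return 1;
}
```

For a 16-byte dictionary, index 12 is the last safe starting position; indices 13, 14, and 15 are "within the last 3 bytes" and are rejected.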
Nick Terrell
58ea585878 [lz4hc] Fix minor pessimization in extDict pattern matching
We should be comparing `matchPtr` not `ip`. This bug just means
that this branch was not taken, so we might miss some of the
forward length.
2019-07-31 10:16:25 -07:00
Nick Terrell
7e97bf377d [lz4hc] Improve pattern detection in ext dict
It is important to continue to look backwards if the current pattern
reaches `lowPrefixPtr`. If the pattern detection doesn't go all the
way to the beginning of the pattern, or the end of the pattern, it
slows down the search instead of speeding it up.

The slow unit in `round_trip_stream_fuzzer` used to take 12 seconds
to run with -O3, now it takes 0.2 seconds.

Credit to OSS-Fuzz
2019-07-31 10:16:21 -07:00
Yann Collet
ce9176a68d
Merge pull request #768 from terrelln/rep-ext
[LZ4HC] Speed up pattern compression with external dictionary
2019-07-24 13:47:19 -07:00
Nick Terrell
4c1d4c437d [LZ4HC] Speed up pattern compression with external dictionary
Fixes #761.
2019-07-24 10:59:20 -07:00
Yann Collet
805947ffcb
Merge pull request #766 from Low-power/cli-option---best
Add option '--best' to lz4(1)
2019-07-23 01:00:53 -07:00
WHR
eee8cc79e7 lz4cli: add option '--best' as an alias of '-12' 2019-07-23 13:37:11 +08:00
Yann Collet
fb8a159436
Merge pull request #763 from terrelln/unused
[lz4frame] Fix unused variable warnings in fuzzing mode
2019-07-19 16:54:01 -07:00
Yann Collet
7a516411d4
Merge pull request #760 from terrelln/destSize
[LZ4_compress_destSize] Fix off-by-one error in fix
2019-07-19 15:22:51 -07:00
Nick Terrell
87e52f7d5d [lz4frame] Fix unused variable warnings in fuzzing mode 2019-07-19 14:44:06 -07:00
Yann Collet
ee23c273e2
Merge pull request #758 from dooxe/develop
Added `BUNDLE DESTINATION`
2019-07-19 09:11:12 -07:00
Yann Collet
316f2b6f4d
Merge pull request #762 from terrelln/frame-fuzz
[fuzz] Add LZ4 frame fuzzers
2019-07-18 21:53:33 -07:00
Nick Terrell
d28159c025 [fuzz] Add LZ4 frame fuzzers
* Round trip fuzzer
* Compress fuzzer
* Decompress fuzzer
2019-07-18 18:54:59 -07:00
Nick Terrell
b487660309 [lz4frame] Skip magic and checksums in fuzzing mode
When `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is defined we skip
magic and checksum checks. This makes it easier to fuzz decompression.
2019-07-18 18:45:32 -07:00
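The idea of skipping validation in fuzzing builds can be sketched with conditional compilation, as below. Only the macro `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` comes from the commit message; the function `sketch_magic_is_acceptable` and the constant name `SKETCH_FRAME_MAGIC` are hypothetical and do not reflect how lz4frame structures its checks.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FRAME_MAGIC 0x184D2204U  /* LZ4 frame format magic number */

/* Hypothetical check: in a normal build the magic number must match;
 * in a fuzzing build any value is accepted so that random inputs
 * reach the decoder instead of being rejected up front. */
static int sketch_magic_is_acceptable(uint32_t magic)
{
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    (void)magic;  /* unused in fuzzing mode */
    return 1;
#else
    return magic == SKETCH_FRAME_MAGIC;
#endif
}
```

Note the `(void)magic;` cast: without it, the parameter would be unused in fuzzing mode, which is the kind of warning a later commit in this log addresses.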
Nick Terrell
1f236e0790 Fix LZ4_attach_dictionary with empty dictionary 2019-07-18 12:29:15 -07:00
Nick Terrell
675ef9a9fc [fuzz] Add HC fuzzers for round trip, compress, and streaming 2019-07-18 12:29:15 -07:00
Nick Terrell
399a80d48e [fuzzer] Update scripts for new fuzzers 2019-07-18 12:29:15 -07:00
Nick Terrell
9b258abd93 [fuzz] Add a streaming round trip fuzzer 2019-07-18 12:29:15 -07:00
Nick Terrell
7c32101c65 [LZ4_compress_destSize] Fix off-by-one error in fix
The next match is looking at the current ip, not the next ip,
so it needs to be cleared as well.

Credit to OSS-Fuzz
2019-07-18 12:20:29 -07:00
W. Felix Handte
40943ba0c9 Unconditionally Clear dictCtx 2019-07-18 13:35:12 -04:00
W. Felix Handte
369fb3900c Fix Data Corruption Bug when Streaming with an Attached Dict in HC Mode
This diff fixes an issue in which we failed to clear the `dictCtx` in HC
compression. The `dictCtx` is not supposed to be used when an `extDict` is
present: matches found in the `dictCtx` do not account for the presence of an
`extDict` segment, and their offsets are therefore miscalculated when one is
present. This can lead to data corruption.

This diff clears the `dictCtx` whenever setting an `extDict`.

This issue was uncovered by @terrelln's fuzzing work.
2019-07-18 12:48:41 -04:00
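The fix described in this commit message amounts to an invariant: a context never holds both an `extDict` segment and a `dictCtx`. A minimal sketch of enforcing it, assuming a hypothetical `sketch_hc_ctx` type and `sketch_set_ext_dict` function that are much simpler than the real HC state:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified HC context; field names follow the commit
 * message, but the layout is an assumption for illustration. */
typedef struct sketch_hc_ctx {
    const unsigned char* extDict;
    size_t extDictSize;
    const struct sketch_hc_ctx* dictCtx;
} sketch_hc_ctx;

/* Install an external dictionary segment and clear any attached dictCtx,
 * since offsets of matches found through dictCtx would not account for
 * the extDict segment. */
static void sketch_set_ext_dict(sketch_hc_ctx* ctx,
                                const unsigned char* seg, size_t segSize)
{
    assert(ctx != NULL);
    ctx->extDict = seg;
    ctx->extDictSize = segSize;
    ctx->dictCtx = NULL;  /* cleared on every call */
}
```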
dooxe
99d925f997 Added BUNDLE DESTINATION in CMakeLists.txt so that it works with newer versions of cmake 2019-07-18 11:25:43 +02:00
Yann Collet
19b099986a
Merge pull request #756 from terrelln/destSize
[LZ4_compress_destSize + multi-blocks streaming] Fix rare data corruption bug
2019-07-17 13:25:41 -07:00
Nick Terrell
13a2d9e34f [LZ4_compress_destSize] Fix overflow condition 2019-07-17 11:50:47 -07:00
Nick Terrell
6bc6f836a1 [LZ4_compress_destSize] Fix rare data corruption bug 2019-07-17 11:38:38 -07:00
Nick Terrell
690009e2c2 [LZ4_compress_destSize] Allow 2 more bytes of match length 2019-07-17 11:07:24 -07:00
Yann Collet
7654a5a6d2
Merge pull request #752 from terrelln/fuzzers
[ossfuzz] Improve the fuzzers
2019-07-16 11:18:09 -07:00
Yann Collet
81a14ccccb
Merge pull request #755 from lz4/custom_distance
ensure conformance with custom LZ4_DISTANCE_MAX
2019-07-15 16:38:28 -07:00
Nick Terrell
3c40db8d25 [ossfuzz] Improve the fuzzers
* Run more decompression variants
* Round trip the compression fuzzer and do partial decompression as well
* Add a compression fuzzer that compresses into a smaller output buffer
  and test the destSize variant

These fuzzers caught 2 bugs that were fixed in the previous commit.
* Input buffer over-read in partial decompress
* Partial decompress fails if output size is 0
2019-07-15 12:22:04 -07:00
Nick Terrell
725cb0aafd [lz4] Fix bugs in partial decoding
* Partial decoding could read a few bytes beyond the end of the input
* Partial decoding returned an error with an empty output buffer
2019-07-15 12:21:59 -07:00
Yann Collet
6654c2cd3b ensure conformance with custom LZ4_DISTANCE_MAX
It's now possible to select a custom LZ4_DISTANCE_MAX at compile time,
provided it's <= 65535.

However, in some cases (when compressing in byU16 mode),
the new distance wasn't respected,
as it used to be implied that it was necessarily within range.

Added a distance check for this case.
Also : added a new TravisCI test which ensures that
custom LZ4_DISTANCE_MAX compiles correctly
and compresses correctly (relying on `assert()` to find outsized offsets).
2019-07-15 12:11:34 -07:00
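The added distance check can be pictured as below. This is a sketch, not the patch itself: `sketch_offset_in_range` is a hypothetical helper, and `LZ4_DISTANCE_MAX` defaults here to 65535, the upper bound given in the commit message, unless it is overridden at compile time.

```c
#include <assert.h>
#include <stddef.h>

#ifndef LZ4_DISTANCE_MAX
#  define LZ4_DISTANCE_MAX 65535  /* may be lowered at compile time */
#endif

/* Hypothetical helper: returns 1 if a match at `matchPos` can be referenced
 * from `curPos`, i.e. it lies strictly behind the current position and no
 * farther back than LZ4_DISTANCE_MAX bytes. */
static int sketch_offset_in_range(size_t matchPos, size_t curPos)
{
    if (matchPos >= curPos) return 0;
    return (curPos - matchPos) <= (size_t)LZ4_DISTANCE_MAX;
}
```

Building with, for example, `-DLZ4_DISTANCE_MAX=1024` would make the same check reject any match more than 1024 bytes back, even in a mode where all positions fit in 16 bits.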
Yann Collet
a23541463d
Merge pull request #753 from Hitatm/fix_LZ4_DISTANCE_MAX
bugfix: correctly control the offset < LZ4_DISTANCE_MAX,when change t…
2019-07-15 09:08:11 -07:00
Hitatm
8ac954aa71 bugfix: correctly control the offset < LZ4_DISTANCE_MAX,when change the value of LZ4_DISTANCE_MAX, 2019-07-15 22:53:46 +08:00
Yann Collet
f1e8e806e0 keep the "lorem ipsum" topic of the example string
but make it compressible
2019-07-11 17:29:16 -07:00
Yann Collet
23bd36918e
Merge pull request #751 from hamidzr/simple-buffer-example-input
simple buffer example minor input update. fixes #750
2019-07-11 17:26:15 -07:00
Hamid Zare
771a7192d6 print the compression ratio 2019-07-11 14:39:29 -07:00
Hamid Zare
658ab8fca1 changed the input text to something more compression friendly 2019-07-11 14:35:51 -07:00
Yann Collet
eb6b599a50
Merge pull request #749 from sylvestre/patch-1
Remove an useless declaration
2019-07-04 13:03:08 -07:00
Sylvestre Ledru
12e5841e76
Remove an useless declaration 2019-07-04 18:13:36 +02:00
Yann Collet
68d045e0b2
Merge pull request #746 from lz4/circleci
CircleCI : reduced test duration
2019-07-03 16:16:52 -07:00
Yann Collet
3d68e32b73
Merge pull request #743 from lz4/fuzzasan_fixed
updated frametest
2019-07-03 16:16:19 -07:00
Yann Collet
fb52a10ced
Merge pull request #748 from amchoukir/amchoukir-doublebuffer-doc
Update blockStreaming_doubleBuffer.md
2019-07-03 16:13:44 -07:00