Commit Graph

851 Commits

Author SHA1 Message Date
Bartosz Taudul
7224f9bd5d Force inline small functions used by LZ4_compress_generic. 2020-01-17 00:37:47 +01:00
Yann Collet
d755f87f9f fixed lz4hc assert error
when src ptr is in very low memory area (< 64K),
the virtual reference to data in dictionary
might end up in a very high memory address.

Since it's not a "real" memory address,
just a virtual one, to calculate distance,
it doesn't matter : only distance matters.

The assert was too restrictive.
Fixed.
2019-12-03 14:49:22 -08:00
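The reasoning in this message (only the distance matters, not the virtual address) can be illustrated with unsigned wrap-around arithmetic. This is a minimal sketch under stated assumptions: `sketch_distance` and the 32-bit index type are hypothetical and are not taken from lz4hc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not LZ4 code: distance from a reference index to the
 * current index, computed modulo 2^32. */
static uint32_t sketch_distance(uint32_t current, uint32_t reference)
{
    /* Unsigned subtraction wraps around, so a "virtual" reference that
     * landed at a very high value still yields the true small distance. */
    return (uint32_t)(current - reference);
}

/* Small demonstration with a low current position and a wrapped reference. */
static void sketch_distance_demo(void)
{
    assert(sketch_distance(0x00000100u, 0xFFFFFF00u) == 0x200u);
    assert(sketch_distance(0x00010000u, 0x0000FF00u) == 0x100u);
}
```

In both cases the result is the small real distance, even though the first reference looks like an address near the top of the address space.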
Yann Collet
0f6cbd996f faster decoding speed with Visual
by enabling the fast decoder path.
Visual requires a different set of macro constants to detect x86 / x64.

On my laptop, decoding speed on x64 went up from 3.12 to 3.45 GB/s.
32-bit is less impressive, though still favorable,
with speed increasing from 2.55 to 2.60 GB/s.

So both cases are now enabled.

Suggested by Bartosz Taudul (@wolfpld).
2019-12-02 16:38:33 -08:00
Nigel Tao
c5a83c1a48 Have read_variable_length use fixed size types
Otherwise, the output from decoding LZ4-compressed input could be
platform dependent.

Also add a compile-time check to confirm the existing code's assumptions
that, if <stdint.h> isn't used, then sizeof(int) == 4.

Updates #792
2019-09-21 12:38:46 +10:00
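A compile-time check of this kind, together with a fixed-width accumulator, might look like the sketch below. The macro name, the `sketch_u32` typedef, and the simplified reader are assumptions made for illustration; they are not the code from this commit.

```c
#include <assert.h>
#include <stddef.h>

/* C89-style compile-time assertion (hypothetical macro name): the array
 * size becomes -1, a compile error, when the condition is false. */
#define SKETCH_STATIC_ASSERT(cond, name) \
    typedef char sketch_static_assert_##name[(cond) ? 1 : -1]

/* Stand-in for a 32-bit type on builds that do not use <stdint.h>. */
typedef unsigned int sketch_u32;
SKETCH_STATIC_ASSERT(sizeof(sketch_u32) == 4, u32_is_four_bytes);

/* Simplified length reader: sums bytes until one differs from 255.
 * The accumulator has a fixed width, so any overflow wraps the same way
 * on every platform. */
static sketch_u32 sketch_read_length(const unsigned char* p,
                                     const unsigned char* end,
                                     sketch_u32 initial)
{
    sketch_u32 length = initial;
    while (p < end) {
        unsigned char b = *p++;
        length += b;
        if (b != 255) break;
    }
    return length;
}
```

If `unsigned int` were not 4 bytes wide, the typedef would declare an array of negative size and the build would fail instead of silently producing platform-dependent output.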
Nick Terrell
d7cad81093 [LZ4_compress_destSize] Fix off-by-one error
PR#756 fixed the data corruption bug, but didn't clear `ip`. PR#760
fixed that off-by-one error, but missed the case where `ip == filledIp`,
which is harder for the fuzzers to find (it took 20 days not 1 day).

Verified this fixed the issue reported by OSS-Fuzz.

Credit to OSS-Fuzz.
2019-08-09 10:36:46 -07:00
W. Felix Handte
4c58006719 Only Bump Offset When Attaching Non-Null Dictionary
We do want to bump, even if the dictionary is empty, but we **don't** want to
bump if the dictionary is null.
2019-08-06 19:08:41 -04:00
W. Felix Handte
4f49d744e8 Add Attach Dict Debug Log 2019-08-06 18:54:03 -04:00
W. Felix Handte
918269a4e3 Make Attaching an Empty Dict Behave the Same as Using it Directly
When using an empty dictionary, we bail out of loading or attaching it in
ways that leave the working context in potentially slightly different states.
In particular, in some paths, we will cause the currentOffset to be non-zero,
while in others we would allow it to remain 0.

This difference in behavior is perfectly harmless, but in some situations, it
can produce slight differences in the compressed output. For sanity's sake,
we currently try to maintain a strict correspondence between the behavior of
the dict attachment and the dict loading paths. This patch restores them to
behaving identically.

This shouldn't have any negative side-effects, as far as I can tell. When
writing the dict attachment code, I tried to preserve zeroed currentOffsets
when possible, since they benchmarked as very slightly faster. However, the
case of attaching an empty dictionary is probably rare enough that it's
acceptable to very slightly degrade performance in that corner case.
2019-08-06 18:50:33 -04:00
Yann Collet
b5b9760c80
Merge pull request #772 from lz4/offset0
silence msan warning when offset==0
2019-08-06 19:17:16 +02:00
Yann Collet
e18fbd51c1 silence msan warning when offset==0 2019-08-06 15:35:49 +02:00
Nick Terrell
064adb2e8d [lz4hc] Chain swap with acceleration 2019-07-31 10:17:26 -07:00
Nick Terrell
38c3945de3 [lz4hc] Only allow chain swapping forwards
When the match is very long and found quickly, we can do
matchLength * nbCompares iterations through the chain
swapping, which can really slow down compression.
2019-07-31 10:17:26 -07:00
Nick Terrell
be1738aa46 [lz4hc] Fix pattern detection end of dictionary
The pattern detection in extDict mode could put `matchIndex`
within the last 3 bytes of the dictionary. This would cause
a read out of bounds.
2019-07-31 10:17:21 -07:00
Nick Terrell
58ea585878 [lz4hc] Fix minor pessimization in extDict pattern matching
We should be comparing `matchPtr` not `ip`. This bug just means
that this branch was not taken, so we might miss some of the
forward length.
2019-07-31 10:16:25 -07:00
Nick Terrell
7e97bf377d [lz4hc] Improve pattern detection in ext dict
It is important to continue to look backwards if the current pattern
reaches `lowPrefixPtr`. If the pattern detection doesn't go all the
way to the beginning of the pattern, or the end of the pattern it
slows down the search instead of speeding it up.

The slow unit in `round_trip_stream_fuzzer` used to take 12 seconds
to run with -O3, now it takes 0.2 seconds.

Credit to OSS-Fuzz
2019-07-31 10:16:21 -07:00
Nick Terrell
4c1d4c437d [LZ4HC] Speed up pattern compression with external dictionary
Fixes #761.
2019-07-24 10:59:20 -07:00
Yann Collet
fb8a159436
Merge pull request #763 from terrelln/unused
[lz4frame] Fix unused variable warnings in fuzzing mode
2019-07-19 16:54:01 -07:00
Yann Collet
7a516411d4
Merge pull request #760 from terrelln/destSize
[LZ4_compress_destSize] Fix off-by-one error in fix
2019-07-19 15:22:51 -07:00
Nick Terrell
87e52f7d5d [lz4frame] Fix unused variable warnings in fuzzing mode 2019-07-19 14:44:06 -07:00
Nick Terrell
b487660309 [lz4frame] Skip magic and checksums in fuzzing mode
When `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is defined we skip
magic and checksum checks. This makes it easier to fuzz decompression.
2019-07-18 18:45:32 -07:00
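The idea of skipping validation only in fuzzing builds can be sketched with conditional compilation. `sketch_magic_ok` is a hypothetical function, not the lz4frame implementation; the only values taken from outside this message are the macro name it mentions and the LZ4 frame magic number.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FRAME_MAGIC 0x184D2204U  /* LZ4 frame magic number */

/* Hypothetical validator, not the lz4frame implementation. */
static int sketch_magic_ok(uint32_t magic)
{
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    (void)magic;   /* fuzzing build: accept anything so inputs reach the decoder */
    return 1;
#else
    return magic == SKETCH_FRAME_MAGIC;   /* normal build: strict check */
#endif
}
```

With the check compiled out, a fuzzer no longer has to guess a valid magic number or checksum before its mutated input exercises the decompression logic.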
Nick Terrell
1f236e0790 Fix LZ4_attach_dictionary with empty dictionary 2019-07-18 12:29:15 -07:00
Nick Terrell
7c32101c65 [LZ4_compress_destSize] Fix off-by-one error in fix
The next match is looking at the current ip, not the next ip,
so it needs to be cleared as well.

Credit to OSS-Fuzz
2019-07-18 12:20:29 -07:00
W. Felix Handte
40943ba0c9 Unconditionally Clear dictCtx 2019-07-18 13:35:12 -04:00
W. Felix Handte
369fb3900c Fix Data Corruption Bug when Streaming with an Attached Dict in HC Mode
This diff fixes an issue in which we failed to clear the `dictCtx` in HC
compression. The `dictCtx` is not supposed to be used when an `extDict` is
present: matches found in the `dictCtx` do not account for the presence of an
`extDict` segment, and their offsets are therefore miscalculated when one is
present. This can lead to data corruption.

This diff clears the `dictCtx` whenever setting an `extDict`.

This issue was uncovered by @terrelln's fuzzing work.
2019-07-18 12:48:41 -04:00
Nick Terrell
13a2d9e34f [LZ4_compress_destSize] Fix overflow condition 2019-07-17 11:50:47 -07:00
Nick Terrell
6bc6f836a1 [LZ4_compress_destSize] Fix rare data corruption bug 2019-07-17 11:38:38 -07:00
Nick Terrell
690009e2c2 [LZ4_compress_destSize] Allow 2 more bytes of match length 2019-07-17 11:07:24 -07:00
Yann Collet
7654a5a6d2
Merge pull request #752 from terrelln/fuzzers
[ossfuzz] Improve the fuzzers
2019-07-16 11:18:09 -07:00
Nick Terrell
725cb0aafd [lz4] Fix bugs in partial decoding
* Partial decoding could read a few bytes beyond the end of the input
* Partial decoding returned an error with an empty output buffer
2019-07-15 12:21:59 -07:00
Yann Collet
6654c2cd3b ensure conformance with custom LZ4_DISTANCE_MAX
It's now possible to select a custom LZ4_DISTANCE_MAX at compile time,
provided it's <= 65535.

However, in some cases (when compressing in byU16 mode),
the new distance wasn't respected,
as it used to be implied that it was necessarily within range.

Added a distance check for this case.
Also : added a new TravisCI test which ensures that
custom LZ4_DISTANCE_MAX compiles correctly
and compresses correctly (relying on `assert()` to find outsized offsets).
2019-07-15 12:11:34 -07:00
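A distance check against a compile-time limit could be sketched as follows. `SKETCH_DISTANCE_MAX` and `sketch_match_in_range` are hypothetical names chosen for illustration; the only fact taken from the message is that the custom limit must not exceed 65535.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a build-time distance limit. */
#ifndef SKETCH_DISTANCE_MAX
#  define SKETCH_DISTANCE_MAX 65535
#endif

#if SKETCH_DISTANCE_MAX > 65535
#  error "SKETCH_DISTANCE_MAX must be <= 65535"
#endif

/* Returns 1 when a match at index `match` may be referenced from index
 * `current`, i.e. it lies strictly behind and within the allowed distance. */
static int sketch_match_in_range(size_t current, size_t match)
{
    if (match >= current) return 0;
    return (current - match) <= (size_t)SKETCH_DISTANCE_MAX;
}
```

A compressor that indexes positions with 16-bit values can represent any offset up to 65535, so a smaller custom limit has to be checked explicitly rather than assumed.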
Hitatm
8ac954aa71 bugfix: correctly enforce offset < LZ4_DISTANCE_MAX when the value of LZ4_DISTANCE_MAX is changed 2019-07-15 22:53:46 +08:00
Sylvestre Ledru
12e5841e76
Remove a useless declaration 2019-07-04 18:13:36 +02:00
Yann Collet
bb5c34a875 bumped version number to v1.9.2
to reduce the risk that future bug reports against the `dev` branch cite `v1.9.1` as the failing version.
2019-07-01 09:01:43 -07:00
Nick Terrell
e72d442300 Fix out-of-bounds read of up to 64 KB in the past 2019-06-28 14:58:35 -07:00
Yann Collet
1d759576b9 clarified again that the LZ4 decoder needs metadata
and that such metadata must be provided / sent / saved by the application.
2019-06-06 13:20:30 -07:00
Yann Collet
348e107d99 restored FORCE_INLINE 2019-06-04 14:04:49 -07:00
Yann Collet
280fc0856d
Merge pull request #717 from lz4/inplace
Added documentation and macro to support in-place compression and decompression
2019-05-31 12:59:38 -07:00
Yann Collet
5997e139f5 added more details for in-place documentation 2019-05-31 11:56:59 -07:00
Yann Collet
33cb8518ac decompress: changed final memcpy() into memmove()
for compatibility with in-place decompression scenarios.
2019-05-31 11:44:37 -07:00
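The reason `memmove()` matters for in-place scenarios is that source and destination may overlap, which `memcpy()` does not permit. A minimal sketch, assuming a hypothetical `sketch_copy_final` helper rather than the actual decoder code:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical final-copy helper: safe even when [src, src+n) and
 * [dst, dst+n) overlap, as can happen when input and output share a buffer. */
static void sketch_copy_final(unsigned char* dst, const unsigned char* src, size_t n)
{
    memmove(dst, src, n);   /* memcpy() would be undefined behaviour on overlap */
}

/* Demonstration: shift 6 bytes toward the front of the same buffer. */
static void sketch_copy_final_demo(void)
{
    unsigned char buf[8] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h' };
    sketch_copy_final(buf, buf + 2, 6);
    assert(memcmp(buf, "cdefghgh", 8) == 0);
}
```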
Chenxi Mao
64b5917736 FAST_DEC_LOOP: only do the offset check under a specific condition.
While performance-testing FAST_DEC_LOOP, I found that the
offset check runs far more often than in v1.8.3.

The difference in the number of checks is visible via lzbench with the dickens test case:
v1.8.3 34959
v1.9.x 1055885

Investigating the code shows the difference.
v1.8.3 skips the offset check if the condition is true in:
https://github.com/lz4/lz4/blob/v1.8.3/lib/lz4.c#L1463
and the condition below is also true:
https://github.com/lz4/lz4/blob/v1.8.3/lib/lz4.c#L1478
Otherwise, the offset check is invoked.

v1.9.x
The offset check is invoked on every loop iteration, which leads to a slowdown.
So the fix is to move this check under the specific condition
to avoid useless checks.

After this change, the number of checks is the same as in v1.8.3.
2019-05-31 08:36:13 +08:00
Yann Collet
676d46df27 updated LZ4_DECOMPRESS_INPLACE_MARGIN
to pass worst case scenario.
Now adds margin proportional to input size to counter local expansion.
2019-05-30 16:19:30 -07:00
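A margin that grows with the input size, plus a fixed floor, could be computed as in the sketch below. The function names and the specific constants (1/256 of the input plus 32 bytes) are illustrative assumptions, not values taken from this commit; the authoritative macro is defined in `lz4.h`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical margin: a part proportional to the compressed size (to absorb
 * local expansion) plus a small constant. Constants are assumptions. */
static size_t sketch_inplace_margin(size_t compressedSize)
{
    return (compressedSize >> 8) + 32;
}

/* Hypothetical buffer size for in-place decompression: room for the
 * decompressed data plus a margin, with the compressed input stored at the
 * end of the buffer. */
static size_t sketch_inplace_buffer_size(size_t decompressedSize)
{
    return decompressedSize + sketch_inplace_margin(decompressedSize);
}
```

Under these assumed constants, a 64 KB input gets a margin of 288 bytes, so the shared buffer would be 65824 bytes.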
Yann Collet
22adbb176a add more doc on in-place (de)compression 2019-05-30 09:45:21 -07:00
Yann Collet
76116495bf some more minor conversion warnings fixes 2019-05-29 13:14:52 -07:00
Yann Collet
444550defa ensure lz4.h can be included with or without LZ4_STATIC_LINKING_ONLY in any order
ensure correct propagation of LZ4_DISTANCE_MAX
2019-05-29 12:21:14 -07:00
Yann Collet
b17f578a91 added comments and macros for in-place (de)compression 2019-05-29 12:06:13 -07:00
Niko Dzhus
2be2fe43a8 fix temporary buffer use when input size hint is respected 2019-05-24 22:08:44 +03:00
Yann Collet
a7151324af
Merge pull request #708 from gabrielstedman/list
Add multiframe report to --list command
2019-05-16 15:56:42 -07:00
gstedman
98a86c8ef6 Add multiframe report to --list command 2019-05-15 21:13:19 +01:00
George Prekas
605d811e6c enable LZ4_FAST_DEC_LOOP build macro on aarch64/GCC by default 2019-05-07 08:36:06 -05:00
Brenden Eng
9e056bc032 Include block checksum in worst case scenario calculation of dstCapacity 2019-04-25 22:37:39 -04:00
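The accounting behind such a worst-case bound can be sketched per block. This assumes the LZ4 frame layout of a 4-byte block size field, the block data, and an optional 4-byte block checksum; `sketch_worst_case_block` is a hypothetical helper, not the library's own calculation.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SKETCH_BLOCK_HEADER_SIZE   = 4,  /* block size field */
    SKETCH_BLOCK_CHECKSUM_SIZE = 4   /* optional per-block checksum */
};

/* Worst case for one block: the data is stored uncompressed, preceded by its
 * header and, when enabled, followed by its checksum. */
static size_t sketch_worst_case_block(size_t blockSize, int blockChecksumEnabled)
{
    size_t bound = SKETCH_BLOCK_HEADER_SIZE + blockSize;
    if (blockChecksumEnabled) bound += SKETCH_BLOCK_CHECKSUM_SIZE;
    return bound;
}
```

Omitting the checksum term underestimates the required destination capacity by 4 bytes per block whenever block checksums are enabled.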