2306 Commits

Author SHA1 Message Date
Yann Collet
d755f87f9f fixed lz4hc assert error
when src ptr is in very low memory area (< 64K),
the virtual reference to data in dictionary
might end up in a very high memory address.

Since it's not a "real" memory address,
just a virtual one, to calculate distance,
it doesn't matter : only distance matters.

The assert was too restrictive.
Fixed.
2019-12-03 14:49:22 -08:00
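The reasoning in the commit above (a virtual reference may wrap to a very high address, yet the distance stays correct) can be illustrated with unsigned arithmetic. This is a minimal sketch, not the lz4hc code: `demo_virtual_ref` and `demo_distance` are hypothetical names, and addresses are modelled as `uintptr_t` values so that the wrap-around is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an lz4 function): address of a "virtual"
 * reference located `offset` bytes before `current`. When `current`
 * is small and `offset` is larger, the result wraps to a very high
 * value, because unsigned arithmetic is modular. */
uintptr_t demo_virtual_ref(uintptr_t current, uintptr_t offset)
{
    return current - offset;
}

/* Hypothetical helper: distance from `current` back to its virtual
 * reference. The wrap cancels out, so the distance equals `offset`
 * regardless of how high the virtual address ended up. */
uintptr_t demo_distance(uintptr_t current, uintptr_t offset)
{
    uintptr_t ref = demo_virtual_ref(current, offset);
    uintptr_t distance = current - ref;
    assert(distance == offset);  /* only the distance matters */
    return distance;
}
```

With `current = 0x1000` and `offset = 0x10000`, the virtual reference is numerically greater than `current`, so a check of the form "reference must be below current" would fire even though the distance is valid.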
Yann Collet
0f6cbd996f faster decoding speed with Visual
by enabling the fast decoder path.
Visual requires a different set of macro constants to detect x86 / x64.

On my laptop, decoding speed on x64 went up from 3.12 to 3.45 GB/s.
32-bit is less impressive, though still favorable,
with speed increasing from 2.55 to 2.60 GB/s.

So both cases are now enabled.

Suggested by Bartosz Taudul (@wolfpld).
2019-12-02 16:38:33 -08:00
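The commit above notes that Visual needs different macro constants to detect x86 / x64. A minimal sketch of such a detection follows. `_M_IX86`, `_M_X64` and `_M_AMD64` are the macros predefined by the Microsoft compiler, and `__i386__` / `__x86_64__` are their GCC and Clang counterparts; `DEMO_X86_BITS` and both functions are hypothetical names chosen for this illustration, not the identifiers used in lz4.

```c
#include <assert.h>

/* DEMO_X86_BITS is a hypothetical macro for this sketch only. */
#if defined(_M_X64) || defined(_M_AMD64) || defined(__x86_64__)
#  define DEMO_X86_BITS 64   /* x64: Visual or GCC/Clang spelling */
#elif defined(_M_IX86) || defined(__i386__)
#  define DEMO_X86_BITS 32   /* 32-bit x86 */
#else
#  define DEMO_X86_BITS 0    /* not an x86 target */
#endif

/* Reports which x86 flavour was detected at compile time (0 if none). */
int demo_x86_bits(void)
{
    /* Sanity check of the macro selection above. */
    assert(DEMO_X86_BITS == 0 || DEMO_X86_BITS == 32 || DEMO_X86_BITS == 64);
    return DEMO_X86_BITS;
}

/* Assumption for this sketch: the fast path is enabled for both the
 * 32-bit and 64-bit x86 cases, as the commit message describes. */
int demo_fast_path_enabled(void)
{
    return DEMO_X86_BITS != 0;
}
```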
Yann Collet
bed083b3c7
Merge pull request #815 from andrewthad/patch-1
Fix typos in streaming_api_basics.md
2019-11-30 08:16:58 -08:00
Andrew Martin
059682fb01
Fix typos in streaming_api_basics.md 2019-11-30 06:58:54 -05:00
Yann Collet
6e0d2be73a
Merge pull request #808 from rkoradi/benchmarkWithDictionary
Make benchmark compatible with dictionary compression
2019-11-06 09:36:56 -08:00
Reto Koradi
cc91777c98 Make benchmark compatible with dictionary compression
Support the -D command line option for running benchmarks. The
benchmark code was slightly restructured to factor out the calls
that need to be different for each benchmark scenario. Since there
are now 4 scenarios (all combinations of fast/HC and with/without
dictionary), the logic was getting somewhat convoluted otherwise.

This was done by extending the compressionParameters struct that
previously contained just a single function pointer. It now
contains 4 function pointers for init/reset/compress/cleanup,
with the related state. The functions get a pointer to the
structure as their first argument (inspired by C++), so that they
can access the state values in the struct.
2019-11-05 23:38:00 -08:00
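The struct layout described in the commit above (four function pointers for init/reset/compress/cleanup, each receiving a pointer to the struct as its first argument) might look roughly like the sketch below. All names here (`demo_params`, the `store_*` functions, the `ready` and `blocks` fields) are hypothetical, and the "compressor" is a plain copy used as a stand-in so the example stays self-contained; it is not the benchmark code from the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_params demo_params;

/* One scenario = four callbacks plus the state they share. */
struct demo_params {
    void (*init)(demo_params* self);
    void (*reset)(demo_params* self);
    int  (*compress)(demo_params* self, const char* src, char* dst,
                     int srcSize, int dstCapacity);
    void (*cleanup)(demo_params* self);
    int ready;    /* state: set by init, cleared by cleanup */
    int blocks;   /* state: blocks processed since the last reset */
};

static void store_init(demo_params* p)    { p->ready = 1; p->blocks = 0; }
static void store_reset(demo_params* p)   { p->blocks = 0; }
static void store_cleanup(demo_params* p) { p->ready = 0; }

/* Stand-in for a real compressor: copies the input unchanged. */
static int store_compress(demo_params* p, const char* src, char* dst,
                          int srcSize, int dstCapacity)
{
    assert(p->ready);
    if (srcSize < 0 || srcSize > dstCapacity) return 0;  /* does not fit */
    memcpy(dst, src, (size_t)srcSize);
    p->blocks++;
    return srcSize;
}

/* Builds the parameter set for the hypothetical "store" scenario. */
demo_params demo_make_store_params(void)
{
    demo_params p = { store_init, store_reset, store_compress,
                      store_cleanup, 0, 0 };
    return p;
}

/* The driver only talks to the function pointers, so adding a new
 * scenario means filling in another struct, not editing this code. */
int demo_run(demo_params* p, const char* src, char* dst,
             int srcSize, int dstCapacity)
{
    int written;
    p->init(p);
    p->reset(p);
    written = p->compress(p, src, dst, srcSize, dstCapacity);
    p->cleanup(p);
    return written;
}
```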
Yann Collet
e8baeca51e
Merge pull request #798 from bimbashrestha/adding_cirrus_test
Adding cirrus test for FreeBSD
2019-10-07 09:51:15 -07:00
Bimba Shrestha
dcdb9c73c0 Adding unamestr var 2019-10-07 08:12:42 -07:00
Bimba Shrestha
77e6ff1160 Adding condition for FreeBSD and using gmake 2019-10-04 10:12:22 -07:00
Bimba Shrestha
c0acce96c1 Using instead of gmake (to address the travis failure) 2019-10-04 09:29:58 -07:00
Bimba Shrestha
b2c1fa8c59 Using gmake instead of make 2019-10-04 08:40:46 -07:00
Bimba Shrestha
60d6533b9e Adding cirrus config file for freebsd-12-0 2019-10-04 08:36:08 -07:00
Yann Collet
689dff5b44
Merge pull request #796 from jcaesar/dev
meson: move one layer deeper to allow easy construction of a wrap file
2019-09-26 13:01:55 -07:00
Julius Michaelis
b6623e710d meson: move one layer deeper to allow easy construction of a wrap file 2019-09-26 17:29:04 +09:00
Yann Collet
e9d8e15263
Merge pull request #794 from bimbashrestha/compress_frame_fuzzer_heap_overflow
Using size instead of LZ4_compressBound(size) <- causes heap overflow
2019-09-23 12:50:05 -07:00
Bimba Shrestha
192161e97e Using size instead of LZ4_compressBound(size) <- causes heap overflow 2019-09-23 11:54:56 -07:00
Yann Collet
d5ceafd411
Merge pull request #793 from nigeltao/dev
Have read_variable_length use fixed size types
2019-09-21 09:55:59 -07:00
Nigel Tao
c5a83c1a48 Have read_variable_length use fixed size types
Otherwise, the output from decoding LZ4-compressed input could be
platform dependent.

Also add a compile-time check to confirm the existing code's assumptions
that, if <stdint.h> isn't used, then sizeof(int) == 4.

Updates #792
2019-09-21 12:38:46 +10:00
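The two ideas in the commit above, a fixed-size accumulator and a compile-time size check, can be sketched as follows. This is an illustration under assumptions: `demo_read_length` is a hypothetical reader that sums continuation bytes until it sees a byte below 255, in the style of LZ4 length encoding, and is not the signature of the real `read_variable_length`. The negative-array-size typedef is a common pre-C11 idiom for a static assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time check: the array size is -1 (a compile error) if the
 * assumption does not hold. */
typedef char demo_check_u32_is_4_bytes[(sizeof(uint32_t) == 4) ? 1 : -1];

/* Hypothetical reader: adds bytes to `initial` while each byte is 255,
 * stopping after the first byte below 255 or at the end of the input.
 * Because the accumulator is uint32_t, any overflow wraps modulo 2^32
 * identically on every platform, which a plain `unsigned long` or
 * `size_t` accumulator would not guarantee. */
uint32_t demo_read_length(const uint8_t* p, const uint8_t* end,
                          uint32_t initial)
{
    uint32_t length = initial;
    uint8_t b;
    assert(p <= end);
    do {
        if (p == end) break;   /* truncated input: stop reading */
        b = *p++;
        length += b;
    } while (b == 255);
    return length;
}
```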
Yann Collet
804d47cf78
Merge pull request #790 from bimbashrestha/seperating_seed_generation_and_use_in_fuzzers
Separating the seed generation and use in FUZZ_dataProducer api
2019-09-18 10:21:43 -07:00
Bimba Shrestha
8edc5879d0 Retrieving 32 bits from the end for fuzzer 2019-09-13 18:08:58 -07:00
Bimba Shrestha
9cb73d69c4 Addressing naming nits and moving size modification up in all fuzzers 2019-09-13 16:04:48 -07:00
Bimba Shrestha
208694297a Separating the seed generation and use 2019-09-13 14:07:52 -07:00
Yann Collet
9b2b96edc4
Merge pull request #770 from neheb/dev
util.h: Remove deprecated utime for non-Windows
2019-09-10 12:56:53 -07:00
Rosen Penev
a55095ddd6
util.h: Remove deprecated utime for non-Windows
utime was deprecated in POSIX 2008.
2019-09-10 11:29:05 -07:00
Yann Collet
64e3134b2a
Merge pull request #785 from bimbashrestha/transfer_remaining_fuzzers_to_consume_from_end_of_input
Making fuzzers use dataProducer api instead of random seed for decisions
2019-09-09 12:03:26 -07:00
Bimba Shrestha
7d153a704d Making fuzzers use dataProducer api instead of random seed for decisions 2019-08-30 10:27:42 -07:00
Yann Collet
28964f4bea fixed #778
fixed assert() when divisor == 0
2019-08-21 13:44:24 +02:00
Yann Collet
09d9f27b0a
Merge pull request #779 from bimbashrestha/dev
Adding fuzz data producer for uint32 and using in decompress_fuzzer
2019-08-20 16:48:44 -07:00
bimbashrestha
dc17d39c2f Adding comments, fixing nit, and hiding the struct in data producer api 2019-08-16 17:14:47 -07:00
bimbashrestha
f839e9fe8a Separating fuzz data producer api impl and header, using data producer on the easy fuzzers 2019-08-16 16:43:28 -07:00
bimbashrestha
a9ac056456 Created a data producer API and used in decompress_fuzzer 2019-08-16 14:19:06 -07:00
bimbashrestha
fad8c97532 Adding fuzz data producer for uint32 and using in decompress_fuzzer
Summary: Consuming bytes from the end of data instead of from the front to prevent "all-in-one" decisions.
2019-08-16 10:50:46 -07:00
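Consuming decision bytes from the end of the fuzzer input, as the summary above describes, could be sketched like this. The names `demo_producer`, `demo_take_u32` and `demo_take_range` are hypothetical and do not reproduce the FUZZ_dataProducer API; the little-endian byte order and the zero result on exhausted input are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical producer: a view of the fuzzer input whose tail is
 * reserved for decisions, leaving the front as payload. */
typedef struct {
    const uint8_t* data;
    size_t size;          /* bytes not yet consumed */
} demo_producer;

/* Takes up to 4 bytes from the END of the remaining input and reads
 * them as a little-endian value. Returns 0 once the input is empty. */
uint32_t demo_take_u32(demo_producer* p)
{
    size_t n = p->size < 4 ? p->size : 4;
    const uint8_t* tail = p->data + (p->size - n);
    uint32_t v = 0;
    size_t i;
    for (i = 0; i < n; i++) v |= (uint32_t)tail[i] << (8 * i);
    p->size -= n;         /* shrink the view from the back */
    return v;
}

/* Maps the next decision onto the inclusive range [min, max]. */
uint32_t demo_take_range(demo_producer* p, uint32_t min, uint32_t max)
{
    uint32_t range;
    assert(min <= max);
    range = max - min;
    if (range == UINT32_MAX) return demo_take_u32(p);
    return min + demo_take_u32(p) % (range + 1);
}
```

After each call, `size` tells the fuzzer how many front bytes are still available as payload, so decisions and payload never overlap.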
Yann Collet
fdf2ef5809 fixed test error
could trigger a modulo by zero (% 0) in exceptional circumstances
due to a wrong buffer size parameter.
2019-08-15 13:59:59 +02:00
Yann Collet
dfad84ca3e
Merge pull request #777 from terrelln/off-by-one
[LZ4_compress_destSize] Fix off-by-one error
2019-08-10 02:08:03 +02:00
Nick Terrell
d7cad81093 [LZ4_compress_destSize] Fix off-by-one error
PR#756 fixed the data corruption bug, but didn't clear `ip`. PR#760
fixed that off-by-one error, but missed the case where `ip == filledIp`,
which is harder for the fuzzers to find (it took 20 days not 1 day).

Verified this fixed the issue reported by OSS-Fuzz.

Credit to OSS-Fuzz.
2019-08-09 10:36:46 -07:00
Yann Collet
1bcde6414a
Merge pull request #773 from felixhandte/attach-empty-dict-behavior-conformance
Make Attaching an Empty Dict Behave the Same as Using it Directly
2019-08-08 01:48:53 +02:00
W. Felix Handte
4c58006719 Only Bump Offset When Attaching Non-Null Dictionary
We do want to bump, even if the dictionary is empty, but we **don't** want to
bump if the dictionary is null.
2019-08-06 19:08:41 -04:00
W. Felix Handte
4f49d744e8 Add Attach Dict Debug Log 2019-08-06 18:54:03 -04:00
W. Felix Handte
918269a4e3 Make Attaching an Empty Dict Behave the Same as Using it Directly
When using an empty dictionary, we bail out of loading or attaching it in
ways that leave the working context in potentially slightly different states.
In particular, in some paths, we will cause the currentOffset to be non-zero,
while in others we would allow it to remain 0.

This difference in behavior is perfectly harmless, but in some situations, it
can produce slight differences in the compressed output. For sanity's sake,
we currently try to maintain a strict correspondence between the behavior of
the dict attachment and the dict loading paths. This patch restores them to
behaving identically.

This shouldn't have any negative side-effects, as far as I can tell. When
writing the dict attachment code, I tried to preserve zeroed currentOffsets
when possible, since they benchmarked as very slightly faster. However, the
case of attaching an empty dictionary is probably rare enough that it's
acceptable to minusculely degrade performance in that corner case.
2019-08-06 18:50:33 -04:00
Yann Collet
b5b9760c80
Merge pull request #772 from lz4/offset0
silence msan warning when offset==0
2019-08-06 19:17:16 +02:00
Yann Collet
e18fbd51c1 silence msan warning when offset==0 2019-08-06 15:35:49 +02:00
Yann Collet
0726bddabd
Merge pull request #771 from terrelln/rep-ext-fix
[lz4hc] Further improve pattern detection and chain swapping
2019-08-02 01:27:53 +02:00
Nick Terrell
064adb2e8d [lz4hc] Chain swap with acceleration 2019-07-31 10:17:26 -07:00
Nick Terrell
38c3945de3 [lz4hc] Only allow chain swapping forwards
When the match is very long and found quickly, we can do
matchLength * nbCompares iterations through the chain
swapping, which can really slow down compression.
2019-07-31 10:17:26 -07:00
Nick Terrell
be1738aa46 [lz4hc] Fix pattern detection end of dictionary
The pattern detection in extDict mode could put `matchIndex`
within the last 3 bytes of the dictionary. This would cause
a read out of bounds.
2019-07-31 10:17:21 -07:00
Nick Terrell
58ea585878 [lz4hc] Fix minor pessimization in extDict pattern matching
We should be comparing `matchPtr` not `ip`. This bug just means
that this branch was not taken, so we might miss some of the
forward length.
2019-07-31 10:16:25 -07:00
Nick Terrell
7e97bf377d [lz4hc] Improve pattern detection in ext dict
It is important to continue to look backwards if the current pattern
reaches `lowPrefixPtr`. If the pattern detection doesn't go all the
way to the beginning of the pattern, or the end of the pattern it
slows down the search instead of speeding it up.

The slow unit in `round_trip_stream_fuzzer` used to take 12 seconds
to run with -O3, now it takes 0.2 seconds.

Credit to OSS-Fuzz
2019-07-31 10:16:21 -07:00
Yann Collet
ce9176a68d
Merge pull request #768 from terrelln/rep-ext
[LZ4HC] Speed up pattern compression with external dictionary
2019-07-24 13:47:19 -07:00
Nick Terrell
4c1d4c437d [LZ4HC] Speed up pattern compression with external dictionary
Fixes #761.
2019-07-24 10:59:20 -07:00
Yann Collet
805947ffcb
Merge pull request #766 from Low-power/cli-option---best
Add option '--best' to lz4(1)
2019-07-23 01:00:53 -07:00