Nick Terrell
064adb2e8d
[lz4hc] Chain swap with acceleration
2019-07-31 10:17:26 -07:00
Nick Terrell
38c3945de3
[lz4hc] Only allow chain swapping forwards
...
When the match is very long and found quickly, we can do
matchLength * nbCompares iterations through the chain
swapping, which can really slow down compression.
2019-07-31 10:17:26 -07:00
Nick Terrell
be1738aa46
[lz4hc] Fix pattern detection end of dictionary
...
The pattern detection in extDict mode could put `matchIndex`
within the last 3 bytes of the dictionary. This would cause
a read out of bounds.
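The commit does not describe how the fix is implemented. As a rough illustration of this class of bug only: a 4-byte read is safe only when the index sits at least 4 bytes before the end of the buffer. The helper names below are hypothetical and are not taken from lz4hc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: can 4 bytes be read at `index` in a buffer of
 * `size` bytes without running past its end? */
static int can_read32(size_t index, size_t size)
{
    return size >= 4 && index <= size - 4;
}

/* Hypothetical checked read: refuses indexes within the last 3 bytes. */
static uint32_t read32_checked(const unsigned char *buf, size_t size, size_t index)
{
    uint32_t v;
    assert(can_read32(index, size));
    memcpy(&v, buf + index, sizeof v);  /* alignment-safe 4-byte load */
    return v;
}
```

With a 16-byte dictionary, index 12 is the last position where a 4-byte read stays in bounds; indexes 13, 14 and 15 would read past the end.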
2019-07-31 10:17:21 -07:00
Nick Terrell
58ea585878
[lz4hc] Fix minor pessimization in extDict pattern matching
...
We should be comparing `matchPtr` not `ip`. This bug just means
that this branch was not taken, so we might miss some of the
forward length.
2019-07-31 10:16:25 -07:00
Nick Terrell
7e97bf377d
[lz4hc] Improve pattern detection in ext dict
...
It is important to continue to look backwards if the current pattern
reaches `lowPrefixPtr`. If the pattern detection doesn't go all the
way to the beginning of the pattern, or the end of the pattern, it
slows down the search instead of speeding it up.
The slow unit in `round_trip_stream_fuzzer` used to take 12 seconds
to run with -O3, now it takes 0.2 seconds.
Credit to OSS-Fuzz
2019-07-31 10:16:21 -07:00
Nick Terrell
4c1d4c437d
[LZ4HC] Speed up pattern compression with external dictionary
...
Fixes #761.
2019-07-24 10:59:20 -07:00
W. Felix Handte
40943ba0c9
Unconditionally Clear dictCtx
2019-07-18 13:35:12 -04:00
W. Felix Handte
369fb3900c
Fix Data Corruption Bug when Streaming with an Attached Dict in HC Mode
...
This diff fixes an issue in which we failed to clear the `dictCtx` in HC
compression. The `dictCtx` is not supposed to be used when an `extDict` is
present: matches found in the `dictCtx` do not account for the presence of an
`extDict` segment, and their offsets are therefore miscalculated when one is
present. This can lead to data corruption.
This diff clears the `dictCtx` whenever setting an `extDict`.
This issue was uncovered by @terrelln's fuzzing work.
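A minimal sketch of the invariant described above, assuming a simplified context with one `extDict` segment and one attached `dictCtx` pointer. The struct and function names are hypothetical and do not match the real lz4hc.c internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified compression context. */
typedef struct demo_ctx {
    const unsigned char *dictStart;   /* start of the extDict segment */
    size_t dictSize;                  /* size of the extDict segment */
    const struct demo_ctx *dictCtx;   /* attached dictionary context, or NULL */
} demo_ctx;

/* Setting an extDict always drops the attached dictCtx, so that
 * the two match sources are never active at the same time. */
static void demo_set_ext_dict(demo_ctx *ctx, const unsigned char *seg, size_t size)
{
    assert(ctx != NULL);
    ctx->dictStart = seg;
    ctx->dictSize = size;
    ctx->dictCtx = NULL;
}
```

Clearing the pointer inside the setter, rather than at each call site, keeps the invariant in a single place.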
2019-07-18 12:48:41 -04:00
Hitatm
8ac954aa71
bugfix: correctly keep the offset < LZ4_DISTANCE_MAX when the value of LZ4_DISTANCE_MAX is changed
2019-07-15 22:53:46 +08:00
Yann Collet
ae199124e5
fixed read-after input in LZ4_decompress_safe()
2019-04-18 18:50:51 -07:00
Yann Collet
0b876db6d4
address a few minor Visual warnings
...
and created target cxx17build
2019-04-18 16:07:16 -07:00
Yann Collet
6fc763cd98
ensure consistent definition and usage of FREEMEM
...
as suggested by @sloutsky in #671
2019-04-16 11:26:03 -07:00
Yann Collet
474c17cdc4
unified limitedOutput_directive
...
between lz4.c and lz4hc.c.
was left in a strange state after the "amalgamation" patch.
Now only 3 directives remain,
same name across both implementations,
single definition place.
Might allow some light simplification due to reduced nb of states possible.
2019-04-15 11:09:56 -07:00
Yann Collet
dd43b913a2
fix minor visual warning
...
yet some overly cautious overflow risk flag,
while it's actually impossible, due to previous test just one line above.
Changing the cast position, just to please the thing.
2019-04-12 16:56:22 -07:00
Yann Collet
8d76c8a44a
introduce LZ4_DISTANCE_MAX build macro
...
make it possible to generate LZ4-compressed blocks
with a controlled maximum offset (necessarily <= 65535).
This could be useful for compatibility with decoders
using a very limited memory budget (<64 KB).
Answer #154
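A minimal sketch of how such a build macro might gate match candidates, assuming it can be overridden at compile time (for example with `-DLZ4_DISTANCE_MAX=4096`). The helper `offset_is_encodable` is hypothetical and not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Build-time cap on match offsets. The default of 65535 is the
 * largest value allowed by the limit stated in the commit message. */
#ifndef LZ4_DISTANCE_MAX
#  define LZ4_DISTANCE_MAX 65535
#endif
#if LZ4_DISTANCE_MAX > 65535
#  error "LZ4_DISTANCE_MAX must be <= 65535"
#endif

/* Hypothetical helper: may a match starting at `matchIndex` be
 * referenced from position `curIndex`? */
static int offset_is_encodable(size_t curIndex, size_t matchIndex)
{
    assert(matchIndex < curIndex);
    return (curIndex - matchIndex) <= LZ4_DISTANCE_MAX;
}
```

With the default value, a distance of 65535 is accepted and 65536 is rejected; a smaller build-time value shrinks the window that a decoder has to keep in memory.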
2019-04-11 14:15:33 -07:00
Yann Collet
d8d5f14138
fixed loadDictHC
...
by making a full initialization
instead of a fast reset.
2019-04-09 15:37:59 -07:00
Yann Collet
14c71dfa9c
modified LZ4_initStreamHC() to look like LZ4_initStream()
...
it is now a pure initializer, for statically allocated states.
It can initialize any memory area, and because of this, requires size.
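The design point above (an initializer that accepts any memory area must be told its size) can be sketched as follows. This is not the LZ4 implementation: `demo_state_t` and `demo_init_state` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state type standing in for a statically allocated stream state. */
typedef struct {
    unsigned table[64];
    int level;
} demo_state_t;

/* Pure initializer: accepts any memory area, so it must be told its size.
 * Returns NULL when the area is missing or too small to hold a state. */
static demo_state_t *demo_init_state(void *buffer, size_t size)
{
    if (buffer == NULL || size < sizeof(demo_state_t)) return NULL;
    memset(buffer, 0, sizeof(demo_state_t));
    return (demo_state_t *)buffer;
}

/* Usage: initialize a state that lives on the stack. */
static int demo_usage(void)
{
    demo_state_t state;
    demo_state_t *s = demo_init_state(&state, sizeof(state));
    assert(s == &state);
    return s->level;
}
```

Because the initializer makes no assumption about prior contents, it is safe on uninitialized memory, which a fast reset is not.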
2019-04-09 13:55:42 -07:00
Yann Collet
c3f8928f87
fixed strict iso C90
2019-04-05 10:41:26 -07:00
Yann Collet
c491df54ec
created LZ4_initStreamHC()
...
- promoted LZ4_resetStreamHC_fast() to stable
- moved LZ4_resetStreamHC() to deprecated (but do not generate a warning yet)
- Updated doc, to highlight difference between init and reset
- switched all invocations of LZ4_resetStreamHC() onto LZ4_initStreamHC()
- misc: ensure `make all` also builds /tests
2019-04-04 17:05:11 -07:00
Tim Zakian
81441e2462
Make fact that certain variables that are passed into LZ4HC_encodeSequence are changed by the function call
2019-01-09 13:42:12 -08:00
qiuyangs
660d21272e
lz4hc.c: change (length >> 8) to (length / 255)
...
Every 0xff byte in the compressed block corresponds to a length of 255 (not 256) in the input data. For long repeating sequences, using (length >> 8) may generate bad compressed blocks.
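The arithmetic can be checked with a small sketch, assuming the usual LZ4 length encoding where each extra 0xFF byte adds 255 and a final byte below 255 ends the run. The helper names are hypothetical, and `rest` stands for the part of the length left over after the token field is saturated.

```c
#include <assert.h>
#include <stddef.h>

/* Exact number of extra length bytes needed to encode `rest`. */
static size_t extra_length_bytes(size_t rest)
{
    return rest / 255 + 1;
}

/* Shift-based estimate: treats each 0xFF byte as if it were worth 256. */
static size_t extra_length_bytes_shift(size_t rest)
{
    return (rest >> 8) + 1;
}

/* Writes the extra length bytes into `out` and returns how many were written. */
static size_t write_extra_length(unsigned char *out, size_t rest)
{
    size_t n = 0;
    assert(out != NULL);
    for (; rest >= 255; rest -= 255) out[n++] = 255;
    out[n++] = (unsigned char)rest;
    return n;
}
```

The two estimates already differ at `rest == 255`, which needs two bytes (0xFF, 0x00) while the shift-based count predicts one. The gap grows with the length, so a space check based on the shift can undercount for long repeating sequences.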
2019-01-06 16:29:30 +08:00
Bing Xu
17f5071e72
Enable amalgamation of lz4hc.c and lz4.c
2018-11-15 22:24:25 -08:00
Oleg Khabinov
f27ea0774e
Adding information about dirty context for _HC_ family of functions
2018-10-10 10:33:04 -07:00
Yann Collet
8bea19d57c
fixed minor cppcheck warnings in lib
2018-09-18 15:51:26 -07:00
Yann Collet
86023f01f2
avoid final trailing comma for enum lists
...
as detected in #485 by @JoachimSchneider.
Refactored the c_standards tests
so that these issues get automatically detected in CI tests.
2018-09-13 14:29:41 -07:00
Yann Collet
30f6f34328
removed one assert() condition
...
which is not correct when using LZ4_HC with dictionary and starting from a low address (<0x10000).
2018-09-05 11:25:10 -07:00
Yann Collet
2e4847c2d5
fixed #560
...
it was a fairly complex scenario,
involving source files > 64K
and some extraordinary conditions related to specific layout of ranges of zeroes,
and only on level 9.
2018-09-04 18:21:40 -07:00
Yann Collet
ba1c7148a5
renamed variable for clarity
2018-05-07 12:14:26 -07:00
Yann Collet
200b2960d5
fixed minor conversion warning
2018-05-06 18:26:14 -07:00
Yann Collet
24b9c485db
small PA optimization
...
which measurably improves speed
on levels 9+
2018-05-06 16:53:33 -07:00
Yann Collet
cdb0275b7f
lz4hc: fixed PA / SC parameter order
...
also :
reserved PA for levels 9+ (instead of 8+).
In most cases, speed is lower, and compression benefit is not worth it.
2018-05-05 14:32:57 -07:00
Yann Collet
a4e918d7a6
lz4hc: SC only enabled for opt parser
...
the trade off is not good for regular HC parser :
compression is a little bit better, but speed cost is too large in comparison.
2018-05-05 14:25:37 -07:00
Yann Collet
d097bf93f8
fixed SC.opt integration with regular HC parser
...
Only enabled when searching forward.
note : it slightly improves compression ratio,
but measurably decreases speed.
Trade-off to analyse.
2018-05-05 13:46:45 -07:00
Yann Collet
fa89a9e18b
lz4hc: fixed performance issue
...
when combining both PA and CS optimizations
2018-05-05 13:31:03 -07:00
Yann Collet
9699ba5ddf
integrated chain swapper into HC match finder
...
slower than expected
Pattern analyzer and Chain Swapper
work slower when both activated.
Reasons unclear.
2018-05-04 19:13:33 -07:00
Yann Collet
434ace7244
implemented search accelerator
...
greatly improves speed compared to non-accelerated,
especially for slower files.
On my laptop, -b12 :
```
calgary.tar : 4.3 MB/s => 9.0 MB/s
enwik7 : 10.2 MB/s => 13.3 MB/s
silesia.tar : 4.0 MB/s => 8.7 MB/s
```
Note : this is the simplified version,
without handling dictionaries, external buffer, nor pattern analyzer.
Current `dev` branch on these samples gives :
```
calgary.tar : 4.2 MB/s
enwik7 : 9.7 MB/s
silesia.tar : 3.5 MB/s
```
interestingly, it's slower,
presumably due to handling of dictionaries.
2018-05-03 16:31:41 -07:00
Yann Collet
dc42707107
created LZ4HC_FindLongestMatch()
...
simplified match finder
only searching forward and within current buffer,
for easier testing of optimizations.
2018-05-03 15:38:32 -07:00
Yann Collet
85be6b8f6d
increased nbAttempts for lz4 -12
...
shaves one more kilobyte from silesia.tar
2018-05-02 14:22:35 -07:00
Yann Collet
bd470ccd38
Merge pull request #521 from lz4/BD_deterministic
...
fix lz4hc -BD non-determinism
2018-04-30 20:40:34 -07:00
Cyan4973
6a7d501fed
renamed variable for clarity
...
lowLimit -> lowestMatchIndex
2018-04-30 18:56:16 -07:00
Yann Collet
8c574990a9
lz4hc changed variable
...
to reduce confusion
dictLowLimit => dictStart
2018-04-30 16:08:16 -07:00
Yann Collet
1e6ca25af3
Merge pull request #520 from felixhandte/frame-dict-nits
...
Minor Fixes to Dictionary Preparation in LZ4 Frame
2018-04-27 13:52:30 -07:00
Yann Collet
de7b274d99
Merge branch 'dev' into BD_deterministic
2018-04-27 12:59:20 -07:00
Yann Collet
19b1267d44
fix lz4hc -BD non-determinism
...
related to chain table update
2018-04-27 12:46:49 -07:00
Yann Collet
72e99c8939
lz4hc : minor edits for clarity
2018-04-27 12:28:58 -07:00
W. Felix Handte
fefc40fc0a
Avoid Possibly Redundant Table Clears When Loading HC Dict
2018-04-27 14:10:27 -04:00
Yann Collet
d294dd7fc6
ensure favorDecSpeed is properly initialized
...
also :
- fix a potential malloc error
- proper use of ALLOC macro inside lz4hc
- update html API doc
2018-04-27 09:04:09 -07:00
Yann Collet
0fb3a3b199
fixed a number of minor cast warnings
2018-04-26 18:08:28 -07:00
Yann Collet
5c7d3812d9
favorDecSpeed can be triggered from cli with --favor-decSpeed
2018-04-26 15:49:32 -07:00
Yann Collet
3792d00168
favorDecSpeed feature can be triggered from lz4frame
...
and lz4hc.
2018-04-26 15:18:44 -07:00