1813 Commits

Author SHA1 Message Date
Alexey Tourbin
ff9b4cf826 lz4_Block_format.md: clarify on short inputs and restrictions
It occurred to me that the formula "The last 5 bytes are always
literals", on the list of "assumptions made by the decoder", is
remarkably ambiguous.  Suppose the decoder is presented with 5 bytes.
Are they literals?  It may seem that the decoder degenerates
to memcpy on short inputs.  But of course the answer is no,
so the formula needs some clarification.

Parsing restrictions should be explained as well, otherwise they look
like arbitrary numbers.  The 5-byte restriction has been mentioned
recently in connection with the shortcut in LZ4_decompress_generic,
so I add that.  The second restriction is left to be explained
by the author.

I also took the liberty to explain that empty inputs "are either
unrepresentable or can be represented with a null byte".  This wording
may actually have some merit: it leaves it to the implementation,
as opposed to the spec, to decide whether the encoder can compress
empty inputs, and whether the decoder can produce an empty output
(which the implementation should further clarify).
2018-04-25 02:39:28 +03:00
W. Felix Handte
27c6eec18d Multiply-Include Header to Check Guard Macro Correctness 2018-04-24 18:50:03 -04:00
W. Felix Handte
2dfc7cbe82 Change Over Includes in the Project 2018-04-24 16:22:28 -04:00
W. Felix Handte
2be3905fa4 Integrate lz4frame_static.h Declarations into lz4frame.h 2018-04-24 16:22:28 -04:00
Yann Collet
b2637ab7b2
Merge pull request #512 from lz4/HC_dict
In-place immutable dictionaries for LZ4HC
2018-04-24 13:18:40 -07:00
Yann Collet
8c6ca6283d
Merge pull request #511 from lz4/decFast
Fixed performance issue with LZ4_decompress_fast()
2018-04-24 11:25:57 -07:00
Yann Collet
c92df76361
Merge pull request #488 from felixhandte/hc-dict-ctx
Use Dictionary In-Place in HC Mode
2018-04-24 10:49:41 -07:00
W. Felix Handte
5ed1463bf4 Remove Debug Log Statements 2018-04-24 11:58:51 -04:00
W. Felix Handte
db9deb7b74 Remove the Framebench Tool 2018-04-24 11:58:51 -04:00
W. Felix Handte
13271a88d7 Revert Stream Size Const to Correct Value 2018-04-24 11:55:53 -04:00
Yann Collet
092cb77597
Merge pull request #504 from baruchsiach/static-only-support
lib: allow to disable shared libraries
2018-04-23 23:44:04 -07:00
Cyan4973
44bff3fd3b re-ordered parentheses
to avoid mixing && and &
as suggested by @terrelln
2018-04-23 19:26:02 -07:00
Yann Collet
0c2ae72ba8
Merge pull request #507 from lz4/clangPerf
fixed lz4_fast clang performance
2018-04-23 15:55:56 -07:00
Cyan4973
644b7bd2b6 fixed minor declaration issue with clang on msys 2018-04-23 15:52:44 -07:00
Cyan4973
cd0663456f disable shortcut for LZ4_decompress_fast()
improving speed
2018-04-23 15:47:08 -07:00
Cyan4973
bd06fde104 fullbench compiled without assert()
to better reflect release speed
2018-04-23 15:42:27 -07:00
Yann Collet
57cc7daf22
Merge pull request #510 from terrelln/bug-fix
Fix input size validation edge cases
2018-04-23 15:28:19 -07:00
Nick Terrell
672799e814 Fix compilation error and assert. 2018-04-23 14:21:02 -07:00
Nick Terrell
bb83cad98f Fix input size validation edge cases
The bug is a read up to 2 bytes past the end of the buffer.
There are three cases for this bug, one for each test case added.

* An empty input causes `token = *ip++` to read one byte too far.
* A one byte input with `(token >> ML_BITS) == RUN_MASK` causes
  one extra byte to be read without validation. This could be
  combined with the first bug to cause 2 extra bytes to be read.
* The case pointed out in issue #508, where `ip == iend` at the
  beginning of the loop after taking the shortcut.

Benchmarks show no regressions on clang or gcc-7 on both my mac
and devserver.

Fixes #508.
2018-04-23 13:34:18 -07:00
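
The first two cases in the commit above (an empty input, and a one-byte input whose literal-length nibble equals `RUN_MASK`) can be illustrated with a minimal sketch. This is not the code from the commit: `read_literal_length` is a hypothetical helper, and the values of `ML_BITS` and `RUN_MASK` are assumed here to follow the usual 4-bit token split. The point is only that every byte read is preceded by a bounds check.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration: a token is split into two 4-bit fields. */
#define ML_BITS  4
#define RUN_MASK ((1U << (8 - ML_BITS)) - 1)

/* Hypothetical helper, not part of the lz4 API.
 * Reads one token and its literal length from src[0..srcSize).
 * Returns 0 on success, -1 if decoding would read past the end of the input. */
int read_literal_length(const unsigned char *src, size_t srcSize,
                        size_t *litLen, size_t *consumed)
{
    size_t pos = 0;
    unsigned token;
    size_t length;

    if (pos >= srcSize) return -1;        /* empty input: there is no token to read */
    token = src[pos++];
    length = token >> ML_BITS;

    if (length == RUN_MASK) {             /* length continues in the following bytes */
        unsigned s;
        do {
            if (pos >= srcSize) return -1; /* validate before every extra byte */
            s = src[pos++];
            length += s;
        } while (s == 255);
    }

    *litLen = length;
    *consumed = pos;
    return 0;
}
```

With this shape, a zero-length buffer and the single byte `0xF0` are both rejected instead of being read past the end, while `{0xF0, 0x05}` decodes to a literal length of 20.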
Yann Collet
996d211aca
Merge pull request #509 from svpv/clarifyFastRisks
lz4.h: clarify the risks of using LZ4_decompress_fast()
2018-04-22 19:30:24 -07:00
Alexey Tourbin
ab06ef97bb lz4.h: clarify the risks of using LZ4_decompress_fast()
The notes about "security guarantee" and "malicious inputs" seemed
a bit non-technical to me, so I took the liberty to tone them down
and instead describe the actual risks in technical terms.  Namely,
the function never writes past the end of the output buffer, so
a direct hostile takeover (resulting in arbitrary code execution
soon after the return from the function) is not possible.  However,
the application can crash because of reads from unmapped pages.

I also took the liberty to describe what I believe is the only sensible
usage scenario for the function: "This function is only usable if the
originalSize of uncompressed data is known in advance," etc.
2018-04-23 02:13:49 +03:00
Cyan4973
d1f21883d6 fixed incorrect comment 2018-04-21 00:11:51 -07:00
Yann Collet
a8a5dfd426 fixed clang performance in lz4_fast
The simple change from
`matchIndex+MAX_DISTANCE < current`
towards
`current - matchIndex > MAX_DISTANCE`

is enough to generate a 10% performance drop under clang.
Quite massive.
(I missed it, as my attention was on gcc performance at the time.)

The second version is more robust, because it also survives a situation where
`matchIndex > current`
due to overflows.

The first version requires matchIndex to not overflow.
Hence the added `assert()` conditions.

The only case where this can happen is with dictCtx compression,
in the case where the dictionary context is not initialized before loading the dictionary.
So it's enough to always initialize the context while loading the dictionary.
2018-04-20 18:09:51 -07:00
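
The difference between the two comparisons in the commit above can be checked in isolation. The sketch below assumes 32-bit unsigned indexes and an illustrative `MAX_DISTANCE` of 65535; the function names are hypothetical and do not appear in lz4. It shows that the forms agree when `matchIndex <= current`, and that only the subtraction form rejects a candidate when `matchIndex > current`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the commit message does not state it. */
#define MAX_DISTANCE 65535u

/* First form from the commit message. Only meaningful when
 * matchIndex <= current and the addition does not wrap. */
int too_far_sum(uint32_t matchIndex, uint32_t current)
{
    return matchIndex + MAX_DISTANCE < current;
}

/* Second form. Unsigned subtraction wraps to a very large value when
 * matchIndex > current, so such a candidate is reported as too far. */
int too_far_diff(uint32_t matchIndex, uint32_t current)
{
    return current - matchIndex > MAX_DISTANCE;
}

/* Hypothetical self-check: the forms agree on well-ordered indexes and
 * disagree once matchIndex has run ahead of current. */
void check_distance_forms(void)
{
    assert(too_far_sum(10u, 100000u) == too_far_diff(10u, 100000u));
    assert(too_far_sum(99000u, 100000u) == too_far_diff(99000u, 100000u));
    assert(too_far_sum(100u, 50u) != too_far_diff(100u, 50u));
}
```

This is why the first form needs the guarantee described in the message: if the index ordering can be violated, the sum form silently accepts the candidate, so the invariant has to be enforced elsewhere, for example by always initializing the context when a dictionary is loaded.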
W. Felix Handte
ee67f25576 Change vLimit Calculation 2018-04-20 20:18:30 -04:00
W. Felix Handte
1895fa19a4 Remove Redundant Static Assert 2018-04-20 20:14:12 -04:00
W. Felix Handte
fcc99d1f31 Simpler loadDict() Reset 2018-04-20 19:37:28 -04:00
W. Felix Handte
a8cb2feffd Tolerate Base Pointer Underflow 2018-04-20 19:37:07 -04:00
W. Felix Handte
85cac61dd8 Don't Segfault on Malloc Failure 2018-04-20 19:35:51 -04:00
W. Felix Handte
756ed402da Sign-Extend -1 to Pointer Width 2018-04-20 17:56:26 -04:00
W. Felix Handte
86b381e40b Fix Constant Value 2018-04-20 17:13:40 -04:00
W. Felix Handte
1d2500d44e Handle Index Underflows Safely 2018-04-20 17:13:03 -04:00
W. Felix Handte
7874cf06b3 Consts and Asserts and Other Minor Nits 2018-04-20 15:30:08 -04:00
W. Felix Handte
209c9c29d1 Add Some Simple Fuzzer Tests 2018-04-20 15:16:41 -04:00
W. Felix Handte
3f087cf1cb Add Comments on New Public APIs 2018-04-20 15:00:53 -04:00
W. Felix Handte
d7347f9eea Add API for Attaching Dictionaries 2018-04-20 14:59:34 -04:00
W. Felix Handte
ca833f928f Also Reset the Chain Table 2018-04-20 14:16:27 -04:00
W. Felix Handte
8f118cf6e9 Remove inputBuffer from Context, Work Around its Absence 2018-04-20 14:08:06 -04:00
W. Felix Handte
0064e8ebc7 Remove Commented Out Support for Match Continuation over Segment Boundary 2018-04-20 13:14:37 -04:00
W. Felix Handte
14c577d4c9 Fix Signedness of Comparison 2018-04-19 20:54:35 -04:00
W. Felix Handte
f4b13e17ea Don't Clear the Dictionary Context Until No Longer Useful 2018-04-19 20:54:35 -04:00
W. Felix Handte
0abc23f72e Copy DictCtx into Working Context on Inputs Larger than 4 KB 2018-04-19 20:54:35 -04:00
W. Felix Handte
b67de2a327 Force Inline on HashChain 2018-04-19 20:54:35 -04:00
W. Felix Handte
22e16d5b50 Split DictCtx-using Code Into Separate Inlining Chain 2018-04-19 20:54:35 -04:00
W. Felix Handte
0a2abacd90 Use Fast Reset in LZ4F Again 2018-04-19 20:54:35 -04:00
W. Felix Handte
61c7ceffed Use Fast Reset API in LZ4F 2018-04-19 20:54:35 -04:00
W. Felix Handte
3591fe8ab8 Add Fast Reset Paths 2018-04-19 20:54:35 -04:00
W. Felix Handte
8db291bc1d Remove Match Upper Bounds Check 2018-04-19 20:54:35 -04:00
W. Felix Handte
8f9a2db0e1 Fix Some Cast/Conversion Warnings 2018-04-19 20:54:35 -04:00
W. Felix Handte
221211d7d0 Fix Offset Math 2018-04-19 20:54:35 -04:00
W. Felix Handte
a1beba13f7 Reset Stream in LZ4_compress_HC 2018-04-19 20:54:35 -04:00