commit
9a6e93859d
@ -16,7 +16,7 @@ Distribution of this document is unlimited.
|
|||||||
|
|
||||||
### Version
|
### Version
|
||||||
|
|
||||||
1.6.1 (30/01/2018)
|
1.6.2 (12/08/2020)
|
||||||
|
|
||||||
|
|
||||||
Introduction
|
Introduction
|
||||||
@ -75,7 +75,7 @@ __Frame Descriptor__
|
|||||||
3 to 15 Bytes, to be detailed in its own paragraph,
|
3 to 15 Bytes, to be detailed in its own paragraph,
|
||||||
as it is the most important part of the spec.
|
as it is the most important part of the spec.
|
||||||
|
|
||||||
The combined __Magic Number__ and __Frame Descriptor__ fields are sometimes
|
The combined _Magic_Number_ and _Frame_Descriptor_ fields are sometimes
|
||||||
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.
|
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.
|
||||||
|
|
||||||
__Data Blocks__
|
__Data Blocks__
|
||||||
@ -85,14 +85,13 @@ That’s where compressed data is stored.
|
|||||||
|
|
||||||
__EndMark__
|
__EndMark__
|
||||||
|
|
||||||
The flow of blocks ends when the last data block has a size of “0”.
|
The flow of blocks ends when the last data block is followed by
|
||||||
The size is expressed as a 32-bits value.
|
the 32-bit value `0x00000000`.
|
||||||
|
|
||||||
__Content Checksum__
|
__Content Checksum__
|
||||||
|
|
||||||
Content Checksum verify that the full content has been decoded correctly.
|
_Content_Checksum_ verify that the full content has been decoded correctly.
|
||||||
The content checksum is the result
|
The content checksum is the result of [xxHash-32 algorithm]
|
||||||
of [xxh32() hash function](https://github.com/Cyan4973/xxHash)
|
|
||||||
digesting the original (decoded) data as input, and a seed of zero.
|
digesting the original (decoded) data as input, and a seed of zero.
|
||||||
Content checksum is only present when its associated flag
|
Content checksum is only present when its associated flag
|
||||||
is set in the frame descriptor.
|
is set in the frame descriptor.
|
||||||
@ -101,7 +100,7 @@ that all blocks were fully transmitted in the correct order and without error,
|
|||||||
and also that the encoding/decoding process itself generated no distortion.
|
and also that the encoding/decoding process itself generated no distortion.
|
||||||
Its usage is recommended.
|
Its usage is recommended.
|
||||||
|
|
||||||
The combined __EndMark__ and __Content Checksum__ fields might sometimes be
|
The combined _EndMark_ and _Content_Checksum_ fields might sometimes be
|
||||||
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.
|
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.
|
||||||
|
|
||||||
__Frame Concatenation__
|
__Frame Concatenation__
|
||||||
@ -261,16 +260,24 @@ __Block Size__
|
|||||||
|
|
||||||
This field uses 4-bytes, format is little-endian.
|
This field uses 4-bytes, format is little-endian.
|
||||||
|
|
||||||
The highest bit is “1” if data in the block is uncompressed.
|
If the highest bit is set (`1`), the block is uncompressed.
|
||||||
|
|
||||||
The highest bit is “0” if data in the block is compressed by LZ4.
|
If the highest bit is not set (`0`), the block is LZ4-compressed,
|
||||||
|
using the [LZ4 block format specification](https://github.com/lz4/lz4/blob/master/doc/lz4_Block_format.md).
|
||||||
|
|
||||||
All other bits give the size, in bytes, of the following data block.
|
All other bits give the size, in bytes, of the data section.
|
||||||
The size does not include the block checksum if present.
|
The size does not include the block checksum if present.
|
||||||
|
|
||||||
Block Size shall never be larger than Block Maximum Size.
|
_Block_Size_ shall never be larger than _Block_Maximum_Size_.
|
||||||
Such a thing could potentially happen for non-compressible sources.
|
Such an outcome could potentially happen for non-compressible sources.
|
||||||
In such a case, such data block shall be passed using uncompressed format.
|
In such a case, such data block must be passed using uncompressed format.
|
||||||
|
|
||||||
|
A value of `0x00000000` is invalid, and signifies an _EndMark_ instead.
|
||||||
|
Note that this is different from a value of `0x80000000` (highest bit set),
|
||||||
|
which is an uncompressed block of size 0 (empty),
|
||||||
|
which is valid, and therefore doesn't end a frame.
|
||||||
|
Note that, if _Block_checksum_ is enabled,
|
||||||
|
even an empty block must be followed by a 32-bit block checksum.
|
||||||
|
|
||||||
__Data__
|
__Data__
|
||||||
|
|
||||||
@ -279,20 +286,22 @@ It might be compressed or not, depending on previous field indications.
|
|||||||
|
|
||||||
When compressed, the data must respect the [LZ4 block format specification](https://github.com/lz4/lz4/blob/master/doc/lz4_Block_format.md).
|
When compressed, the data must respect the [LZ4 block format specification](https://github.com/lz4/lz4/blob/master/doc/lz4_Block_format.md).
|
||||||
|
|
||||||
Note that the block is not necessarily full.
|
Note that a block is not necessarily full.
|
||||||
Uncompressed size of data can be any size, up to "Block Maximum Size”,
|
Uncompressed size of data can be any size __up to__ _Block_Maximum_Size_,
|
||||||
so it may contain less data than the maximum block size.
|
so it may contain less data than the maximum block size.
|
||||||
|
|
||||||
__Block checksum__
|
__Block checksum__
|
||||||
|
|
||||||
Only present if the associated flag is set.
|
Only present if the associated flag is set.
|
||||||
This is a 4-bytes checksum value, in little endian format,
|
This is a 4-bytes checksum value, in little endian format,
|
||||||
calculated by using the xxHash-32 algorithm on the raw (undecoded) data block,
|
calculated by using the [xxHash-32 algorithm] on the __raw__ (undecoded) data block,
|
||||||
and a seed of zero.
|
and a seed of zero.
|
||||||
The intention is to detect data corruption (storage or transmission errors)
|
The intention is to detect data corruption (storage or transmission errors)
|
||||||
before decoding.
|
before decoding.
|
||||||
|
|
||||||
Block checksum is cumulative with Content checksum.
|
_Block_checksum_ can be cumulative with _Content_checksum_.
|
||||||
|
|
||||||
|
[xxHash-32 algorithm]: https://github.com/Cyan4973/xxHash/blob/release/doc/xxhash_spec.md
|
||||||
|
|
||||||
|
|
||||||
Skippable Frames
|
Skippable Frames
|
||||||
@ -389,6 +398,8 @@ and trigger an error if it does not fit within acceptable range.
|
|||||||
Version changes
|
Version changes
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
|
1.6.2 : clarifies specification of _EndMark_
|
||||||
|
|
||||||
1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"
|
1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"
|
||||||
|
|
||||||
1.6.0 : restored Dictionary ID field in Frame header
|
1.6.0 : restored Dictionary ID field in Frame header
|
||||||
|
@ -1483,14 +1483,16 @@ size_t LZ4F_decompress(LZ4F_dctx* dctx,
|
|||||||
} /* if (dctx->dStage == dstage_storeBlockHeader) */
|
} /* if (dctx->dStage == dstage_storeBlockHeader) */
|
||||||
|
|
||||||
/* decode block header */
|
/* decode block header */
|
||||||
{ size_t const nextCBlockSize = LZ4F_readLE32(selectedIn) & 0x7FFFFFFFU;
|
{ U32 const blockHeader = LZ4F_readLE32(selectedIn);
|
||||||
|
size_t const nextCBlockSize = blockHeader & 0x7FFFFFFFU;
|
||||||
size_t const crcSize = dctx->frameInfo.blockChecksumFlag * BFSize;
|
size_t const crcSize = dctx->frameInfo.blockChecksumFlag * BFSize;
|
||||||
if (nextCBlockSize==0) { /* frameEnd signal, no more block */
|
if (blockHeader==0) { /* frameEnd signal, no more block */
|
||||||
dctx->dStage = dstage_getSuffix;
|
dctx->dStage = dstage_getSuffix;
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
if (nextCBlockSize > dctx->maxBlockSize)
|
if (nextCBlockSize > dctx->maxBlockSize) {
|
||||||
return err0r(LZ4F_ERROR_maxBlockSize_invalid);
|
return err0r(LZ4F_ERROR_maxBlockSize_invalid);
|
||||||
|
}
|
||||||
if (LZ4F_readLE32(selectedIn) & LZ4F_BLOCKUNCOMPRESSED_FLAG) {
|
if (LZ4F_readLE32(selectedIn) & LZ4F_BLOCKUNCOMPRESSED_FLAG) {
|
||||||
/* next block is uncompressed */
|
/* next block is uncompressed */
|
||||||
dctx->tmpInTarget = nextCBlockSize;
|
dctx->tmpInTarget = nextCBlockSize;
|
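The decoder change above reflects the rule this commit clarifies in the spec: only an all-zero 32-bit value is an _EndMark_, while `0x80000000` is a valid, empty, uncompressed block. A minimal sketch of that classification follows. The names `block_kind`, `read_le32` and `classify_block_header` are hypothetical and are not part of the lz4 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not part of the lz4 API. */
typedef enum { BLOCK_END_MARK, BLOCK_UNCOMPRESSED, BLOCK_COMPRESSED } block_kind;

/* Read a 32-bit little-endian value from 4 bytes. */
static uint32_t read_le32(const unsigned char* p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Classify a 4-byte Block Size field and report the size of the data section.
 * The whole 32-bit value is tested against zero, so 0x80000000
 * (empty uncompressed block) is not mistaken for an EndMark. */
static block_kind classify_block_header(const unsigned char* p, uint32_t* dataSize)
{
    uint32_t header;
    assert(p != NULL && dataSize != NULL);
    header = read_le32(p);
    *dataSize = header & 0x7FFFFFFFU;                     /* low 31 bits: size in bytes */
    if (header == 0) return BLOCK_END_MARK;               /* 0x00000000 ends the frame */
    if (header & 0x80000000U) return BLOCK_UNCOMPRESSED;  /* highest bit set */
    return BLOCK_COMPRESSED;
}
```

Testing the masked size against zero, as the removed line did, would treat the bytes `00 00 00 80` as the end of the frame. Testing the unmasked header keeps the two cases apart.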
||||||
|
@ -995,13 +995,13 @@ int fuzzerTests(U32 seed, unsigned nbTests, unsigned startTest, double compressi
|
|||||||
BYTE* op = (BYTE*)compressedBuffer;
|
BYTE* op = (BYTE*)compressedBuffer;
|
||||||
BYTE* const oend = op + (neverFlush ? LZ4F_compressFrameBound(srcSize, prefsPtr) : compressedBufferSize); /* when flushes are possible, can't guarantee a max compressed size */
|
BYTE* const oend = op + (neverFlush ? LZ4F_compressFrameBound(srcSize, prefsPtr) : compressedBufferSize); /* when flushes are possible, can't guarantee a max compressed size */
|
||||||
unsigned const maxBits = FUZ_highbit((U32)srcSize);
|
unsigned const maxBits = FUZ_highbit((U32)srcSize);
|
||||||
size_t cSegmentSize;
|
|
||||||
LZ4F_compressOptions_t cOptions;
|
LZ4F_compressOptions_t cOptions;
|
||||||
memset(&cOptions, 0, sizeof(cOptions));
|
memset(&cOptions, 0, sizeof(cOptions));
|
||||||
cSegmentSize = LZ4F_compressBegin(cCtx, op, (size_t)(oend-op), prefsPtr);
|
{ size_t const fhSize = LZ4F_compressBegin(cCtx, op, (size_t)(oend-op), prefsPtr);
|
||||||
CHECK(LZ4F_isError(cSegmentSize), "Compression header failed (error %i)",
|
CHECK(LZ4F_isError(fhSize), "Compression header failed (error %i)",
|
||||||
(int)cSegmentSize);
|
(int)fhSize);
|
||||||
op += cSegmentSize;
|
op += fhSize;
|
||||||
|
}
|
||||||
while (ip < iend) {
|
while (ip < iend) {
|
||||||
unsigned const nbBitsSeg = FUZ_rand(&randState) % maxBits;
|
unsigned const nbBitsSeg = FUZ_rand(&randState) % maxBits;
|
||||||
size_t const sampleMax = (FUZ_rand(&randState) & ((1<<nbBitsSeg)-1)) + 1;
|
size_t const sampleMax = (FUZ_rand(&randState) & ((1<<nbBitsSeg)-1)) + 1;
|
||||||
@ -1024,8 +1024,20 @@ int fuzzerTests(U32 seed, unsigned nbTests, unsigned startTest, double compressi
|
|||||||
DISPLAYLEVEL(6,"flushing %u bytes \n", (unsigned)flushSize);
|
DISPLAYLEVEL(6,"flushing %u bytes \n", (unsigned)flushSize);
|
||||||
CHECK(LZ4F_isError(flushSize), "Compression failed (error %i)", (int)flushSize);
|
CHECK(LZ4F_isError(flushSize), "Compression failed (error %i)", (int)flushSize);
|
||||||
op += flushSize;
|
op += flushSize;
|
||||||
} }
|
if ((FUZ_rand(&randState) % 1024) == 3) {
|
||||||
}
|
/* add an empty block (requires uncompressed flag) */
|
||||||
|
op[0] = op[1] = op[2] = 0;
|
||||||
|
op[3] = 0x80; /* 0x80000000U in little-endian format */
|
||||||
|
op += 4;
|
||||||
|
if ((prefsPtr!= NULL) && prefsPtr->frameInfo.blockChecksumFlag) {
|
||||||
|
U32 const bc32 = XXH32(op, 0, 0);
|
||||||
|
op[0] = (BYTE)bc32; /* little endian format */
|
||||||
|
op[1] = (BYTE)(bc32>>8);
|
||||||
|
op[2] = (BYTE)(bc32>>16);
|
||||||
|
op[3] = (BYTE)(bc32>>24);
|
||||||
|
op += 4;
|
||||||
|
} } } }
|
||||||
|
} /* while (ip<iend) */
|
||||||
CHECK(op>=oend, "LZ4F_compressFrameBound overflow");
|
CHECK(op>=oend, "LZ4F_compressFrameBound overflow");
|
||||||
{ size_t const dstEndSafeSize = LZ4F_compressBound(0, prefsPtr);
|
{ size_t const dstEndSafeSize = LZ4F_compressBound(0, prefsPtr);
|
||||||
int const tooSmallDstEnd = ((FUZ_rand(&randState) & 31) == 3);
|
int const tooSmallDstEnd = ((FUZ_rand(&randState) & 31) == 3);
|
||||||
@ -1086,7 +1098,7 @@ int fuzzerTests(U32 seed, unsigned nbTests, unsigned startTest, double compressi
|
|||||||
DISPLAYLEVEL(6, "noisy decompression \n");
|
DISPLAYLEVEL(6, "noisy decompression \n");
|
||||||
test_lz4f_decompression(compressedBuffer, cSize, srcStart, srcSize, crcOrig, &randState, dCtxNoise, seed, testNb);
|
test_lz4f_decompression(compressedBuffer, cSize, srcStart, srcSize, crcOrig, &randState, dCtxNoise, seed, testNb);
|
||||||
/* note : we don't analyze result here : it probably failed, which is expected.
|
/* note : we don't analyze result here : it probably failed, which is expected.
|
||||||
* We just check for potential out-of-bound reads and writes. */
|
* The sole purpose is to catch potential out-of-bound reads and writes. */
|
||||||
LZ4F_resetDecompressionContext(dCtxNoise); /* context must be reset after an error */
|
LZ4F_resetDecompressionContext(dCtxNoise); /* context must be reset after an error */
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
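The fuzzer hunk above injects an empty block by writing the bytes `00 00 00 80` by hand. The same step can be sketched as a small helper, assuming the caller provides at least 4 writable bytes. `UNCOMPRESSED_FLAG`, `write_le32` and `write_uncompressed_block_header` are hypothetical names, not lz4 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not part of the lz4 API. */
#define UNCOMPRESSED_FLAG 0x80000000U   /* highest bit of the Block Size field */

/* Write a 32-bit value in little-endian byte order. */
static void write_le32(unsigned char* dst, uint32_t v)
{
    dst[0] = (unsigned char)(v & 0xFF);
    dst[1] = (unsigned char)((v >> 8) & 0xFF);
    dst[2] = (unsigned char)((v >> 16) & 0xFF);
    dst[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Write the Block Size field of an uncompressed block holding dataSize bytes.
 * Returns the number of bytes written (always 4).
 * dst is assumed to point to at least 4 writable bytes. */
static size_t write_uncompressed_block_header(unsigned char* dst, uint32_t dataSize)
{
    assert(dst != NULL);
    assert(dataSize <= 0x7FFFFFFFU);    /* the size must fit in the low 31 bits */
    write_le32(dst, UNCOMPRESSED_FLAG | dataSize);
    return 4;
}
```

Calling `write_uncompressed_block_header(op, 0)` produces the same four bytes as the fuzzer code. As the spec text in this commit notes, when block checksums are enabled even an empty block must be followed by a 32-bit block checksum; computing that xxHash-32 value is outside the scope of this sketch.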