lz4/doc/lz4frame_manual.html
Yann Collet 0fea528e3a updated documentation regarding dictionary compression
following suggestion from @stbrumme (#558)

Also : bumped version number, regenerated man page and html doc
2018-09-05 14:05:08 -07:00

353 lines
18 KiB
HTML

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>1.8.3 Manual</title>
</head>
<body>
<h1>1.8.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
<li><a href="#Chapter1">Introduction</a></li>
<li><a href="#Chapter2">Compiler specifics</a></li>
<li><a href="#Chapter3">Error management</a></li>
<li><a href="#Chapter4">Frame compression types</a></li>
<li><a href="#Chapter5">Simple compression function</a></li>
<li><a href="#Chapter6">Advanced compression functions</a></li>
<li><a href="#Chapter7">Resource Management</a></li>
<li><a href="#Chapter8">Compression</a></li>
<li><a href="#Chapter9">Decompression functions</a></li>
<li><a href="#Chapter10">Streaming decompression functions</a></li>
<li><a href="#Chapter11">Bulk processing dictionary API</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
lz4frame.h implements LZ4 frame specification (doc/lz4_Frame_format.md).
lz4frame.h provides frame compression functions that take care
of encoding standard metadata alongside LZ4-compressed blocks.
<BR></pre>
<a name="Chapter2"></a><h2>Compiler specifics</h2><pre></pre>
<a name="Chapter3"></a><h2>Error management</h2><pre></pre>
<pre><b>unsigned LZ4F_isError(LZ4F_errorCode_t code); </b>/**< tells when a function result is an error code */<b>
</b></pre><BR>
<pre><b>const char* LZ4F_getErrorName(LZ4F_errorCode_t code); </b>/**< return error code string; for debugging */<b>
</b></pre><BR>
<a name="Chapter4"></a><h2>Frame compression types</h2><pre></pre>
<pre><b>typedef enum {
LZ4F_default=0,
LZ4F_max64KB=4,
LZ4F_max256KB=5,
LZ4F_max1MB=6,
LZ4F_max4MB=7
LZ4F_OBSOLETE_ENUM(max64KB)
LZ4F_OBSOLETE_ENUM(max256KB)
LZ4F_OBSOLETE_ENUM(max1MB)
LZ4F_OBSOLETE_ENUM(max4MB)
} LZ4F_blockSizeID_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_blockLinked=0,
LZ4F_blockIndependent
LZ4F_OBSOLETE_ENUM(blockLinked)
LZ4F_OBSOLETE_ENUM(blockIndependent)
} LZ4F_blockMode_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_noContentChecksum=0,
LZ4F_contentChecksumEnabled
LZ4F_OBSOLETE_ENUM(noContentChecksum)
LZ4F_OBSOLETE_ENUM(contentChecksumEnabled)
} LZ4F_contentChecksum_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_noBlockChecksum=0,
LZ4F_blockChecksumEnabled
} LZ4F_blockChecksum_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_frame=0,
LZ4F_skippableFrame
LZ4F_OBSOLETE_ENUM(skippableFrame)
} LZ4F_frameType_t;
</b></pre><BR>
<pre><b>typedef struct {
LZ4F_blockSizeID_t blockSizeID; </b>/* max64KB, max256KB, max1MB, max4MB; 0 == default */<b>
LZ4F_blockMode_t blockMode; </b>/* LZ4F_blockLinked, LZ4F_blockIndependent; 0 == default */<b>
LZ4F_contentChecksum_t contentChecksumFlag; </b>/* 1: frame terminated with 32-bit checksum of decompressed data; 0: disabled (default) */<b>
LZ4F_frameType_t frameType; </b>/* read-only field : LZ4F_frame or LZ4F_skippableFrame */<b>
unsigned long long contentSize; </b>/* Size of uncompressed content ; 0 == unknown */<b>
unsigned dictID; </b>/* Dictionary ID, sent by compressor to help decoder select correct dictionary; 0 == no dictID provided */<b>
LZ4F_blockChecksum_t blockChecksumFlag; </b>/* 1: each block followed by a checksum of block's compressed data; 0: disabled (default) */<b>
} LZ4F_frameInfo_t;
</b><p> makes it possible to set or read frame parameters.
It's not required to set all fields, as long as the structure was initially memset() to zero.
For all fields, 0 sets it to default value
</p></pre><BR>
<pre><b>typedef struct {
LZ4F_frameInfo_t frameInfo;
int compressionLevel; </b>/* 0: default (fast mode); values > LZ4HC_CLEVEL_MAX count as LZ4HC_CLEVEL_MAX; values < 0 trigger "fast acceleration" */<b>
unsigned autoFlush; </b>/* 1: always flush, to reduce usage of internal buffers */<b>
unsigned favorDecSpeed; </b>/* 1: parser favors decompression speed vs compression ratio. Only works for high compression modes (>= LZ4LZ4HC_CLEVEL_OPT_MIN) */ /* >= v1.8.2 */<b>
unsigned reserved[3]; </b>/* must be zero for forward compatibility */<b>
} LZ4F_preferences_t;
</b><p> makes it possible to supply detailed compression parameters to the stream interface.
Structure is presumed initially memset() to zero, representing default settings.
All reserved fields must be set to zero.
</p></pre><BR>
<a name="Chapter5"></a><h2>Simple compression function</h2><pre></pre>
<pre><b>size_t LZ4F_compressFrameBound(size_t srcSize, const LZ4F_preferences_t* preferencesPtr);
</b><p> Returns the maximum possible compressed size with LZ4F_compressFrame() given srcSize and preferences.
`preferencesPtr` is optional. It can be replaced by NULL, in which case, the function will assume default preferences.
Note : this result is only usable with LZ4F_compressFrame().
It may also be used with LZ4F_compressUpdate() _if no flush() operation_ is performed.
</p></pre><BR>
<pre><b>size_t LZ4F_compressFrame(void* dstBuffer, size_t dstCapacity,
const void* srcBuffer, size_t srcSize,
const LZ4F_preferences_t* preferencesPtr);
</b><p> Compress an entire srcBuffer into a valid LZ4 frame.
dstCapacity MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
The LZ4F_preferences_t structure is optional : you can provide NULL as argument. All preferences will be set to default.
@return : number of bytes written into dstBuffer.
or an error code if it fails (can be tested using LZ4F_isError())
</p></pre><BR>
<a name="Chapter6"></a><h2>Advanced compression functions</h2><pre></pre>
<pre><b>typedef struct {
unsigned stableSrc; </b>/* 1 == src content will remain present on future calls to LZ4F_compress(); skip copying src content within tmp buffer */<b>
unsigned reserved[3];
} LZ4F_compressOptions_t;
</b></pre><BR>
<a name="Chapter7"></a><h2>Resource Management</h2><pre></pre>
<pre><b>LZ4F_errorCode_t LZ4F_createCompressionContext(LZ4F_cctx** cctxPtr, unsigned version);
LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
</b><p> The first thing to do is to create a compressionContext object, which will be used in all compression operations.
This is achieved using LZ4F_createCompressionContext(), which takes as argument a version.
The version provided MUST be LZ4F_VERSION. It is intended to track potential version mismatch, notably when using DLL.
The function will provide a pointer to a fully allocated LZ4F_cctx object.
If @return != zero, there was an error during context creation.
Object can release its memory using LZ4F_freeCompressionContext();
</p></pre><BR>
<a name="Chapter8"></a><h2>Compression</h2><pre></pre>
<pre><b>size_t LZ4F_compressBegin(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_preferences_t* prefsPtr);
</b><p> will write the frame header into dstBuffer.
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
`prefsPtr` is optional : you can provide NULL as argument, all preferences will then be set to default.
@return : number of bytes written into dstBuffer for the header
or an error code (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_compressBound(size_t srcSize, const LZ4F_preferences_t* prefsPtr);
</b><p> Provides minimum dstCapacity required to guarantee compression success
given a srcSize and preferences, covering worst case scenario.
prefsPtr is optional : when NULL is provided, preferences will be set to cover worst case scenario.
Estimation is valid for either LZ4F_compressUpdate(), LZ4F_flush() or LZ4F_compressEnd(),
Estimation includes the possibility that internal buffer might already be filled by up to (blockSize-1) bytes.
It also includes frame footer (ending + checksum), which would have to be generated by LZ4F_compressEnd().
Estimation doesn't include frame header, as it was already generated by LZ4F_compressBegin().
Result is always the same for a srcSize and prefsPtr, so it can be trusted to size reusable buffers.
When srcSize==0, LZ4F_compressBound() provides an upper bound for LZ4F_flush() and LZ4F_compressEnd() operations.
</p></pre><BR>
<pre><b>size_t LZ4F_compressUpdate(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const void* srcBuffer, size_t srcSize,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> LZ4F_compressUpdate() can be called repetitively to compress as much data as necessary.
Important rule: dstCapacity MUST be large enough to ensure operation success even in worst case situations.
This value is provided by LZ4F_compressBound().
If this condition is not respected, LZ4F_compress() will fail (result is an errorCode).
LZ4F_compressUpdate() doesn't guarantee error recovery.
When an error occurs, compression context must be freed or resized.
`cOptPtr` is optional : NULL can be provided, in which case all options are set to default.
@return : number of bytes written into `dstBuffer` (it can be zero, meaning input data was just buffered).
or an error code if it fails (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_flush(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> When data must be generated and sent immediately, without waiting for a block to be completely filled,
it's possible to call LZ4_flush(). It will immediately compress any data buffered within cctx.
`dstCapacity` must be large enough to ensure the operation will be successful.
`cOptPtr` is optional : it's possible to provide NULL, all options will be set to default.
@return : nb of bytes written into dstBuffer (can be zero, when there is no data stored within cctx)
or an error code if it fails (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_compressEnd(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> To properly finish an LZ4 frame, invoke LZ4F_compressEnd().
It will flush whatever data remained within `cctx` (like LZ4_flush())
and properly finalize the frame, with an endMark and a checksum.
`cOptPtr` is optional : NULL can be provided, in which case all options will be set to default.
@return : nb of bytes written into dstBuffer, necessarily >= 4 (endMark),
or an error code if it fails (which can be tested using LZ4F_isError())
A successful call to LZ4F_compressEnd() makes `cctx` available again for another compression task.
</p></pre><BR>
<a name="Chapter9"></a><h2>Decompression functions</h2><pre></pre>
<pre><b>typedef struct {
unsigned stableDst; </b>/* pledges that last 64KB decompressed data will remain available unmodified. This optimization skips storage operations in tmp buffers. */<b>
unsigned reserved[3]; </b>/* must be set to zero for forward compatibility */<b>
} LZ4F_decompressOptions_t;
</b></pre><BR>
<pre><b>LZ4F_errorCode_t LZ4F_createDecompressionContext(LZ4F_dctx** dctxPtr, unsigned version);
LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
</b><p> Create an LZ4F_dctx object, to track all decompression operations.
The version provided MUST be LZ4F_VERSION.
The function provides a pointer to an allocated and initialized LZ4F_dctx object.
The result is an errorCode, which can be tested using LZ4F_isError().
dctx memory can be released using LZ4F_freeDecompressionContext();
Result of LZ4F_freeDecompressionContext() indicates current state of decompressionContext when being released.
That is, it should be == 0 if decompression has been completed fully and correctly.
</p></pre><BR>
<a name="Chapter10"></a><h2>Streaming decompression functions</h2><pre></pre>
<pre><b>size_t LZ4F_getFrameInfo(LZ4F_dctx* dctx,
LZ4F_frameInfo_t* frameInfoPtr,
const void* srcBuffer, size_t* srcSizePtr);
</b><p> This function extracts frame parameters (max blockSize, dictID, etc.).
Its usage is optional.
Extracted information is typically useful for allocation and dictionary.
This function works in 2 situations :
- At the beginning of a new frame, in which case
it will decode information from `srcBuffer`, starting the decoding process.
Input size must be large enough to successfully decode the entire frame header.
Frame header size is variable, but is guaranteed to be <= LZ4F_HEADER_SIZE_MAX bytes.
It's allowed to provide more input data than this minimum.
- After decoding has been started.
In which case, no input is read, frame parameters are extracted from dctx.
- If decoding has barely started, but not yet extracted information from header,
LZ4F_getFrameInfo() will fail.
The number of bytes consumed from srcBuffer will be updated within *srcSizePtr (necessarily <= original value).
Decompression must resume from (srcBuffer + *srcSizePtr).
@return : an hint about how many srcSize bytes LZ4F_decompress() expects for next call,
or an error code which can be tested using LZ4F_isError().
note 1 : in case of error, dctx is not modified. Decoding operation can resume from beginning safely.
note 2 : frame parameters are *copied into* an already allocated LZ4F_frameInfo_t structure.
</p></pre><BR>
<pre><b>size_t LZ4F_decompress(LZ4F_dctx* dctx,
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const LZ4F_decompressOptions_t* dOptPtr);
</b><p> Call this function repetitively to regenerate compressed data from `srcBuffer`.
The function will read up to *srcSizePtr bytes from srcBuffer,
and decompress data into dstBuffer, of capacity *dstSizePtr.
The nb of bytes consumed from srcBuffer will be written into *srcSizePtr (necessarily <= original value).
The nb of bytes decompressed into dstBuffer will be written into *dstSizePtr (necessarily <= original value).
The function does not necessarily read all input bytes, so always check value in *srcSizePtr.
Unconsumed source data must be presented again in subsequent invocations.
`dstBuffer` can freely change between each consecutive function invocation.
`dstBuffer` content will be overwritten.
@return : an hint of how many `srcSize` bytes LZ4F_decompress() expects for next call.
Schematically, it's the size of the current (or remaining) compressed block + header of next block.
Respecting the hint provides some small speed benefit, because it skips intermediate buffers.
This is just a hint though, it's always possible to provide any srcSize.
When a frame is fully decoded, @return will be 0 (no more data expected).
When provided with more bytes than necessary to decode a frame,
LZ4F_decompress() will stop reading exactly at end of current frame, and @return 0.
If decompression failed, @return is an error code, which can be tested using LZ4F_isError().
After a decompression error, the `dctx` context is not resumable.
Use LZ4F_resetDecompressionContext() to return to clean state.
After a frame is fully decoded, dctx can be used again to decompress another frame.
</p></pre><BR>
<pre><b>void LZ4F_resetDecompressionContext(LZ4F_dctx* dctx); </b>/* always successful */<b>
</b><p> In case of an error, the context is left in "undefined" state.
In which case, it's necessary to reset it, before re-using it.
This method can also be used to abruptly stop any unfinished decompression,
and start a new one using same context resources.
</p></pre><BR>
<pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM) } LZ4F_errorCodes;
</b></pre><BR>
<a name="Chapter11"></a><h2>Bulk processing dictionary API</h2><pre></pre>
<pre><b>LZ4FLIB_STATIC_API LZ4F_CDict* LZ4F_createCDict(const void* dictBuffer, size_t dictSize);
LZ4FLIB_STATIC_API void LZ4F_freeCDict(LZ4F_CDict* CDict);
</b><p> When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once.
LZ4_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
LZ4_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
`dictBuffer` can be released after LZ4_CDict creation, since its content is copied within CDict
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressFrame_usingCDict(
LZ4F_cctx* cctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const LZ4F_CDict* cdict,
const LZ4F_preferences_t* preferencesPtr);
</b><p> Compress an entire srcBuffer into a valid LZ4 frame using a digested Dictionary.
cctx must point to a context created by LZ4F_createCompressionContext().
If cdict==NULL, compress without a dictionary.
dstBuffer MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
If this condition is not respected, function will fail (@return an errorCode).
The LZ4F_preferences_t structure is optional : you may provide NULL as argument,
but it's not recommended, as it's the only way to provide dictID in the frame header.
@return : number of bytes written into dstBuffer.
or an error code if it fails (can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressBegin_usingCDict(
LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_CDict* cdict,
const LZ4F_preferences_t* prefsPtr);
</b><p> Inits streaming dictionary compression, and writes the frame header into dstBuffer.
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
`prefsPtr` is optional : you may provide NULL as argument,
however, it's the only way to provide dictID in the frame header.
@return : number of bytes written into dstBuffer for the header,
or an error code (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_decompress_usingDict(
LZ4F_dctx* dctxPtr,
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const void* dict, size_t dictSize,
const LZ4F_decompressOptions_t* decompressOptionsPtr);
</b><p> Same as LZ4F_decompress(), using a predefined dictionary.
Dictionary is used "in place", without any preprocessing.
It must remain accessible throughout the entire frame decoding.
</p></pre><BR>
</html>
</body>