updated doc for streaming API

This commit is contained in:
Yann Collet 2016-08-14 00:16:20 +02:00
parent 6263ba5451
commit 655393cc72
3 changed files with 94 additions and 37 deletions

1
NEWS
View File

@ -1,4 +1,5 @@
v0.8.1
New streaming API
Changed : --ultra now enables levels beyond 19
Changed : -i# now selects benchmark time in second
Fixed : ZSTD_compress* can now compress > 4 GB in a single pass, reported by Nick Terrell

View File

@ -37,15 +37,18 @@ Other optional functionalities provided are :
- `legacy/` : source code to decompress previous versions of zstd, starting from `v0.1`.
This module also depends on `common/` and `decompress/` .
Note that it's required to compile the library with `ZSTD_LEGACY_SUPPORT = 1` .
Library compilation must include directive `ZSTD_LEGACY_SUPPORT = 1` .
The main API can be consulted in `legacy/zstd_legacy.h`.
Advanced API from each version can be found in its relevant header file.
For example, advanced API for version `v0.4` is in `zstd_v04.h` .
Advanced API from each version can be found in their relevant header file.
For example, advanced API for version `v0.4` is in `legacy/zstd_v04.h` .
#### Streaming API
#### Obsolete streaming API
Streaming is currently provided by `common/zbuff.h`.
Streaming is now provided within `zstd.h`.
Older streaming API is still provided within `common/zbuff.h`.
It is considered obsolete, and will be removed in a future version.
Consider migrating towards newer streaming API.
#### Miscellaneous
@ -55,3 +58,4 @@ The other files are not source code. There are :
- LICENSE : contains the BSD license text
- Makefile : script to compile or install zstd library (static and dynamic)
- libzstd.pc.in : for pkg-config (`make install`)
- README.md : this file

View File

@ -36,7 +36,7 @@
extern "C" {
#endif
/*====== Dependency ======*/
/*====== Dependency ======*/
#include <stddef.h> /* size_t */
@ -52,7 +52,7 @@ extern "C" {
#endif
/*====== Version ======*/
/*======= Version =======*/
#define ZSTD_VERSION_MAJOR 0
#define ZSTD_VERSION_MINOR 8
#define ZSTD_VERSION_RELEASE 1
@ -84,15 +84,13 @@ ZSTDLIB_API size_t ZSTD_compress( void* dst, size_t dstCapacity,
* potentially larger than what local system can handle as a single memory segment.
* In which case, it's necessary to use streaming mode to decompress data.
* note 2 : decompressed size is an optional field, that may not be present.
* When `return==0`, consider data to decompress could have any size.
* In which case, it's necessary to use streaming mode to decompress data,
* or rely on application's implied limits.
* (For example, it may know that its own data is necessarily cut into blocks <= 16 KB).
* When `return==0`, data to decompress can have any size.
* In which case, it's necessary to use streaming mode to decompress data.
* Optionally, application may rely on its own implied limits.
* (For example, application own data could be necessarily cut into blocks <= 16 KB).
* note 3 : decompressed size could be wrong or intentionally modified !
* Always ensure result fits within application's authorized limits !
* Each application can have its own set of conditions.
* If the intention is to decompress public data compressed by zstd command line utility,
* it is recommended to support at least 8 MB for extended compatibility.
* Each application can set its own limits.
* note 4 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameParams() to know more. */
unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize);
@ -100,7 +98,7 @@ unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize);
`compressedSize` : must be the _exact_ size of compressed input, otherwise decompression will fail.
`dstCapacity` must be equal or larger than originalSize (see ZSTD_getDecompressedSize() ).
If originalSize is unknown, and if there is no implied application-specific limitations,
it's necessary to use streaming mode to decompress data.
it's preferable to use streaming mode to decompress data.
@return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
or an errorCode if it fails (which can be tested using ZSTD_isError()) */
ZSTDLIB_API size_t ZSTD_decompress( void* dst, size_t dstCapacity,
@ -141,7 +139,7 @@ ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapa
***************************/
/*! ZSTD_compress_usingDict() :
* Compression using a predefined Dictionary (see dictBuilder/zdict.h).
* Note : This function load the dictionary, resulting in a significant startup time. */
* Note : This function load the dictionary, resulting in significant startup delay. */
ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
@ -151,7 +149,7 @@ ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
/*! ZSTD_decompress_usingDict() :
* Decompression using a predefined Dictionary (see dictBuilder/zdict.h).
* Dictionary must be identical to the one used during compression.
* Note : This function load the dictionary, resulting in a significant startup time */
* Note : This function load the dictionary, resulting in significant startup delay */
ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
@ -163,7 +161,7 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
****************************/
/*! ZSTD_createCDict() :
* Create a digested dictionary, ready to start compression operation without startup delay.
* `dict` can be released after creation */
* `dict` can be released after ZSTD_CDict creation */
typedef struct ZSTD_CDict_s ZSTD_CDict;
ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int compressionLevel);
ZSTDLIB_API size_t ZSTD_freeCDict(ZSTD_CDict* CDict);
@ -232,19 +230,19 @@ static const size_t ZSTD_skippableHeaderSize = 8; /* magic number + skippable f
typedef enum { ZSTD_fast, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, ZSTD_btlazy2, ZSTD_btopt } ZSTD_strategy; /*< from faster to stronger */
typedef struct {
unsigned windowLog; /*< largest match distance : larger == more compression, more memory needed during decompression */
unsigned chainLog; /*< fully searched segment : larger == more compression, slower, more memory (useless for fast) */
unsigned hashLog; /*< dispatch table : larger == faster, more memory */
unsigned searchLog; /*< nb of searches : larger == more compression, slower */
unsigned searchLength; /*< match length searched : larger == faster decompression, sometimes less compression */
unsigned targetLength; /*< acceptable match size for optimal parser (only) : larger == more compression, slower */
unsigned windowLog; /**< largest match distance : larger == more compression, more memory needed during decompression */
unsigned chainLog; /**< fully searched segment : larger == more compression, slower, more memory (useless for fast) */
unsigned hashLog; /**< dispatch table : larger == faster, more memory */
unsigned searchLog; /**< nb of searches : larger == more compression, slower */
unsigned searchLength; /**< match length searched : larger == faster decompression, sometimes less compression */
unsigned targetLength; /**< acceptable match size for optimal parser (only) : larger == more compression, slower */
ZSTD_strategy strategy;
} ZSTD_compressionParameters;
typedef struct {
unsigned contentSizeFlag; /*< 1: content size will be in frame header (if known). */
unsigned checksumFlag; /*< 1: will generate a 22-bits checksum at end of frame, to be used for error detection by decompressor */
unsigned noDictIDFlag; /*< 1: no dict ID will be saved into frame header (if dictionary compression) */
unsigned contentSizeFlag; /**< 1: content size will be in frame header (if known). */
unsigned checksumFlag; /**< 1: will generate a 22-bits checksum at end of frame, to be used for error detection by decompressor */
unsigned noDictIDFlag; /**< 1: no dict ID will be saved into frame header (if dictionary compression) */
} ZSTD_frameParameters;
typedef struct {
@ -327,25 +325,59 @@ ZSTDLIB_API size_t ZSTD_sizeofDCtx(const ZSTD_DCtx* dctx);
********************************************************************/
typedef struct ZSTD_readCursor_s {
const void* ptr; /* position of cursor - update to new position */
size_t size; /* remaining buffer size to read - update preserves end of buffer */
const void* ptr; /**< position of cursor - update to new position */
size_t size; /**< remaining buffer size to read - update preserves end of buffer */
} ZSTD_rCursor;
typedef struct ZSTD_writeCursor_s {
void* ptr; /* position of cursor - update to new position */
size_t size; /* remaining buffer size to write - update preserves end of buffer */
size_t nbBytesWritten; /* already written bytes - update adds bytes newly written (accumulator) */
void* ptr; /**< position of cursor - update to new position */
size_t size; /**< remaining buffer size to write - update preserves end of buffer */
size_t nbBytesWritten; /**< already written bytes - update adds bytes newly written (accumulator) */
} ZSTD_wCursor;
/*====== compression ======*/
/*-***********************************************************************
* Streaming compression - howto
*
* A ZSTD_CStream object is required to track streaming operation.
* Use ZSTD_createCStream() and ZSTD_freeCStream() to create/release resources.
* ZSTD_CStream objects can be reused multiple times on consecutive compression operations.
*
* Start by initializing ZSTD_CStream.
* Use ZSTD_initCStream() to start a new compression operation.
* Use ZSTD_initCStream_usingDict() for a compression which requires a dictionary.
*
* Use ZSTD_compressStream() repetitively to consume input stream.
* The function will automatically advance cursors.
* Note that it may not consume the entire input, in which case `input->size > 0`,
* and it's up to the caller to present again remaining data.
* @return : a hint to preferred nb of bytes to use as input for next function call (it's just a hint, to improve latency)
* or an error code, which can be tested using ZSTD_isError().
*
* At any moment, it's possible to flush whatever data remains within buffer, using ZSTD_flushStream().
* Cursor will be updated.
* Note some content might still be left within internal buffer if `output->size` is too small.
* @return : nb of bytes still present within internal buffer (0 if it's empty)
* or an error code, which can be tested using ZSTD_isError().
*
* ZSTD_endStream() instructs to finish a frame.
* It will perform a flush and write frame epilogue.
* The epilogue is required for decoders to consider a frame completed.
* Similar to ZSTD_flushStream(), it may not be able to flush the full content if `output->size` is too small.
* In which case, call again ZSTD_endStream() to complete the flush.
* @return : nb of bytes still present within internal buffer (0 if it's empty)
* or an error code, which can be tested using ZSTD_isError().
*
* *******************************************************************/
typedef struct ZSTD_CStream_s ZSTD_CStream;
ZSTD_CStream* ZSTD_createCStream(void);
size_t ZSTD_freeCStream(ZSTD_CStream* zcs);
size_t ZSTD_CStreamInSize(void); /*!< recommended size for input buffer */
size_t ZSTD_CStreamOutSize(void); /*!< recommended size for output buffer */
size_t ZSTD_CStreamInSize(void); /**< recommended size for input buffer */
size_t ZSTD_CStreamOutSize(void); /**< recommended size for output buffer */
size_t ZSTD_initCStream(ZSTD_CStream* zcs, int compressionLevel);
size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_wCursor* output, ZSTD_rCursor* input);
@ -361,6 +393,27 @@ size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dic
/*====== decompression ======*/
/*-***************************************************************************
* Streaming decompression howto
*
* A ZSTD_DStream object is required to track streaming operations.
* Use ZSTD_createDStream() and ZSTD_freeDStream() to create/release resources.
* ZSTD_DStream objects can be re-init multiple times.
*
* Use ZSTD_initDStream() to start a new decompression operation,
* or ZSTD_initDStream_usingDict() if decompression requires a dictionary.
*
* Use ZSTD_decompressStream() repetitively to consume your input.
* The function will update cursors.
* Note that it may not consume the entire input (input->size > 0),
* in which case it's up to the caller to present remaining input again.
* @return : 0 when a frame is completely decoded and fully flushed,
* 1 when there is still some data left within internal buffer to flush,
* >1 when more data is expected, with value being a suggested next input size (it's just a hint, which helps latency, any size is accepted),
* or an error code, which can be tested using ZSTD_isError().
*
* *******************************************************************************/
typedef struct ZSTD_DStream_s ZSTD_DStream;
ZSTD_DStream* ZSTD_createDStream(void);
size_t ZSTD_freeDStream(ZSTD_DStream* zds);
@ -381,9 +434,8 @@ size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t di
* Buffer-less and synchronous inner streaming functions
********************************************************************/
/* This is an advanced API, giving full control over buffer management, for users which need direct control over memory.
* But it's also a complex one, with a lot of restrictions (documented below).
* For an easier streaming API, look into common/zbuff.h
* which removes all restrictions by allocating and managing its own internal buffer */
* But it's also a complex one, with many restrictions (documented below).
* Prefer using normal streaming API for an easier experience */
ZSTDLIB_API size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel);
ZSTDLIB_API size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel);