zstd/lib
Yann Collet 281f06e01f saves 3-bytes on small input with streaming API
zstd streaming API was adding a null-block at end of frame for small input.

Reason is : on small input, a single block is enough.
ZSTD_CStream would size its input buffer to expect a single block of this size,
automatically triggering a flush on reaching this size.

Unfortunately, that last byte was generally received before the "end" directive (at least in `fileio`).
The later "end" directive would force the creation of a 3-bytes last block to indicate end of frame.
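
To make the 3-byte cost concrete, here is a minimal sketch of the block header layout described in the Zstandard frame format (little-endian; bit 0 = Last_Block, bits 1-2 = Block_Type, bits 3-23 = Block_Size). The helper name `toy_writeBlockHeader` is hypothetical and not part of zstd, and the sketch assumes the end-of-frame block is an empty raw block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Block types as listed in the Zstandard frame format. */
enum { BT_RAW = 0, BT_RLE = 1, BT_COMPRESSED = 2 };

/* Hypothetical helper (not a zstd function): writes a 3-byte block header
 * into dst and returns the number of bytes written.
 * Layout, little-endian: bit 0 = Last_Block, bits 1-2 = Block_Type,
 * bits 3-23 = Block_Size. */
static size_t toy_writeBlockHeader(uint8_t dst[3], int lastBlock,
                                   unsigned blockType, uint32_t blockSize)
{
    assert(blockType <= BT_COMPRESSED);
    assert(blockSize < (1u << 21));
    uint32_t const h = (uint32_t)(lastBlock != 0)
                     | ((uint32_t)blockType << 1)
                     | (blockSize << 3);
    dst[0] = (uint8_t)(h & 0xFF);
    dst[1] = (uint8_t)((h >> 8) & 0xFF);
    dst[2] = (uint8_t)((h >> 16) & 0xFF);
    return 3;
}
```

With `lastBlock = 1`, `BT_RAW` and a size of 0, the three bytes are `01 00 00`: a block that carries no payload and only marks the end of the frame.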

The solution is to not flush automatically, which is btw the expected behavior.
It happens in this case because blocksize is defined with exactly the same size as input.
Just adding one-byte is enough to stop triggering the automatic flush.
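
The effect of that extra byte can be illustrated with a toy model. This is only a sketch of the behavior described above, not the actual ZSTD_CStream code: every name below (`ToyStream`, `toy_feed`, `toy_end`, `toy_emptyBlocksFor`) is hypothetical, and the model assumes that a full input buffer triggers a flush and that the "end" directive always emits a last block.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in zstd. */
typedef struct {
    size_t   capacity;     /* input buffer size */
    size_t   filled;       /* bytes currently buffered */
    unsigned blocks;       /* blocks emitted so far */
    unsigned emptyBlocks;  /* blocks emitted with no payload */
} ToyStream;

static void toy_emitBlock(ToyStream* s)
{
    s->blocks++;
    if (s->filled == 0) s->emptyBlocks++;
    s->filled = 0;
}

/* Buffers n bytes; a full buffer triggers an automatic flush. */
static void toy_feed(ToyStream* s, size_t n)
{
    assert(s->capacity > 0);
    while (n > 0) {
        size_t const room = s->capacity - s->filled;
        size_t const take = (n < room) ? n : room;
        s->filled += take;
        n -= take;
        if (s->filled == s->capacity) toy_emitBlock(s);
    }
}

/* The "end" directive always emits a last block, even an empty one. */
static void toy_end(ToyStream* s) { toy_emitBlock(s); }

/* Counts the empty blocks produced when srcSize bytes arrive first
 * and the "end" directive arrives afterwards. */
static unsigned toy_emptyBlocksFor(size_t srcSize, size_t capacity)
{
    ToyStream s = { capacity, 0, 0, 0 };
    toy_feed(&s, srcSize);
    toy_end(&s);
    return s.emptyBlocks;
}
```

In this model, `toy_emptyBlocksFor(100, 100)` yields one empty block (the automatic flush consumed all data before the end directive), while `toy_emptyBlocksFor(100, 101)` yields none.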

I initially looked at another solution, solving the problem directly in the compression context.
But it felt awkward.
Now, the underlying compression API `ZSTD_compressContinue()` would take the decision to close a frame
on reaching its expected end (`pledgedSrcSize`).
This feels awkward, a responsibility over-reach, beyond the definition of this API.
ZSTD_compressContinue() is clearly documented as a guaranteed flush,
with ZSTD_compressEnd() generating a guaranteed end.

I faced a similar issue when trying to port a similar mechanism to the higher streaming layer.
Having ZSTD_CStream end a frame automatically on reaching `pledgedSrcSize` can surprise the caller,
since it did not explicitly request an end of frame.
The only sensible action remaining after that is to end the frame with no additional input.
This adds additional logic in the ZSTD_CStream state to check this condition.
Plus some potential confusion on the meaning of ZSTD_endStream() with no additional input (ending confirmation ? new 0-size frame ?)

In the end, just enlarging input buffer by 1 byte feels the least intrusive change.
It's also a contract remaining inside the streaming layer, so the logic is contained in this part of the code.

The patch also introduces a new test checking that size of small frame is as expected, without additional 3-bytes null block.
2017-12-14 11:47:02 -08:00
common no longer supported starting C++17 2017-12-04 18:00:53 -08:00
compress saves 3-bytes on small input with streaming API 2017-12-14 11:47:02 -08:00
decompress re-added test case 2017-12-12 14:01:54 -08:00
deprecated fixed zbufftest 2017-10-19 14:06:02 -07:00
dictBuilder no longer supported starting C++17 2017-12-04 18:00:53 -08:00
dll fixed a bunch of headers after license change (#825) 2017-08-31 11:24:54 -07:00
legacy no longer supported starting C++17 2017-12-04 18:00:53 -08:00
.gitignore makes it possible to compile libzstd in single-thread mode without zstdmt_compress.c (#819) 2017-09-11 14:09:34 -07:00
BUCK Update BUCK files 2017-10-25 12:47:57 -07:00
libzstd.pc.in updated pkg config file 2016-11-30 11:06:58 -08:00
Makefile makes it possible to compile libzstd in single-thread mode without zstdmt_compress.c (#819) 2017-09-11 14:09:34 -07:00
README.md programs/Makefile : better support for GNU conventions 2017-09-06 16:53:59 -07:00
zstd.h re-added test case 2017-12-12 14:01:54 -08:00

Zstandard library files

The lib directory is split into several sub-directories, in order to make it easier to select or exclude specific features.

Building

A Makefile script is provided, supporting the standard set of commands, directories, and variables (see https://www.gnu.org/prep/standards/html_node/Command-Variables.html).

  • make : generates both static and dynamic libraries
  • make install : installs libraries in default system directories

API

Zstandard's stable API is exposed within lib/zstd.h.

Advanced API

Optional advanced features are exposed via :

  • lib/common/zstd_errors.h : translates size_t function results into a ZSTD_ErrorCode, for accurate error handling.
  • ZSTD_STATIC_LINKING_ONLY : if this macro is defined before including zstd.h, it unlocks access to the advanced experimental API, exposed in the second part of zstd.h. These APIs shall never be used with a dynamic library ! They are not "stable", and their definition may change in the future. Only static linking is allowed.
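
As an illustration of how a size_t result can carry an error code, here is a minimal sketch of the general idea, assuming errors are encoded as values near the top of the size_t range. All names and enum values below (`Toy_ErrorCode`, `toy_makeError`, `toy_isError`, `toy_getErrorCode`) are made up for the example; real code should rely on the functions declared in zstd.h and lib/common/zstd_errors.h.

```c
#include <stddef.h>

/* Hypothetical error enum; the numeric values are illustrative only. */
typedef enum {
    toy_error_no_error = 0,
    toy_error_GENERIC = 1,
    toy_error_dstSize_tooSmall = 70,
    toy_error_maxCode = 120
} Toy_ErrorCode;

/* Encodes an error code as a size_t close to the maximum value. */
static size_t toy_makeError(Toy_ErrorCode code)
{
    return (size_t)0 - (size_t)code;
}

/* A result is an error when it lies above the encoded maxCode. */
static int toy_isError(size_t result)
{
    return result > toy_makeError(toy_error_maxCode);
}

/* Translates a size_t result back into an error code. */
static Toy_ErrorCode toy_getErrorCode(size_t result)
{
    if (!toy_isError(result)) return toy_error_no_error;
    return (Toy_ErrorCode)((size_t)0 - result);
}
```

A caller would first test the result with `toy_isError()`, then switch on `toy_getErrorCode()` to react differently to, say, an undersized destination buffer versus a generic failure.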

Modular build

  • Directory lib/common is always required, for all variants.
  • Compression source code lies in lib/compress
  • Decompression source code lies in lib/decompress
  • It's possible to include only compress or only decompress; they don't depend on each other.
  • lib/dictBuilder : makes it possible to generate dictionaries from a set of samples. The API is exposed in lib/dictBuilder/zdict.h. This module depends on both lib/common and lib/compress .
  • lib/legacy : source code to decompress older zstd formats, starting from v0.1. This module depends on lib/common and lib/decompress. To enable this feature, it's necessary to define ZSTD_LEGACY_SUPPORT = 1 during compilation. Typically, with gcc, add argument -DZSTD_LEGACY_SUPPORT=1. Using a higher number limits the number of versions supported. For example, ZSTD_LEGACY_SUPPORT=2 means : "support legacy formats starting from v0.2+". The API is exposed in lib/legacy/zstd_legacy.h. Each version also provides a (dedicated) set of advanced API. For example, advanced API for version v0.4 is exposed in lib/legacy/zstd_v04.h .
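
The meaning of the ZSTD_LEGACY_SUPPORT value can be sketched as a small predicate. This is an illustration only: `toy_legacyFormatEnabled` and `TOY_LAST_LEGACY_MINOR` are hypothetical names, and the sketch assumes that a value of 0 disables legacy support and that the legacy formats span v0.1 to v0.7.

```c
/* Assumption: v0.7 is the last format handled by the legacy module. */
#define TOY_LAST_LEGACY_MINOR 7

/* Hypothetical helper, not part of zstd: returns 1 when legacy format
 * v0.<minor> would be decodable for a given ZSTD_LEGACY_SUPPORT value. */
static int toy_legacyFormatEnabled(int legacySupport, int minor)
{
    if (legacySupport <= 0) return 0;                 /* assumed: legacy disabled */
    if (minor < 1 || minor > TOY_LAST_LEGACY_MINOR) return 0;
    return minor >= legacySupport;                    /* "starting from v0.N+" */
}
```

For instance, with a value of 2, v0.1 frames are rejected while v0.2 and v0.4 frames are accepted, which is why a higher number produces a smaller library.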

Multithreading support

Multithreading is disabled by default when building with make. Enabling multithreading requires 2 conditions :

  • set macro ZSTD_MULTITHREAD
  • on POSIX systems : compile with pthread (-pthread compilation flag for gcc for example)

Both conditions are automatically triggered by invoking the make lib-mt target. Note that, when linking a POSIX program with a multithreaded version of libzstd, it's necessary to pass the -pthread flag during the link stage.

Multithreading capabilities are exposed via :

  • private API lib/compress/zstdmt_compress.h. Symbols defined in this header are currently exposed in libzstd, hence usable. Note however that this API is planned to be locked and remain strictly internal in the future.
  • advanced API ZSTD_compress_generic(), defined in lib/zstd.h, experimental section. This API is still considered experimental, but is designed to be labelled "stable" at some point in the future. It's the recommended entry point for multi-threading operations.

Windows : using MinGW+MSYS to create DLL

A DLL can be created using MinGW+MSYS with the make libzstd command. This command creates dll\libzstd.dll and the import library dll\libzstd.lib. The import library is only required with Visual C++. The header file zstd.h and the dynamic library dll\libzstd.dll are required to compile a project using gcc/MinGW. The dynamic library has to be added to the linking options. This means that if a project that uses ZSTD consists of a single test-dll.c file, it should be linked with dll\libzstd.dll. For example:

    gcc $(CFLAGS) -Iinclude/ test-dll.c -o test-dll dll\libzstd.dll

The compiled executable will require the ZSTD DLL, which is available at dll\libzstd.dll.

Deprecated API

Obsolete APIs on their way out are stored in directory lib/deprecated. At this stage, it contains older streaming prototypes, in lib/deprecated/zbuff.h. Presence in this directory is temporary. These prototypes will be removed in some future version. Consider migrating code towards the supported streaming API exposed in zstd.h.

Miscellaneous

The other files are not source code. They are :

  • LICENSE : contains the BSD license text
  • Makefile : make script to build and install zstd library (static and dynamic)
  • BUCK : support for buck build system (https://buckbuild.com/)
  • libzstd.pc.in : for pkg-config (used in make install)
  • README.md : this file