zstd/lib
Nick Terrell 674534a700 [zstd] Fix data corruption in niche use case
* Extract the overflow correction into a helper function.
* Load the dictionary `ZSTD_CHUNKSIZE_MAX = 512 MB` bytes at a time
  and overflow correct between each chunk.

Data corruption could happen when all these conditions are true:

* You are using multithreading mode
* Your overlap size is >= 512 MB (implies window size >= 512 MB)
* You are using a strategy >= ZSTD_btlazy
* You are compressing more than 4 GB

The problem is that when loading a large dictionary we don't do
overflow correction. We can only load 512 MB at a time, and may
need to do overflow correction before each chunk.
2019-06-21 15:47:31 -07:00
..
common Spelling (#1582) 2019-04-12 11:18:11 -07:00
compress [zstd] Fix data corruption in niche use case 2019-06-21 15:47:31 -07:00
decompress [libzstd] Add a ZSTD_STATIC_ASSERT for BIT_DStream_status 2019-04-23 14:22:16 -07:00
deprecated fixed zbufftest 2017-10-19 14:06:02 -07:00
dictBuilder [dictBuilder] Be more specific than ERROR(generic) (#1616) 2019-05-22 18:57:50 -07:00
dll/example [Windows] Don't use a .def file 2019-02-19 16:52:38 -08:00
legacy [legacy] Fix ZSTDv0*_decodeSequence() 2019-04-19 11:34:52 -07:00
.gitignore makes it possible to compile libzstd in single-thread mode without zstdmt_compress.c (#819) 2017-09-11 14:09:34 -07:00
BUCK Fix buck for lib 2018-11-30 13:45:16 +00:00
libzstd.pc.in updated pkg config file 2016-11-30 11:06:58 -08:00
Makefile [libzstd] Remove ZSTDMT from the shared object 2019-04-07 18:47:52 -07:00
README.md Remove double the from README 2019-04-08 16:50:18 -07:00
zstd.h better title formatting for html documentation 2019-06-04 10:35:40 -07:00

Zstandard library files

The lib directory is split into several sub-directories, in order to make it easier to select or exclude features.

Building

Makefile script is provided, supporting Makefile conventions, including commands variables, staged install, directory variables and standard targets.

  • make : generates both static and dynamic libraries
  • make install : install libraries and headers in target system directories

libzstd default scope is pretty large, including compression, decompression, dictionary builder, and support for decoding legacy formats >= v0.5.0. The scope can be reduced on demand (see paragraph modular build).

Multithreading support

Multithreading is disabled by default when building with make. Enabling multithreading requires 2 conditions :

  • set build macro ZSTD_MULTITHREAD (-DZSTD_MULTITHREAD for gcc)
  • for POSIX systems : compile with pthread (-pthread compilation flag for gcc)

Both conditions are automatically applied when invoking make lib-mt target.

When linking a POSIX program with a multithreaded version of libzstd, note that it's necessary to request the -pthread flag during link stage.

Multithreading capabilities are exposed via the advanced API defined in lib/zstd.h.

API

Zstandard's stable API is exposed within lib/zstd.h.

Advanced API

Optional advanced features are exposed via :

  • lib/common/zstd_errors.h : translates size_t function results into a ZSTD_ErrorCode, for accurate error handling.

  • ZSTD_STATIC_LINKING_ONLY : if this macro is defined before including zstd.h, it unlocks access to the experimental API, exposed in the second part of zstd.h. All definitions in the experimental APIs are unstable, they may still change in the future, or even be removed. As a consequence, experimental definitions shall never be used with dynamic library ! Only static linking is allowed.

Modular build

It's possible to compile only a limited set of features within libzstd. The file structure is designed to make this selection manually achievable for any build system :

  • Directory lib/common is always required, for all variants.

  • Compression source code lies in lib/compress

  • Decompression source code lies in lib/decompress

  • It's possible to include only compress or only decompress, they don't depend on each other.

  • lib/dictBuilder : makes it possible to generate dictionaries from a set of samples. The API is exposed in lib/dictBuilder/zdict.h. This module depends on both lib/common and lib/compress .

  • lib/legacy : makes it possible to decompress legacy zstd formats, starting from v0.1.0. This module depends on lib/common and lib/decompress. To enable this feature, define ZSTD_LEGACY_SUPPORT during compilation. Specifying a number limits versions supported to that version onward. For example, ZSTD_LEGACY_SUPPORT=2 means : "support legacy formats >= v0.2.0". Conversely, ZSTD_LEGACY_SUPPORT=0 means "do not support legacy formats". By default, this build macro is set as ZSTD_LEGACY_SUPPORT=5. Decoding supported legacy format is a transparent capability triggered within decompression functions. It's also allowed to invoke legacy API directly, exposed in lib/legacy/zstd_legacy.h. Each version does also provide its own set of advanced API. For example, advanced API for version v0.4 is exposed in lib/legacy/zstd_v04.h .

  • While invoking make libzstd, it's possible to define build macros ZSTD_LIB_COMPRESSION, ZSTD_LIB_DECOMPRESSION, ZSTD_LIB_DICTBUILDER, and ZSTD_LIB_DEPRECATED as 0 to forgo compilation of the corresponding features. This will also disable compilation of all dependencies (eg. ZSTD_LIB_COMPRESSION=0 will also disable dictBuilder).

  • There are some additional build macros that can be used to minify the decoder.

    Zstandard often has more than one implementation of a piece of functionality, where each implementation optimizes for different scenarios. For example, the Huffman decoder has complementary implementations that decode the stream one symbol at a time or two symbols at a time. Zstd normally includes both (and dispatches between them at runtime), but by defining HUF_FORCE_DECOMPRESS_X1 or HUF_FORCE_DECOMPRESS_X2, you can force the use of one or the other, avoiding compilation of the other. Similarly, ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT and ZSTD_FORCE_DECOMPRESS_SEQUENCES_LONG force the compilation and use of only one or the other of two decompression implementations. The smallest binary is achieved by using HUF_FORCE_DECOMPRESS_X1 and ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT.

    For squeezing the last ounce of size out, you can also define ZSTD_NO_INLINE, which disables inlining, and ZSTD_STRIP_ERROR_STRINGS, which removes the error messages that are otherwise returned by ZSTD_getErrorName.

  • While invoking make libzstd, the build macro ZSTD_LEGACY_MULTITHREADED_API=1 will expose the deprecated ZSTDMT API exposed by zstdmt_compress.h in the shared library, which is now hidden by default.

Windows : using MinGW+MSYS to create DLL

DLL can be created using MinGW+MSYS with the make libzstd command. This command creates dll\libzstd.dll and the import library dll\libzstd.lib. The import library is only required with Visual C++. The header file zstd.h and the dynamic library dll\libzstd.dll are required to compile a project using gcc/MinGW. The dynamic library has to be added to linking options. It means that if a project that uses ZSTD consists of a single test-dll.c file it should be linked with dll\libzstd.dll. For example:

    gcc $(CFLAGS) -Iinclude/ test-dll.c -o test-dll dll\libzstd.dll

The compiled executable will require ZSTD DLL which is available at dll\libzstd.dll.

Deprecated API

Obsolete API on their way out are stored in directory lib/deprecated. At this stage, it contains older streaming prototypes, in lib/deprecated/zbuff.h. These prototypes will be removed in some future version. Consider migrating code towards supported streaming API exposed in zstd.h.

Miscellaneous

The other files are not source code. There are :

  • BUCK : support for buck build system (https://buckbuild.com/)
  • Makefile : make script to build and install zstd library (static and dynamic)
  • README.md : this file
  • dll/ : resources directory for Windows compilation
  • libzstd.pc.in : script for pkg-config (used in make install)