There was a flaw in the formula which compared literal cost with match cost: at a given position, a non-empty run of literals is going to be part of the next sequence, whereas if the position ends a previous match and immediately starts another match, the next sequence will have a litlength of zero. A litlength of zero has a non-zero cost. It follows that the literals cost should be compared to match cost + cost of litlength==0. Not doing so gave a structural advantage to matches, which were selected more often.

I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating for this flaw. It was probably created through multiple trials, settling for the best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic.

Now, the problem is: in a head-to-head comparison, the new formula is sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even the compression level has an impact).

It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants in particular). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.
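The comparison described above can be illustrated with a small, self-contained sketch. It is not zstd's parser code: the `CostModel` type, the function names, and the flat per-literal price are hypothetical simplifications chosen only to show why leaving out the litlength==0 cost tilts the decision towards matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical price model in arbitrary units. The fields and any values
 * used with them are illustrative, not zstd's real statistics. */
typedef struct {
    unsigned perLiteralCost;  /* assumed flat cost of one literal byte */
    unsigned litLength0Cost;  /* non-zero cost of a sequence whose litlength is 0 */
} CostModel;

/* Cost of covering nbLiterals bytes with literals. */
static unsigned literals_cost(const CostModel* m, unsigned nbLiterals)
{
    assert(m != NULL);
    return m->perLiteralCost * nbLiterals;
}

/* Flawed comparison: the match is priced without the litlength==0 symbol
 * that the next sequence must carry when a match starts immediately. */
static int match_wins_flawed(const CostModel* m, unsigned nbLiterals, unsigned matchCost)
{
    return matchCost < literals_cost(m, nbLiterals);
}

/* Corrected comparison: literals cost versus match cost + cost of litlength==0. */
static int match_wins_corrected(const CostModel* m, unsigned nbLiterals, unsigned matchCost)
{
    return matchCost + m->litLength0Cost < literals_cost(m, nbLiterals);
}
```

With an assumed model of 6 units per literal and 3 units for litlength==0, three literals cost 18 units. A match priced at 16 wins under the flawed comparison (16 < 18) but loses under the corrected one (16 + 3 = 19, which is not below 18).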
Zstandard library files
The lib directory is split into several sub-directories, in order to make it easier to select or exclude specific features.
Building
A Makefile script is provided, supporting the standard set of commands, directories, and variables (see https://www.gnu.org/prep/standards/html_node/Command-Variables.html).
- make : generates both static and dynamic libraries
- make install : installs libraries in default system directories
API
Zstandard's stable API is exposed within lib/zstd.h.
Advanced API
Optional advanced features are exposed via :
- lib/common/zstd_errors.h : translates size_t function results into a ZSTD_ErrorCode, for accurate error handling.
- ZSTD_STATIC_LINKING_ONLY : if this macro is defined before including zstd.h, it unlocks access to the advanced experimental API, exposed in the second part of zstd.h. These APIs shall never be used with a dynamic library! They are not "stable"; their definition may change in the future. Only static linking is allowed.
Modular build
- Directory lib/common is always required, for all variants.
- Compression source code lies in lib/compress
- Decompression source code lies in lib/decompress
- It's possible to include only compress or only decompress; they don't depend on each other.
- lib/dictBuilder : makes it possible to generate dictionaries from a set of samples. The API is exposed in lib/dictBuilder/zdict.h. This module depends on both lib/common and lib/compress.
- lib/legacy : source code to decompress older zstd formats, starting from v0.1. This module depends on lib/common and lib/decompress. To enable this feature, it's necessary to define ZSTD_LEGACY_SUPPORT = 1 during compilation. Typically, with gcc, add argument -DZSTD_LEGACY_SUPPORT=1. Using a higher number limits the number of versions supported. For example, ZSTD_LEGACY_SUPPORT=2 means : "support legacy formats starting from v0.2+". The API is exposed in lib/legacy/zstd_legacy.h. Each version also provides a (dedicated) set of advanced API. For example, the advanced API for version v0.4 is exposed in lib/legacy/zstd_v04.h.
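The meaning of the ZSTD_LEGACY_SUPPORT value can be restated as a tiny predicate. This is only a sketch of the rule described above, not code from the library: the function name `legacy_format_enabled` is hypothetical, and treating a value of 0 as "legacy support disabled" is an assumption.

```c
#include <assert.h>

/* Hypothetical helper, not part of zstd.
 * legacySupport : the value given to ZSTD_LEGACY_SUPPORT at compile time
 *                 (0 is assumed here to mean "legacy support disabled").
 * formatMinor   : N for a legacy format v0.N (legacy formats start at v0.1).
 * Returns 1 if a build with that setting would accept the format, 0 otherwise. */
static int legacy_format_enabled(unsigned legacySupport, unsigned formatMinor)
{
    assert(formatMinor >= 1);        /* v0.1 is the oldest legacy format */
    if (legacySupport == 0) return 0;
    /* "support legacy formats starting from v0.<legacySupport>+" */
    return formatMinor >= legacySupport;
}
```

For instance, a build compiled with -DZSTD_LEGACY_SUPPORT=2 would, under this reading, accept v0.4 frames but reject v0.1 frames, while -DZSTD_LEGACY_SUPPORT=1 accepts both.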
Multithreading support
Multithreading is disabled by default when building with make.
Enabling multithreading requires 2 conditions :
- set macro ZSTD_MULTITHREAD
- on POSIX systems : compile with pthread (-pthread compilation flag for gcc, for example)
Both conditions are automatically triggered by invoking the make lib-mt target.
Note that, when linking a POSIX program with a multithreaded version of libzstd, it's necessary to pass the -pthread flag at link stage.
Multithreading capabilities are exposed via :
- private API lib/compress/zstdmt_compress.h. Symbols defined in this header are currently exposed in libzstd, hence usable. Note however that this API is planned to be locked and remain strictly internal in the future.
- advanced API ZSTD_compress_generic(), defined in lib/zstd.h, experimental section. This API is still considered experimental, but is designed to be labelled "stable" at some point in the future. It's the recommended entry point for multi-threading operations.
Windows : using MinGW+MSYS to create DLL
A DLL can be created using MinGW+MSYS with the make libzstd command.
This command creates dll\libzstd.dll and the import library dll\libzstd.lib.
The import library is only required with Visual C++.
The header file zstd.h and the dynamic library dll\libzstd.dll are required to compile a project using gcc/MinGW.
The dynamic library has to be added to the linking options.
This means that if a project that uses ZSTD consists of a single test-dll.c file, it should be linked with dll\libzstd.dll. For example:
gcc $(CFLAGS) -Iinclude/ test-dll.c -o test-dll dll\libzstd.dll
The compiled executable will require the ZSTD DLL, which is available at dll\libzstd.dll.
Deprecated API
Obsolete APIs on their way out are stored in directory lib/deprecated.
At this stage, it contains older streaming prototypes, in lib/deprecated/zbuff.h.
Presence in this directory is temporary.
These prototypes will be removed in some future version.
Consider migrating code towards the supported streaming API exposed in zstd.h.
Miscellaneous
The other files are not source code. There are :
- LICENSE : contains the BSD license text
- Makefile : make script to build and install zstd library (static and dynamic)
- BUCK : support for buck build system (https://buckbuild.com/)
- libzstd.pc.in : for pkg-config (used in make install)
- README.md : this file