e9cfa49d0d
EndMark, the 4-byte value indicating the end of a frame, must be `0x00000000`.

Previously, it was only described as a `0-size` block. But such a definition could also encompass uncompressed blocks of size 0, which have a header value of `0x80000000`, whereas the intention was to support empty uncompressed blocks as well. They could be used as a keep-alive signal. Note that empty compressed blocks are already supported; they simply have a size of 1 instead of 0 (for the `0` token).

Unfortunately, the decoder implementation was also wrong, and would interpret a `0x80000000` block header as an EndMark. This issue evaded detection so far simply because the situation never happens: LZ4Frame always issues a clean `0x00000000` value as an EndMark, and it does not flush empty blocks.

This is fixed in this PR. The decoder can now deal with empty uncompressed blocks, and does not confuse them with EndMark. The specification is also clarified. Finally, FrameTest is updated to randomly insert empty blocks during fuzzing.
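The distinction described above can be sketched as a small header classifier. This is only an illustration, not the code from `lz4frame.c`: the names `block_kind`, `read_le32` and `classify_block_header` are hypothetical, and the sketch assumes that the block header is a 32-bit little-endian word whose highest bit flags an uncompressed block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    BLOCK_END_MARK,      /* header is exactly 0x00000000 */
    BLOCK_UNCOMPRESSED,  /* highest bit set; the size may legitimately be 0 */
    BLOCK_COMPRESSED     /* highest bit clear, non-zero size */
} block_kind;

/* Read a 32-bit little-endian value from 4 bytes. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Classify a block header and report the block size through *size. */
block_kind classify_block_header(uint32_t header, uint32_t *size)
{
    assert(size != NULL);
    *size = header & 0x7FFFFFFFu;  /* low 31 bits carry the size */

    /* Compare the whole word: testing only "size == 0" would also
     * match 0x80000000, which is an empty uncompressed block. */
    if (header == 0x00000000u) return BLOCK_END_MARK;
    if (header & 0x80000000u)  return BLOCK_UNCOMPRESSED;
    return BLOCK_COMPRESSED;
}
```

The key design point is that the EndMark test looks at all 32 bits, so a header of `0x80000000` falls through to the uncompressed case with a size of 0.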
LZ4 - Extremely fast compression
LZ4 is a lossless compression algorithm, providing compression speed > 500 MB/s per core, scalable with multi-core CPUs. It features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
Speed can be tuned dynamically, selecting an "acceleration" factor which trades compression ratio for faster speed. On the other end, a high compression derivative, LZ4_HC, is also provided, trading CPU time for improved compression ratio. All versions feature the same decompression speed.
LZ4 is also compatible with dictionary compression, both at API and CLI levels. It can ingest any input file as dictionary, though only the final 64KB are used. This capability can be combined with the Zstandard Dictionary Builder, in order to drastically improve compression performance on small files.
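To make the "only the final 64KB are used" rule concrete, here is a minimal sketch that computes which part of a dictionary buffer would be retained. It is not the library's own code: `dict_view`, `DICT_WINDOW_SIZE` and `effective_dictionary` are hypothetical names, and 64KB is assumed to mean 65536 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: "64KB" means 64 * 1024 bytes. */
#define DICT_WINDOW_SIZE ((size_t)64 * 1024)

/* Hypothetical view over the part of a dictionary that is actually used. */
typedef struct {
    const char *start;
    size_t      size;
} dict_view;

/* Return the trailing window of the dictionary: the whole buffer when it
 * is 64KB or smaller, otherwise only its last 64KB. */
dict_view effective_dictionary(const char *dict, size_t dictSize)
{
    dict_view v;
    assert(dict != NULL || dictSize == 0);

    if (dictSize > DICT_WINDOW_SIZE) {
        v.start = dict + (dictSize - DICT_WINDOW_SIZE);
        v.size  = DICT_WINDOW_SIZE;
    } else {
        v.start = dict;
        v.size  = dictSize;
    }
    return v;
}
```

In practice this means that the most valuable dictionary content should sit at the end of the file, since anything before the last 64KB has no effect.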
The LZ4 library is provided as open-source software under the BSD 2-Clause license.
Branch Policy:
- The "master" branch is considered stable, at all times.
- The "dev" branch is the one where all contributions must be merged before being promoted to master.
- If you plan to propose a patch, please commit into the "dev" branch, or its own feature branch. Direct commits to "master" are not permitted.
Benchmarks
The benchmark uses lzbench, from @inikep, compiled with GCC v8.2.0 on 64-bit Linux (Ubuntu 4.18.0-17). The reference system uses a Core i7-9700K CPU @ 4.9GHz (w/ turbo boost). The benchmark evaluates the compression of the reference Silesia Corpus in single-thread mode.
Compressor | Ratio | Compression | Decompression |
---|---|---|---|
memcpy | 1.000 | 13700 MB/s | 13700 MB/s |
LZ4 default (v1.9.0) | 2.101 | 780 MB/s | 4970 MB/s |
LZO 2.09 | 2.108 | 670 MB/s | 860 MB/s |
QuickLZ 1.5.0 | 2.238 | 575 MB/s | 780 MB/s |
Snappy 1.1.4 | 2.091 | 565 MB/s | 1950 MB/s |
Zstandard 1.4.0 -1 | 2.883 | 515 MB/s | 1380 MB/s |
LZF v3.6 | 2.073 | 415 MB/s | 910 MB/s |
zlib deflate 1.2.11 -1 | 2.730 | 100 MB/s | 415 MB/s |
LZ4 HC -9 (v1.9.0) | 2.721 | 41 MB/s | 4900 MB/s |
zlib deflate 1.2.11 -6 | 3.099 | 36 MB/s | 445 MB/s |
LZ4 is also compatible with, and optimized for, x32 mode, for which it provides additional speed performance.
Installation
make
make install # this command may require root permissions
LZ4's `Makefile` supports standard Makefile conventions, including staged installs, redirection, or command redefinition. It is compatible with parallel builds (`-j#`).
Building LZ4 - Using vcpkg
You can download and install LZ4 using the vcpkg dependency manager:
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install lz4
The LZ4 port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.
Documentation
The raw LZ4 block compression format is detailed within lz4_Block_format.
Arbitrarily long files or data streams are compressed using multiple blocks, for streaming requirements. These blocks are organized into a frame, defined in lz4_Frame_format. Interoperable versions of LZ4 must also respect the frame format.
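As a small illustration of what respecting the frame format involves, the sketch below recognizes the start of a frame from its magic number. The values come from the frame format description (`0x184D2204` for a standard frame, `0x184D2A50` to `0x184D2A5F` for skippable frames, both stored little-endian); the names `frame_kind`, `load_le32` and `identify_frame` are hypothetical and are not part of the LZ4 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_MAGIC          0x184D2204u  /* standard LZ4 frame */
#define SKIPPABLE_MAGIC_BASE 0x184D2A50u  /* skippable frames: 0x184D2A50..0x184D2A5F */

/* Hypothetical names, for illustration only. */
typedef enum { FRAME_UNKNOWN, FRAME_STANDARD, FRAME_SKIPPABLE } frame_kind;

/* Read a 32-bit little-endian value from 4 bytes. */
uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Inspect the first 4 bytes of a buffer and report which kind of frame,
 * if any, starts there. */
frame_kind identify_frame(const unsigned char *src, size_t srcSize)
{
    uint32_t magic;
    if (src == NULL || srcSize < 4) return FRAME_UNKNOWN;

    magic = load_le32(src);
    if (magic == FRAME_MAGIC) return FRAME_STANDARD;
    if ((magic & 0xFFFFFFF0u) == SKIPPABLE_MAGIC_BASE) return FRAME_SKIPPABLE;
    return FRAME_UNKNOWN;
}
```

A real decoder would go on to parse the frame descriptor and the blocks that follow; this sketch stops at the first four bytes.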
Other source versions
Beyond the C reference source, many contributors have created versions of lz4 in multiple languages (Java, C#, Python, Perl, Ruby, etc.). A list of known source ports is maintained on the LZ4 Homepage.