Merge pull request #563 from lz4/docDict

updated documentation for dictionary compression
Yann Collet 2018-09-06 12:43:29 -07:00 committed by GitHub
commit 0f08c22c31
9 changed files with 53 additions and 27 deletions

NEWS

@@ -1,3 +1,9 @@
+v1.8.3
+fix : data corruption for files > 64KB at level 9 under specific conditions (#560)
+cli : new command --fast, by @jennifermliu
+build : added Haiku target, by @fbrosson
+doc : updated documentation regarding dictionary compression
 v1.8.2
 perf: *much* faster dictionary compression on small files, by @felixhandte
 perf: improved decompression speed and binary size, by Alexey Tourbin (@svpv)


@@ -2,18 +2,23 @@ LZ4 - Extremely fast compression
 ================================
 LZ4 is lossless compression algorithm,
-providing compression speed at 400 MB/s per core,
+providing compression speed > 500 MB/s per core,
 scalable with multi-cores CPU.
 It features an extremely fast decoder,
 with speed in multiple GB/s per core,
 typically reaching RAM speed limits on multi-core systems.
 Speed can be tuned dynamically, selecting an "acceleration" factor
-which trades compression ratio for more speed up.
+which trades compression ratio for faster speed.
 On the other end, a high compression derivative, LZ4_HC, is also provided,
 trading CPU time for improved compression ratio.
 All versions feature the same decompression speed.
+LZ4 is also compatible with [dictionary compression](https://github.com/facebook/zstd#the-case-for-small-data-compression),
+and can ingest any input file as dictionary,
+including those created by [Zstandard Dictionary Builder](https://github.com/facebook/zstd/blob/v1.3.5/programs/zstd.1.md#dictionary-builder).
+(note: only the final 64KB are used).
 LZ4 library is provided as open-source software using BSD 2-Clause license.
@@ -67,8 +72,8 @@ in single-thread mode.
 [zlib]: http://www.zlib.net/
 [Zstandard]: http://www.zstd.net/
-LZ4 is also compatible and well optimized for x32 mode,
-for which it provides some additional speed performance.
+LZ4 is also compatible and optimized for x32 mode,
+for which it provides additional speed performance.
 Installation
@@ -76,7 +81,7 @@ Installation
 ```
 make
-make install # this command may require root access
+make install # this command may require root permissions
 ```
 LZ4's `Makefile` supports standard [Makefile conventions],
@@ -94,10 +99,10 @@ Documentation
 The raw LZ4 block compression format is detailed within [lz4_Block_format].
-To compress an arbitrarily long file or data stream, multiple blocks are required.
-Organizing these blocks and providing a common header format to handle their content
-is the purpose of the Frame format, defined into [lz4_Frame_format].
-Interoperable versions of LZ4 must respect this frame format.
+Arbitrarily long files or data streams are compressed using multiple blocks,
+for streaming requirements. These blocks are organized into a frame,
+defined into [lz4_Frame_format].
+Interoperable versions of LZ4 must also respect the frame format.
 [lz4_Block_format]: doc/lz4_Block_format.md
 [lz4_Frame_format]: doc/lz4_Frame_format.md
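
The README text added in this commit notes that a dictionary can be any file, but only its final 64KB are used. A minimal C sketch of that rule follows. The names `dict_view`, `DICT_WINDOW_SIZE` and `dict_effective_tail` are hypothetical and are not part of the LZ4 API; the sketch only models which bytes of a dictionary buffer would matter.

```c
#include <assert.h>
#include <stddef.h>

/* 64 KB: the README note says only the final 64KB of a dictionary are used. */
#define DICT_WINDOW_SIZE ((size_t)64 * 1024)

/* Hypothetical type (not an LZ4 type): the part of a dictionary that matters. */
typedef struct {
    const char* start;  /* first byte that would be used */
    size_t      size;   /* number of bytes that would be used */
} dict_view;

/* Hypothetical helper (not an LZ4 function): returns the trailing window of
 * a dictionary buffer, or the whole buffer when it is 64 KB or smaller. */
static dict_view dict_effective_tail(const char* dict, size_t dictSize)
{
    dict_view v;
    assert(dict != NULL || dictSize == 0);
    if (dictSize > DICT_WINDOW_SIZE) {
        v.start = dict + (dictSize - DICT_WINDOW_SIZE);
        v.size  = DICT_WINDOW_SIZE;
    } else {
        v.start = dict;
        v.size  = dictSize;
    }
    return v;
}
```

Under this model, a 100000-byte dictionary contributes only its last 65536 bytes, while a 1000-byte dictionary is used in full.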


@@ -1,10 +1,10 @@
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-<title>1.8.2 Manual</title>
+<title>1.8.3 Manual</title>
 </head>
 <body>
-<h1>1.8.2 Manual</h1>
+<h1>1.8.3 Manual</h1>
 <hr>
 <a name="Contents"></a><h2>Contents</h2>
 <ol>
@@ -179,7 +179,7 @@ int LZ4_freeStream (LZ4_stream_t* streamPtr);
 'dst' buffer must be already allocated.
 If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster.
-Important : The previous 64KB of compressed data is assumed to remain present and unmodified in memory!
+Important : The previous 64KB of source data is assumed to remain present and unmodified in memory!
 Special 1 : When input is a double-buffer, they can have any size, including < 64 KB.
 Make sure that buffers are separated by at least one byte.


@@ -1,10 +1,10 @@
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-<title>1.8.2 Manual</title>
+<title>1.8.3 Manual</title>
 </head>
 <body>
-<h1>1.8.2 Manual</h1>
+<h1>1.8.3 Manual</h1>
 <hr>
 <a name="Contents"></a><h2>Contents</h2>
 <ol>


@@ -93,7 +93,7 @@ extern "C" {
 /*------ Version ------*/
 #define LZ4_VERSION_MAJOR 1 /* for breaking interface changes */
 #define LZ4_VERSION_MINOR 8 /* for new (non-breaking) interface capabilities */
-#define LZ4_VERSION_RELEASE 2 /* for tweaks, bug-fixes, or development */
+#define LZ4_VERSION_RELEASE 3 /* for tweaks, bug-fixes, or development */
 #define LZ4_VERSION_NUMBER (LZ4_VERSION_MAJOR *100*100 + LZ4_VERSION_MINOR *100 + LZ4_VERSION_RELEASE)
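
The `LZ4_VERSION_NUMBER` macro in the hunk above packs the three version components into one integer, so the bump from release 2 to release 3 moves the value from 10802 to 10803. The sketch below copies the macro values from the new side of the hunk; `pack_version` and `lz4_version_at_least` are hypothetical helpers added for illustration and do not exist in `lz4.h`.

```c
#include <assert.h>

/* Values as they appear on the new side of the hunk. */
#define LZ4_VERSION_MAJOR    1
#define LZ4_VERSION_MINOR    8
#define LZ4_VERSION_RELEASE  3
#define LZ4_VERSION_NUMBER (LZ4_VERSION_MAJOR *100*100 + LZ4_VERSION_MINOR *100 + LZ4_VERSION_RELEASE)

/* Hypothetical helper (not in lz4.h): packs a version triple the same way.
 * The encoding is only unambiguous while minor and release stay below 100. */
static int pack_version(int major, int minor, int release)
{
    assert(minor >= 0 && minor < 100);
    assert(release >= 0 && release < 100);
    return major * 100 * 100 + minor * 100 + release;
}

/* Hypothetical helper (not in lz4.h): non-zero when the compiled-in version
 * is at least major.minor.release. */
static int lz4_version_at_least(int major, int minor, int release)
{
    return LZ4_VERSION_NUMBER >= pack_version(major, minor, release);
}
```

A caller could, for example, guard code that relies on a fix shipped in this release with `lz4_version_at_least(1, 8, 3)`.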


@@ -1,5 +1,5 @@
 .
-.TH "LZ4" "1" "2018-01-13" "lz4 1.8.1" "User Commands"
+.TH "LZ4" "1" "September 2018" "lz4 1.8.3" "User Commands"
 .
 .SH "NAME"
 \fBlz4\fR \- lz4, unlz4, lz4cat \- Compress or decompress \.lz4 files
@@ -115,7 +115,11 @@ Benchmark mode, using \fB#\fR compression level\.
 .
 .TP
 \fB\-#\fR
-Compression level, with # being any value from 1 to 16\. Higher values trade compression speed for compression ratio\. Values above 16 are considered the same as 16\. Recommended values are 1 for fast compression (default), and 9 for high compression\. Speed/compression trade\-off will vary depending on data to compress\. Decompression speed remains fast at all settings\.
+Compression level, with # being any value from 1 to 12\. Higher values trade compression speed for compression ratio\. Values above 12 are considered the same as 12\. Recommended values are 1 for fast compression (default), and 9 for high compression\. Speed/compression trade\-off will vary depending on data to compress\. Decompression speed remains fast at all settings\.
+.
+.TP
+\fB\-D dictionaryName\fR
+Compress, decompress or benchmark using dictionary \fIdictionaryName\fR\. Compression and decompression must use the same dictionary to be compatible\. Using a different dictionary during decompression will either abort due to decompression error, or generate a checksum error\.
 .
 .TP
 \fB\-f\fR \fB\-\-[no\-]force\fR
@@ -151,6 +155,10 @@ Block size [4\-7](default : 7)
 Block Dependency (improves compression ratio on small blocks)
 .
 .TP
+\fB\-\-fast[=#]\fR
+switch to ultra\-fast compression levels\. If \fB=#\fR is not present, it defaults to \fB1\fR\. The higher the value, the faster the compression speed, at the cost of some compression ratio\. This setting overwrites compression level if one was set previously\. Similarly, if a compression level is set after \fB\-\-fast\fR, it overrides it\.
+.
+.TP
 \fB\-\-[no\-]frame\-crc\fR
 Select frame checksum (default:enabled)
 .
@@ -214,7 +222,7 @@ Benchmark multiple compression levels, from b# to e# (included)
 .
 .TP
 \fB\-i#\fR
-Minimum evaluation in seconds [1\-9] (default : 3)
+Minimum evaluation time in seconds [1\-9] (default : 3)
 .
 .SH "BUGS"
 Report bugs at: https://github\.com/lz4/lz4/issues


@@ -125,6 +125,19 @@ only the latest one will be applied.
 Speed/compression trade-off will vary depending on data to compress.
 Decompression speed remains fast at all settings.
+* `--fast[=#]`:
+switch to ultra-fast compression levels.
+The higher the value, the faster the compression speed, at the cost of some compression ratio.
+If `=#` is not present, it defaults to `1`.
+This setting overrides compression level if one was set previously.
+Similarly, if a compression level is set after `--fast`, it overrides it.
+* `-D dictionaryName`:
+Compress, decompress or benchmark using dictionary _dictionaryName_.
+Compression and decompression must use the same dictionary to be compatible.
+Using a different dictionary during decompression will either
+abort due to decompression error, or generate a checksum error.
 * `-f` `--[no-]force`:
 This option has several effects:
@@ -156,13 +169,6 @@ only the latest one will be applied.
 * `-BD`:
 Block Dependency (improves compression ratio on small blocks)
-* `--fast[=#]`:
-switch to ultra-fast compression levels.
-If `=#` is not present, it defaults to `1`.
-The higher the value, the faster the compression speed, at the cost of some compression ratio.
-This setting overwrites compression level if one was set previously.
-Similarly, if a compression level is set after `--fast`, it overrides it.
 * `--[no-]frame-crc`:
 Select frame checksum (default:enabled)
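
The option text in this commit describes three rules: a compression level above 12 behaves as 12, `--fast` replaces any level set before it, and a level set after `--fast` replaces it in turn. The C sketch below models only those documented rules. It is not taken from `lz4cli.c`; the types `cli_opt` and `setting` and the function `resolve_level` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL 12  /* per the man page text: values above 12 behave as 12 */

/* Hypothetical model of the two options as they appear on the command line. */
typedef enum { OPT_LEVEL, OPT_FAST } opt_kind;
typedef struct { opt_kind kind; int value; } cli_opt;

/* Hypothetical result: fast != 0 means "--fast=value", else "level value". */
typedef struct { int fast; int value; } setting;

/* Walks the options left to right; the latest one wins. */
static setting resolve_level(const cli_opt* opts, size_t n)
{
    setting s = { 0, 1 };  /* documented default: compression level 1 */
    size_t i;
    assert(opts != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (opts[i].kind == OPT_FAST) {
            s.fast  = 1;
            s.value = opts[i].value;
        } else {
            s.fast  = 0;
            s.value = opts[i].value > MAX_LEVEL ? MAX_LEVEL : opts[i].value;
        }
    }
    return s;
}
```

In this model, a level 9 option followed by `--fast=3` resolves to fast mode 3, and `--fast=3` followed by a level 16 option resolves to level 12.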


@@ -110,7 +110,7 @@ static int usage(const char* exeName)
 DISPLAY( " -9 : High compression \n");
 DISPLAY( " -d : decompression (default for %s extension)\n", LZ4_EXTENSION);
 DISPLAY( " -z : force compression \n");
-DISPLAY( " -D FILE: use dictionary in FILE \n");
+DISPLAY( " -D FILE: use FILE as dictionary \n");
 DISPLAY( " -f : overwrite output without prompting \n");
 DISPLAY( " -k : preserve source files(s) (default) \n");
 DISPLAY( "--rm : remove source file(s) after successful de/compression \n");


@@ -117,7 +117,8 @@ clean:
 fullbench$(EXT) fullbench32$(EXT) \
 fuzzer$(EXT) fuzzer32$(EXT) \
 frametest$(EXT) frametest32$(EXT) \
-fasttest$(EXT) datagen$(EXT) checkTag$(EXT)
+fasttest$(EXT) roundTripTest$(EXT) \
+datagen$(EXT) checkTag$(EXT)
 @rm -fR $(TESTDIR)
 @echo Cleaning completed