\"
|
|
\" zstd.1: This is a manual page for 'zstd' program. This file is part of the
|
|
\" zstd <http://www.zstd.net/> project.
|
|
\" Author: Yann Collet
|
|
\"
|
|
|
|
\" No hyphenation
|
|
.hy 0
|
|
.nr HY 0
|
|
|
|
.TH zstd "1" "2015-08-22" "zstd" "User Commands"
|
|
.SH NAME
|
|
\fBzstd, unzstd, zstdcat\fR - Compress or decompress .zst files
|
|
|
|
.SH SYNOPSIS
|
|
.TP 5
|
|
\fBzstd\fR [\fBOPTIONS\fR] [-|INPUT-FILE] [-o <OUTPUT-FILE>]
|
|
.PP
|
|
.B unzstd
|
|
is equivalent to
|
|
.BR "zstd \-d"
|
|
.br
|
|
.B zstdcat
|
|
is equivalent to
|
|
.BR "zstd \-dcf"
|
|
.br
|
|
|
|
.SH DESCRIPTION
|
|
.PP
|
|
\fBzstd\fR is a fast lossless compression algorithm
|
|
and data compression tool,
|
|
with command line syntax similar to \fBgzip\fR(1) and \fBxz\fR(1).
|
|
It is based on the \fBLZ77\fR family, with further FSE & huff0 entropy stages.
|
|
\fBzstd\fR offers highly configurable compression speed,
|
|
with fast modes at > 200 MB/s per core,
|
|
and strong modes nearing lzma compression ratios.
|
|
It also features a very fast decoder, with speeds > 500 MB/s per core.
|
|
|
|
\fBzstd\fR command line syntax is generally similar to gzip,
but features the following differences:
.IP \(bu 3
Source files are preserved by default.
It's possible to remove them automatically by using the \fB--rm\fR option.
.IP \(bu 3
When compressing a single file, \fBzstd\fR displays progress notifications and a result summary by default.
Use \fB-q\fR to turn them off.
|
|
|
|
.PP
|
|
.B zstd
|
|
compresses or decompresses each
|
|
.I file
|
|
according to the selected operation mode.
|
|
If no
|
|
.I files
|
|
are given or
|
|
.I file
|
|
is
|
|
.BR \- ,
|
|
.B zstd
|
|
reads from standard input and writes the processed data
|
|
to standard output.
|
|
.B zstd
|
|
will refuse (display an error and skip the
|
|
.IR file )
|
|
to write compressed data to standard output if it is a terminal.
|
|
Similarly,
|
|
.B zstd
|
|
will refuse to read compressed data
|
|
from standard input if it is a terminal.
|
|
|
|
.PP
|
|
Unless
|
|
.B \-\-stdout
|
|
or
|
|
.B \-o
|
|
is specified,
|
|
.I files
|
|
are written to a new file whose name is derived from the source
|
|
.I file
|
|
name:
|
|
.IP \(bu 3
|
|
When compressing, the suffix
|
|
.B .zst
|
|
is appended to the source filename to get the target filename.
|
|
.IP \(bu 3
|
|
When decompressing, the
|
|
.B .zst
|
|
suffix is removed from the filename to get the target filename.
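.PP
As a minimal illustration of the naming rules above (the file names
.I file.txt
and
.I f.zst
are placeholders chosen for this sketch):
.PP
.nf
zstd file.txt              # writes file.txt.zst, keeps file.txt
zstd \-d file.txt.zst       # writes file.txt, keeps file.txt.zst
zstd \-c file.txt > f.zst   # output name chosen by the shell redirection
.fi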
|
|
|
|
.SS "Concatenation with .zst files"
|
|
It is possible to concatenate
|
|
.B .zst
|
|
files as is.
|
|
.B zstd
|
|
will decompress such files as if they were a single
|
|
.B .zst
|
|
file.
|
|
|
|
|
|
|
|
.SH OPTIONS
|
|
|
|
.
|
|
.SS "Integer suffixes and special values"
|
|
In most places where an integer argument is expected,
|
|
an optional suffix is supported to easily indicate large integers.
|
|
There must be no space between the integer and the suffix.
|
|
.TP
|
|
.B KiB
|
|
Multiply the integer by 1,024 (2^10).
|
|
.BR Ki ,
|
|
.BR K ,
|
|
and
|
|
.B KB
|
|
are accepted as synonyms for
|
|
.BR KiB .
|
|
.TP
|
|
.B MiB
|
|
Multiply the integer by 1,048,576 (2^20).
|
|
.BR Mi ,
|
|
.BR M ,
|
|
and
|
|
.B MB
|
|
are accepted as synonyms for
|
|
.BR MiB .
|
|
|
|
.
|
|
.SS "Operation mode"
|
|
If multiple operation mode options are given,
|
|
the last one takes effect.
|
|
.TP
|
|
.BR \-z ", " \-\-compress
|
|
Compress.
|
|
This is the default operation mode when no operation mode option
|
|
is specified and no other operation mode is implied from
|
|
the command name (for example,
|
|
.B unzstd
|
|
implies
|
|
.BR \-\-decompress ).
|
|
.TP
|
|
.BR \-d ", " \-\-decompress ", " \-\-uncompress
|
|
Decompress.
|
|
.TP
|
|
.BR \-t ", " \-\-test
|
|
Test the integrity of compressed
|
|
.IR files .
|
|
This option is equivalent to
|
|
.B "\-\-decompress \-\-stdout"
|
|
except that the decompressed data is discarded instead of being
|
|
written to standard output.
|
|
No files are created or removed.
|
|
.TP
|
|
.B \-b#
|
|
benchmark file(s) using compression level #
|
|
.TP
|
|
.B \--train FILEs
|
|
use FILEs as training set to create a dictionary. The training set should contain a lot of small files (> 100).
|
|
|
|
.
|
|
.SS "Operation modifiers"
|
|
.TP
|
|
.B \-#
|
|
# compression level [1-19] (default: 3)
|
|
.TP
|
|
.BR \--ultra
|
|
unlocks high compression levels 20+ (maximum 22), using a lot more memory.
|
|
Note that decompression will also require more memory when using these levels.
|
|
.TP
|
|
.B \-D file
|
|
use `file` as Dictionary to compress or decompress FILE(s)
|
|
.TP
|
|
.BR \--no-dictID
|
|
do not store dictionary ID within frame header (dictionary compression).
|
|
The decoder will have to rely on implicit knowledge about which dictionary to use;
it won't be able to check whether it is correct.
|
|
.TP
|
|
.B \-o file
|
|
save result into `file` (only possible with a single INPUT-FILE)
|
|
.TP
|
|
.BR \-f ", " --force
|
|
overwrite output without prompting
|
|
.TP
|
|
.BR \-c ", " --stdout
|
|
force write to standard output, even if it is the console
|
|
.TP
|
|
.BR \--[no-]sparse
|
|
enable / disable sparse FS support, to make files with many zeroes smaller on disk.
|
|
Creating sparse files may save disk space and speed up the decompression
|
|
by reducing the amount of disk I/O.
|
|
Default: enabled when output is written to a file, disabled when output is stdout.
This setting overrides the default and can force sparse mode over stdout.
|
|
.TP
|
|
.BR \--rm
|
|
remove source file(s) after successful compression or decompression
|
|
.TP
|
|
.BR \-k ", " --keep
|
|
keep source file(s) after successful compression or decompression.
|
|
This is the default behavior.
|
|
.TP
|
|
.BR \-r
|
|
operate recursively on directories
|
|
.TP
|
|
.BR \-h/\-H ", " --help
|
|
display help/long help and exit
|
|
.TP
|
|
.BR \-V ", " --version
|
|
display version number and exit
|
|
.TP
|
|
.BR \-v ", " --verbose
|
|
verbose mode
|
|
.TP
|
|
.BR \-q ", " --quiet
|
|
suppress warnings, interactivity and notifications.
|
|
Specify twice to suppress errors too.
|
|
.TP
|
|
.BR \-C ", " --[no-]check
|
|
add integrity check computed from uncompressed data (default: enabled)
|
|
.TP
|
|
.BR \-t ", " --test
|
|
Test the integrity of compressed files. This option is equivalent to \fB--decompress --stdout > /dev/null\fR.
|
|
No files are created or removed.
|
|
.TP
|
|
.BR --
|
|
All arguments after -- are treated as files.
|
|
|
|
|
|
.SH DICTIONARY BUILDER
|
|
.PP
|
|
\fBzstd\fR offers \fIdictionary\fR compression, useful for very small files and messages.
|
|
It's possible to train \fBzstd\fR with some samples, the result of which is saved into a file called `dictionary`.
|
|
Then, during compression and decompression, reference the same dictionary.
This improves the compression ratio of small files.
Typical gains range from ~10% (at 64KB) to 5x better (at <1KB).
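.PP
The workflow above might be sketched as follows, using only options documented on this page.
The names
.IR samples/ ,
.IR mydict ,
.I msg.json
and
.I copy
are hypothetical placeholders:
.PP
.nf
zstd \-\-train samples/* \-o mydict          # build a dictionary from the samples
zstd \-D mydict msg.json                   # compress with it, writes msg.json.zst
zstd \-d \-D mydict msg.json.zst \-o copy    # decompress with the same dictionary
.fi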
|
|
.TP
|
|
.B \--train FILEs
|
|
use FILEs as training set to create a dictionary. The training set should contain a lot of small files (> 100),
|
|
and typically weigh 100x the target dictionary size (for example, 10 MB for a 100 KB dictionary).
|
|
.TP
|
|
.B \-o file
|
|
dictionary saved into `file` (default: dictionary)
|
|
.TP
|
|
.B \--maxdict #
|
|
limit dictionary to specified size (default: 112640)
|
|
.TP
|
|
.B \--dictID #
|
|
A dictionary ID is a locally unique ID that a decoder can use to verify it is using the right dictionary.
|
|
By default, zstd will create a 4-byte random number ID.
It's possible to give a precise number instead.
Short numbers have an advantage: an ID < 256 will only need 1 byte in the compressed frame header,
and an ID < 65536 will only need 2 bytes. This compares favorably to the 4-byte default.
However, it's up to the dictionary manager to not assign the same ID to two different dictionaries.
|
|
.TP
|
|
.B \-s#
|
|
dictionary selectivity level (default: 9).
The smaller the value, the denser the dictionary, improving its efficiency but reducing its possible maximum size.
|
|
.TP
|
|
.B \--cover=k=#,d=#
|
|
Use alternate dictionary builder algorithm named cover with parameters \fIk\fR and \fId\fR with \fId\fR <= \fIk\fR.
|
|
Selects segments of size \fIk\fR with the highest score to put in the dictionary.
|
|
The score of a segment is computed as the sum of the frequencies of all the subsegments of size \fId\fR.
|
|
Generally \fId\fR should be in the range [6, 24].
|
|
Good values for \fIk\fR vary widely based on the input data, but a safe range is [32, 2048].
|
|
Example: \fB--train --cover=k=64,d=8 FILEs\fR.
|
|
.TP
|
|
.B \--optimize-cover[=steps=#,k=#,d=#]
|
|
If \fIsteps\fR is not specified, the default value of 32 is used.
|
|
If \fIk\fR is not specified, \fIsteps\fR values in [16, 2048] are checked for each value of \fId\fR.
|
|
If \fId\fR is not specified, the values checked are [6, 8, ..., 16].
|
|
|
|
Runs the cover dictionary builder for each parameter set and saves the optimal parameters and dictionary.
|
|
Prints the optimal parameters and writes the optimal dictionary to the output file.
|
|
Supports multithreading if \fBzstd\fR is compiled with threading support.
|
|
|
|
The parameter \fIk\fR is more sensitive than \fId\fR, and is faster to optimize over.
Suggested use is to run with \fIsteps\fR <= 32 with neither \fIk\fR nor \fId\fR set.
|
|
Once it completes, use the value of \fId\fR it selects with a higher \fIsteps\fR (in the range [256, 1024]).
|
|
.br
\fBzstd --train --optimize-cover FILEs\fR
.br
\fBzstd --train --optimize-cover=d=d,steps=512 FILEs\fR
|
|
.TP
|
|
|
|
.SH BENCHMARK
|
|
.TP
|
|
.B \-b#
|
|
benchmark file(s) using compression level #
|
|
.TP
|
|
.B \-e#
|
|
benchmark file(s) using multiple compression levels, from -b# to -e# (inclusive).
|
|
.TP
|
|
.B \-i#
|
|
minimum evaluation time, in seconds (default: 3s), benchmark mode only
|
|
.TP
|
|
.B \-B#
|
|
cut file into independent blocks of size # (default: no block)
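.PP
For instance, assuming a hypothetical input file named
.IR sample.dat ,
the options above might be combined as:
.PP
.nf
zstd \-b1 \-e5 \-i5 sample.dat   # benchmark levels 1 to 5, minimum evaluation time 5 seconds
.fi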
|
|
|
|
|
|
.SH ADVANCED COMPRESSION OPTIONS
|
|
.TP
|
|
.B \--zstd[=\fIoptions\fR]
|
|
.PD
|
|
\fBzstd\fR provides 22 predefined compression levels. The selected or default predefined compression level can be changed with advanced compression options.
|
|
The \fIoptions\fR are provided as a comma-separated list. You may specify only the \fIoptions\fR you want to change and the rest will be taken from the selected or default compression level.
|
|
The list of available \fIoptions\fR:
|
|
.RS
|
|
|
|
.TP
|
|
.BI strategy= strat
|
|
.PD 0
|
|
.TP
|
|
.BI strat= strat
|
|
.PD
|
|
Specify a strategy used by a match finder.
|
|
.IP ""
|
|
There are 8 strategies numbered from 0 to 7, from faster to stronger:
|
|
0=ZSTD_fast, 1=ZSTD_dfast, 2=ZSTD_greedy, 3=ZSTD_lazy, 4=ZSTD_lazy2, 5=ZSTD_btlazy2, 6=ZSTD_btopt, 7=ZSTD_btopt2.
|
|
.IP ""
|
|
|
|
.TP
|
|
.BI windowLog= wlog
|
|
.PD 0
|
|
.TP
|
|
.BI wlog= wlog
|
|
.PD
|
|
Specify the maximum number of bits for a match distance.
|
|
.IP ""
|
|
A higher number of bits increases the chance of finding a match, which usually improves the compression ratio.
|
|
It also increases memory requirements for compressor and decompressor.
|
|
.IP ""
|
|
The minimum \fIwlog\fR is 10 (1 KiB) and the maximum is 25 (32 MiB) for 32-bit compilation and 27 (128 MiB) for 64-bit compilation.
|
|
.IP ""
|
|
|
|
.TP
|
|
.BI hashLog= hlog
|
|
.PD 0
|
|
.TP
|
|
.BI hlog= hlog
|
|
.PD
|
|
Specify the maximum number of bits for a hash table.
|
|
.IP ""
|
|
A bigger hash table causes fewer collisions, which usually makes compression faster, but requires more memory during compression.
|
|
.IP ""
|
|
The minimum \fIhlog\fR is 6 (64 B) and the maximum is 25 (32 MiB) for 32-bit compilation and 27 (128 MiB) for 64-bit compilation.
|
|
|
|
.TP
|
|
.BI chainLog= clog
|
|
.PD 0
|
|
.TP
|
|
.BI clog= clog
|
|
.PD
|
|
Specify the maximum number of bits for a hash chain or a binary tree.
|
|
.IP ""
|
|
A higher number of bits increases the chance of finding a match, which usually improves the compression ratio.
|
|
It also slows down compression speed and increases memory requirements for compression.
|
|
This option is ignored for the ZSTD_fast strategy.
|
|
.IP ""
|
|
The minimum \fIclog\fR is 6 (64 B) and the maximum is 26 (64 MiB) for 32-bit compilation and 28 (256 MiB) for 64-bit compilation.
|
|
.IP ""
|
|
|
|
.TP
|
|
.BI searchLog= slog
|
|
.PD 0
|
|
.TP
|
|
.BI slog= slog
|
|
.PD
|
|
Specify the maximum number of searches in a hash chain or a binary tree using logarithmic scale.
|
|
.IP ""
|
|
A larger number of searches increases the chance of finding a match, which usually improves the compression ratio but decreases compression speed.
|
|
.IP ""
|
|
The minimum \fIslog\fR is 1 and the maximum is 24 for 32-bit compilation and 26 for 64-bit compilation.
|
|
.IP ""
|
|
|
|
.TP
|
|
.BI searchLength= slen
|
|
.PD 0
|
|
.TP
|
|
.BI slen= slen
|
|
.PD
|
|
Specify the minimum searched length of a match in a hash table.
|
|
.IP ""
|
|
A larger search length usually decreases the compression ratio but improves decompression speed.
|
|
.IP ""
|
|
The minimum \fIslen\fR is 3 and the maximum is 7.
|
|
.IP ""
|
|
|
|
.TP
|
|
.BI targetLength= tlen
|
|
.PD 0
|
|
.TP
|
|
.BI tlen= tlen
|
|
.PD
|
|
Specify the minimum match length that causes a match finder to stop searching for better matches.
|
|
.IP ""
|
|
A larger minimum match length usually improves the compression ratio but decreases compression speed.
|
|
This option is used only with ZSTD_btopt and ZSTD_btopt2 strategies.
|
|
.IP ""
|
|
The minimum \fItlen\fR is 4 and the maximum is 999.
|
|
.IP ""
|
|
|
|
.PP
|
|
.B An example
|
|
.br
|
|
The following parameters set the advanced compression options to the values of predefined level 19 for files bigger than 256 KB:
|
|
.IP ""
|
|
\fB--zstd=\fRwindowLog=23,chainLog=23,hashLog=22,searchLog=6,searchLength=3,targetLength=48,strategy=6
|
|
|
|
.SH BUGS
|
|
Report bugs at: https://github.com/facebook/zstd/issues
|
|
|
|
.SH AUTHOR
|
|
Yann Collet
|