[+] Initial commit

[+] Win32 patches
[+] liblzma build json
[+] .gitignore
[-] Non-public domain code
This commit is contained in:
Reece Wilson 2023-01-25 21:03:07 +00:00
commit 45d7729b48
424 changed files with 112774 additions and 0 deletions

1
ABOUT-NLS Normal file
View File

@ -0,0 +1 @@
<https://www.gnu.org/software/gettext/manual/html_node/Users.html>

39
AUTHORS Normal file
View File

@ -0,0 +1,39 @@
Authors of XZ Utils
===================
XZ Utils is developed and maintained by Lasse Collin
<lasse.collin@tukaani.org> and Jia Tan <jiat0218@gmail.com>.
Major parts of liblzma are based on code written by Igor Pavlov,
specifically the LZMA SDK <https://7-zip.org/sdk.html>. Without
this code, XZ Utils wouldn't exist.
The SHA-256 implementation in liblzma is based on the code found from
7-Zip <https://7-zip.org/>, which has a modified version of the SHA-256
code found from Crypto++ <https://www.cryptopp.com/>. The SHA-256 code
in Crypto++ was written by Kevin Springle and Wei Dai.
Some scripts have been adapted from gzip. The original versions
were written by Jean-loup Gailly, Charles Levert, and Paul Eggert.
Andrew Dudman helped adapting the scripts and their man pages for
XZ Utils.
The initial version of the threaded .xz decompressor was written
by Sebastian Andrzej Siewior.
The initial version of the .lz (lzip) decoder was written
by Michał Górny.
CLMUL-accelerated CRC code was contributed by Ilya Kurdyukov.
Other authors:
- Jonathan Nieder
- Joachim Henke
The GNU Autotools-based build system contains files from many authors,
which I'm not trying to list here.
Several people have contributed fixes or reported bugs. Most of them
are mentioned in the file THANKS.

49
COPYING Normal file
View File

@ -0,0 +1,49 @@
XZ Utils Licensing
==================
Different licenses apply to different files in this package. Here
is a rough summary of which licenses apply to which parts of this
package (but check the individual files to be sure!):
- liblzma is in the public domain.
- All the documentation in the doc directory and most of the
XZ Utils specific documentation files in other directories
are in the public domain.
- Translated messages are in the public domain.
- Test files and test code in the tests directory, and debugging
utilities in the debug directory are in the public domain.
- The extra directory may contain public domain files, and files
that are under various free software licenses.
You can do whatever you want with the files that have been put into
the public domain. If you find public domain legally problematic,
take the previous sentence as a license grant. If you still find
the lack of copyright legally problematic, you have too many
lawyers.
As usual, this software is provided "as is", without any warranty.
If you copy significant amounts of public domain code from XZ Utils
into your project, acknowledging this somewhere in your software is
polite (especially if it is proprietary, non-free software), but
naturally it is not legally required. Here is an example of a good
notice to put into "about box" or into documentation:
This software includes code from XZ Utils <https://tukaani.org/xz/>.
Note that the toolchain (compiler, linker etc.) may add some code
pieces that are copyrighted. Thus, it is possible that e.g. liblzma
binary wouldn't actually be in the public domain in its entirety
even though it contains no copyrighted code from the XZ Utils source
package.
If you have questions, don't hesitate to ask the author(s) for more
information.

8654
ChangeLog Normal file

File diff suppressed because it is too large Load Diff

1623
NEWS Normal file

File diff suppressed because it is too large Load Diff

233
PACKAGERS Normal file
View File

@ -0,0 +1,233 @@
Information to packagers of XZ Utils
====================================
0. Preface
1. Package naming
2. Package description
3. License
4. configure options
5. Additional documentation
6. Extra files
7. Installing XZ Utils and LZMA Utils in parallel
8. Example
0. Preface
----------
This document is meant for people who create and maintain XZ Utils
packages for operating system distributions. The focus is on GNU/Linux
systems, but most things apply to other systems too.
While the standard "configure && make DESTDIR=$PKG install" should
give a pretty good package, there are some details which packagers
may want to tweak.
Packagers should also read the INSTALL file.
1. Package naming
-----------------
The preferred name for the XZ Utils package is "xz", because that's
the name of the upstream tarball. Naturally you may have good reasons
to use some other name; I won't get angry about it. ;-) It's just nice
to be able to point people to the correct package name without asking
what distro they have.
If your distro policy is to split things into small pieces, here is
one suggestion:
xz xz, xzdec, scripts (xzdiff, xzgrep, etc.), docs
xz-lzma lzma, unlzma, lzcat, lzgrep etc. symlinks and
lzmadec binary for compatibility with LZMA Utils
liblzma liblzma.so.*
liblzma-devel liblzma.so, liblzma.a, API headers
2. Package description
----------------------
Here is a suggestion which you may use as the package description.
If you can use only one-line description, pick only the first line.
Naturally, feel free to use some other description if you find it
better, and maybe send it to me too.
Library and command line tools for XZ and LZMA compressed files
XZ Utils provide a general purpose data compression library
and command line tools. The native file format is the .xz
format, but also the legacy .lzma format is supported. The .xz
format supports multiple compression algorithms, of which LZMA2
is currently the primary algorithm. With typical files, XZ Utils
create about 30 % smaller files than gzip.
If you are splitting XZ Utils into multiple packages, here are some
suggestions for package descriptions:
xz:
Command line tools for XZ and LZMA compressed files
This package includes the xz compression tool and other command
line tools from XZ Utils. xz has command line syntax similar to
that of gzip. The native file format is the .xz format, but also
the legacy .lzma format is supported. The .xz format supports
multiple compression algorithms, of which LZMA2 is currently the
primary algorithm. With typical files, XZ Utils create about 30 %
smaller files than gzip.
Note that this package doesn't include the files needed for
LZMA Utils 4.32.x compatibility. Install also the xz-lzma
package to make XZ Utils emulate LZMA Utils 4.32.x.
xz-lzma:
LZMA Utils emulation with XZ Utils
This package includes executables and symlinks to make
XZ Utils emulate lzma, unlzma, lzcat, and other command
line tools found from the legacy LZMA Utils 4.32.x package.
liblzma:
Library for XZ and LZMA compressed files
liblzma is a general purpose data compression library with
an API similar to that of zlib. liblzma supports multiple
algorithms, of which LZMA2 is currently the primary algorithm.
The native file format is .xz, but also the legacy .lzma
format and raw streams (no headers at all) are supported.
This package includes the shared library.
liblzma-devel:
Library for XZ and LZMA compressed files
This package includes the API headers, static library, and
other development files related to liblzma.
3. License
----------
If the package manager supports a license field, you probably should
put GPLv2+ there (GNU GPL v2 or later). The interesting parts of
XZ Utils are in the public domain, but some less important files
ending up into the binary package are under GPLv2+. So it is simplest
to just say GPLv2+ if you cannot specify "public domain and GPLv2+".
If you split XZ Utils into multiple packages as described earlier
in this file, liblzma and liblzma-dev packages will contain only
public domain code (from XZ Utils at least; compiler or linker may
add some third-party code, which may be copyrighted).
4. configure options
--------------------
Unless you are building a package for a distribution that is meant
only for embedded systems, don't use the following configure options:
--enable-debug
--enable-encoders (*)
--enable-decoders
--enable-match-finders
--enable-checks
--enable-small (*)
--disable-threads (*)
--disable-microlzma (*)
--disable-lzip-decoder (*)
(*) These are OK when building xzdec and lzmadec as described
in INSTALL.
xzdec and lzmadec don't provide any functionality that isn't already
available in the xz tool. Shipping xzdec and lzmadec without size
optimization and statically-linked liblzma isn't very useful. Doing
that would give users the xzdec man page, which may make it easier
for people to find out that such tools exists, but the executables
wouldn't have any advantage over the full-featured xz.
5. Additional documentation
---------------------------
"make install" copies some additional documentation to $docdir
(--docdir in configure). There is a copy of the GNU GPL v2, which
can be replaced with a symlink if your distro ships with shared
copies of the common license texts.
liblzma API is currently only documented using Doxygen tags in the
API headers. It hasn't been tested much how good results Doxygen
is able to make from the tags (e.g. Doxyfile might need tweaking,
the tagging may need to be improved etc.), so it might be simpler
to just let people read docs directly from the .h files for now,
and also save quite a bit in package size at the same time.
6. Extra files
--------------
The "extra" directory contains some small extra tools or other files.
The exact set of extra files can vary between XZ Utils releases. The
extra files have only limited use or they are too dangerous to be
put directly to $bindir (7z2lzma.sh is a good example, since it can
silently create corrupt output if certain conditions are not met).
If you feel like it, you may copy the extra directory under the doc
directory (e.g. /usr/share/doc/xz/extra). Maybe some people will find
them useful. However, most people needing these tools probably are
able to find them from the source package too.
The "debug" directory contains some tools that are useful only when
hacking on XZ Utils. Don't package these tools.
7. Installing XZ Utils and LZMA Utils in parallel
-------------------------------------------------
XZ Utils and LZMA Utils 4.32.x can be installed in parallel by
omitting the compatibility symlinks (lzma, unlzma, lzcat, lzgrep etc.)
from the XZ Utils package. It's probably a good idea to still package
the symlinks into a separate package so that users may choose if they
want to use XZ Utils or LZMA Utils for handling .lzma files.
8. Example
----------
Here is an example for i686 GNU/Linux that
- links xz and lzmainfo against shared liblzma;
- links size-optimized xzdec and lzmadec against static liblzma
while avoiding libpthread dependency;
- includes only shared liblzma in the final package; and
- copies also the "extra" directory to the package.
PKG=/tmp/xz-pkg
tar xf xz-x.y.z.tar.gz
cd xz-x.y.z
./configure \
--prefix=/usr \
--disable-static \
--disable-xzdec \
--disable-lzmadec \
CFLAGS='-march=i686 -mtune=generic -O2'
make
make DESTDIR=$PKG install-strip
make clean
./configure \
--prefix=/usr \
--disable-shared \
--disable-nls \
--disable-encoders \
--enable-small \
--disable-threads \
CFLAGS='-march=i686 -mtune=generic -Os'
make -C src/liblzma
make -C src/xzdec
make -C src/xzdec DESTDIR=$PKG install-strip
cp -a extra $PKG/usr/share/doc/xz

303
README Normal file
View File

@ -0,0 +1,303 @@
XZ Utils
========
0. Overview
1. Documentation
1.1. Overall documentation
1.2. Documentation for command-line tools
1.3. Documentation for liblzma
2. Version numbering
3. Reporting bugs
4. Translations
5. Other implementations of the .xz format
6. Contact information
0. Overview
-----------
XZ Utils provide a general-purpose data-compression library plus
command-line tools. The native file format is the .xz format, but
also the legacy .lzma format is supported. The .xz format supports
multiple compression algorithms, which are called "filters" in the
context of XZ Utils. The primary filter is currently LZMA2. With
typical files, XZ Utils create about 30 % smaller files than gzip.
To ease adapting support for the .xz format into existing applications
and scripts, the API of liblzma is somewhat similar to the API of the
popular zlib library. For the same reason, the command-line tool xz
has a command-line syntax similar to that of gzip.
When aiming for the highest compression ratio, the LZMA2 encoder uses
a lot of CPU time and may use, depending on the settings, even
hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder
competes with bzip2 in compression speed, RAM usage, and compression
ratio.
LZMA2 is reasonably fast to decompress. It is a little slower than
gzip, but a lot faster than bzip2. Being fast to decompress means
that the .xz format is especially nice when the same file will be
decompressed very many times (usually on different computers), which
is the case e.g. when distributing software packages. In such
situations, it's not too bad if the compression takes some time,
since that needs to be done only once to benefit many people.
With some file types, combining (or "chaining") LZMA2 with an
additional filter can improve the compression ratio. A filter chain may
contain up to four filters, although usually only one or two are used.
For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
in the filter chain can improve compression ratio of executable files.
Since the .xz format allows adding new filter IDs, it is possible that
some day there will be a filter that is, for example, much faster to
compress than LZMA2 (but probably with worse compression ratio).
Similarly, it is possible that some day there is a filter that will
compress better than LZMA2.
XZ Utils supports multithreaded compression. XZ Utils doesn't support
multithreaded decompression yet. It has been planned though and taken
into account when designing the .xz file format. In the future, files
that were created in threaded mode can be decompressed in threaded
mode too.
1. Documentation
----------------
1.1. Overall documentation
README This file
INSTALL.generic Generic install instructions for those not familiar
with packages using GNU Autotools
INSTALL Installation instructions specific to XZ Utils
PACKAGERS Information to packagers of XZ Utils
COPYING XZ Utils copyright and license information
COPYING.GPLv2 GNU General Public License version 2
COPYING.GPLv3 GNU General Public License version 3
COPYING.LGPLv2.1 GNU Lesser General Public License version 2.1
AUTHORS The main authors of XZ Utils
THANKS Incomplete list of people who have helped making
this software
NEWS User-visible changes between XZ Utils releases
ChangeLog Detailed list of changes (commit log)
TODO Known bugs and some sort of to-do list
Note that only some of the above files are included in binary
packages.
1.2. Documentation for command-line tools
The command-line tools are documented as man pages. In source code
releases (and possibly also in some binary packages), the man pages
are also provided in plain text (ASCII only) and PDF formats in the
directory "doc/man" to make the man pages more accessible to those
whose operating system doesn't provide an easy way to view man pages.
1.3. Documentation for liblzma
The liblzma API headers include short docs about each function
and data type as Doxygen tags. These docs should be quite OK as
a quick reference.
There are a few example/tutorial programs that should help in
getting started with liblzma. In the source package the examples
are in "doc/examples" and in binary packages they may be under
"examples" in the same directory as this README.
Since the liblzma API has similarities to the zlib API, some people
may find it useful to read the zlib docs and tutorial too:
http://zlib.net/manual.html
http://zlib.net/zlib_how.html
2. Version numbering
--------------------
The version number format of XZ Utils is X.Y.ZS:
- X is the major version. When this is incremented, the library
API and ABI break.
- Y is the minor version. It is incremented when new features
are added without breaking the existing API or ABI. An even Y
indicates a stable release and an odd Y indicates unstable
(alpha or beta version).
- Z is the revision. This has a different meaning for stable and
unstable releases:
* Stable: Z is incremented when bugs get fixed without adding
any new features. This is intended to be convenient for
downstream distributors that want bug fixes but don't want
any new features to minimize the risk of introducing new bugs.
* Unstable: Z is just a counter. API or ABI of features added
in earlier unstable releases having the same X.Y may break.
- S indicates stability of the release. It is missing from the
stable releases, where Y is an even number. When Y is odd, S
is either "alpha" or "beta" to make it very clear that such
versions are not stable releases. The same X.Y.Z combination is
not used for more than one stability level, i.e. after X.Y.Zalpha,
the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.
3. Reporting bugs
-----------------
Naturally it is easiest for me if you already know what causes the
unexpected behavior. Even better if you have a patch to propose.
However, quite often the reason for unexpected behavior is unknown,
so here are a few things to do before sending a bug report:
1. Try to create a small example how to reproduce the issue.
2. Compile XZ Utils with debugging code using configure switches
--enable-debug and, if possible, --disable-shared. If you are
using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
binaries.
3. Turn on core dumps. The exact command depends on your shell;
for example in GNU bash it is done with "ulimit -c unlimited",
and in tcsh with "limit coredumpsize unlimited".
4. Try to reproduce the suspected bug. If you get "assertion failed"
message, be sure to include the complete message in your bug
report. If the application leaves a coredump, get a backtrace
using gdb:
$ gdb /path/to/app-binary # Load the app to the debugger.
(gdb) core core # Open the coredump.
(gdb) bt # Print the backtrace. Copy & paste to bug report.
(gdb) quit # Quit gdb.
Report your bug via email or IRC (see Contact information below).
Don't send core dump files or any executables. If you have a small
example file(s) (total size less than 256 KiB), please include
it/them as an attachment. If you have bigger test files, put them
online somewhere and include a URL to the file(s) in the bug report.
Always include the exact version number of XZ Utils in the bug report.
If you are using a snapshot from the git repository, use "git describe"
to get the exact snapshot version. If you are using XZ Utils shipped
in an operating system distribution, mention the distribution name,
distribution version, and exact xz package version; if you cannot
repeat the bug with the code compiled from unpatched source code,
you probably need to report a bug to your distribution's bug tracking
system.
4. Translations
---------------
The xz command line tool and all man pages can be translated.
The translations are handled via the Translation Project. If you
wish to help translating xz, please join the Translation Project:
https://translationproject.org/html/translators.html
Below are notes and testing instructions specific to xz
translations.
Testing can be done by installing xz into a temporary directory:
./configure --disable-shared --prefix=/tmp/xz-test
# <Edit the .po file in the po directory.>
make -C po update-po
make install
bash debug/translation.bash | less
bash debug/translation.bash | less -S # For --list outputs
Repeat the above as needed (no need to re-run configure though).
Note especially the following:
- The output of --help and --long-help must look nice on
an 80-column terminal. It's OK to add extra lines if needed.
- In contrast, don't add extra lines to error messages and such.
They are often preceded with e.g. a filename on the same line,
so you have no way to predict where to put a \n. Let the terminal
do the wrapping even if it looks ugly. Adding new lines will be
even uglier in the generic case even if it looks nice in a few
limited examples.
- Be careful with column alignment in tables and table-like output
(--list, --list --verbose --verbose, --info-memory, --help, and
--long-help):
* All descriptions of options in --help should start in the
same column (but it doesn't need to be the same column as
in the English messages; just be consistent if you change it).
Check that both --help and --long-help look OK, since they
share several strings.
* --list --verbose and --info-memory print lines that have
the format "Description: %s". If you need a longer
description, you can put extra space between the colon
and %s. Then you may need to add extra space to other
strings too so that the result as a whole looks good (all
values start at the same column).
* The columns of the actual tables in --list --verbose --verbose
should be aligned properly. Abbreviate if necessary. It might
be good to keep at least 2 or 3 spaces between column headings
and avoid spaces in the headings so that the columns stand out
better, but this is a matter of opinion. Do what you think
looks best.
- Be careful to put a period at the end of a sentence when the
original version has it, and don't put it when the original
doesn't have it. Similarly, be careful with \n characters
at the beginning and end of the strings.
- Read the TRANSLATORS comments that have been extracted from the
source code and included in xz.pot. Some comments suggest
testing with a specific command which needs an .xz file. You
may use e.g. any tests/files/good-*.xz. However, these test
commands are included in translations.bash output, so reading
translations.bash output carefully can be enough.
- If you find language problems in the original English strings,
feel free to suggest improvements. Ask if something is unclear.
- The translated messages should be understandable (sometimes this
may be a problem with the original English messages too). Don't
make a direct word-by-word translation from English especially if
the result doesn't sound good in your language.
Thanks for your help!
5. Other implementations of the .xz format
------------------------------------------
7-Zip and the p7zip port of 7-Zip support the .xz format starting
from the version 9.00alpha.
http://7-zip.org/
http://p7zip.sourceforge.net/
XZ Embedded is a limited implementation written for use in the Linux
kernel, but it is also suitable for other embedded use.
https://tukaani.org/xz/embedded.html
6. Contact information
----------------------
If you have questions, bug reports, patches etc. related to XZ Utils,
the project maintainers Lasse Collin and Jia Tan can be reached via
<xz@tukaani.org>.
You might find Lasse also from #tukaani on Libera Chat (IRC).
The nick is Larhzu. The channel tends to be pretty quiet,
so just ask your question and someone might wake up.

163
THANKS Normal file
View File

@ -0,0 +1,163 @@
Thanks
======
Some people have helped more, some less, but nevertheless everyone's help
has been important. :-) In alphabetical order:
- Mark Adler
- H. Peter Anvin
- Jeff Bastian
- Nelson H. F. Beebe
- Karl Beldan
- Karl Berry
- Anders F. Björklund
- Emmanuel Blot
- Melanie Blower
- Alexander Bluhm
- Martin Blumenstingl
- Ben Boeckel
- Jakub Bogusz
- Adam Borowski
- Maarten Bosmans
- Trent W. Buck
- Kevin R. Bulgrien
- James Buren
- David Burklund
- Daniel Mealha Cabrita
- Milo Casagrande
- Marek Černocký
- Tomer Chachamu
- Vitaly Chikunov
- Antoine Cœur
- Gabi Davar
- İhsan Doğan
- Chris Donawa
- Andrew Dudman
- Markus Duft
- İsmail Dönmez
- Paul Eggert
- Robert Elz
- Gilles Espinasse
- Denis Excoffier
- Michael Felt
- Michael Fox
- Mike Frysinger
- Daniel Richard G.
- Tomasz Gajc
- Bjarni Ingi Gislason
- John Paul Adrian Glaubitz
- Bill Glessner
- Michał Górny
- Jason Gorski
- Juan Manuel Guerrero
- Diederik de Haas
- Joachim Henke
- Christian Hesse
- Vincenzo Innocente
- Peter Ivanov
- Nicholas Jackson
- Sam James
- Hajin Jang
- Jouk Jansen
- Jun I Jin
- Kiyoshi Kanazawa
- Per Øyvind Karlsen
- Iouri Kharon
- Thomas Klausner
- Richard Koch
- Ville Koskinen
- Jan Kratochvil
- Christian Kujau
- Stephan Kulow
- Ilya Kurdyukov
- Peter Lawler
- James M Leddy
- Vincent Lefevre
- Hin-Tak Leung
- Andraž 'ruskie' Levstik
- Cary Lewis
- Wim Lewis
- Xin Li
- Eric Lindblad
- Lorenzo De Liso
- H.J. Lu
- Bela Lubkin
- Gregory Margo
- Julien Marrec
- Ed Maste
- Martin Matuška
- Ivan A. Melnikov
- Jim Meyering
- Arkadiusz Miskiewicz
- Nathan Moinvaziri
- Étienne Mollier
- Conley Moorhous
- Rafał Mużyło
- Adrien Nader
- Evan Nemerson
- Hongbo Ni
- Jonathan Nieder
- Andre Noll
- Peter O'Gorman
- Daniel Packard
- Filip Palian
- Peter Pallinger
- Rui Paulo
- Igor Pavlov
- Diego Elio Pettenò
- Elbert Pol
- Mikko Pouru
- Rich Prohaska
- Trần Ngọc Quân
- Pavel Raiskup
- Ole André Vadla Ravnås
- Eric S. Raymond
- Robert Readman
- Bernhard Reutner-Fischer
- Markus Rickert
- Cristian Rodríguez
- Christian von Roques
- Boud Roukema
- Torsten Rupp
- Stephen Sachs
- Jukka Salmi
- Alexandre Sauvé
- Benno Schulenberg
- Andreas Schwab
- Bhargava Shastry
- Dan Shechter
- Stuart Shelton
- Sebastian Andrzej Siewior
- Ville Skyttä
- Brad Smith
- Bruce Stark
- Pippijn van Steenhoven
- Jonathan Stott
- Dan Stromberg
- Jia Tan
- Vincent Torri
- Paul Townsend
- Mohammed Adnène Trojette
- Alexey Tourbin
- Loganaden Velvindron
- Patrick J. Volkerding
- Martin Väth
- Adam Walling
- Jeffrey Walton
- Christian Weisgerber
- Dan Weiss
- Bert Wesarg
- Fredrik Wikstrom
- Jim Wilcoxson
- Ralf Wildenhues
- Charles Wilson
- Lars Wirzenius
- Pilorz Wojciech
- Ryan Young
- Andreas Zieringer
Also thanks to all the people who have participated in the Tukaani project.
I have probably forgot to add some names to the above list. Sorry about
that and thanks for your help.

109
TODO Normal file
View File

@ -0,0 +1,109 @@
XZ Utils To-Do List
===================
Known bugs
----------
The test suite is too incomplete.
If the memory usage limit is less than about 13 MiB, xz is unable to
automatically scale down the compression settings enough even though
it would be possible by switching from BT2/BT3/BT4 match finder to
HC3/HC4.
XZ Utils compress some files significantly worse than LZMA Utils.
This is due to faster compression presets used by XZ Utils, and
can often be worked around by using "xz --extreme". With some files
--extreme isn't enough though: it's most likely with files that
compress extremely well, so going from compression ratio of 0.003
to 0.004 means big relative increase in the compressed file size.
xz doesn't quote unprintable characters when it displays file names
given on the command line.
tuklib_exit() doesn't block signals => EINTR is possible.
SIGTSTP is not handled. If xz is stopped, the estimated remaining
time and calculated (de)compression speed won't make sense in the
progress indicator (xz --verbose).
If liblzma has created threads and fork() gets called, liblzma
code will break in the child process unless it calls exec() and
doesn't touch liblzma.
Missing features
----------------
Add support for storing metadata in .xz files. A preliminary
idea is to create a new Stream type for metadata. When both
metadata and data are wanted in the same .xz file, two or more
Streams would be concatenated.
The state stored in lzma_stream should be cloneable, which would
be mostly useful when using a preset dictionary in LZMA2, but
it may have other uses too. Compare to deflateCopy() in zlib.
Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and
other streams that don't have an end of payload marker.
Adjust dictionary size when the input file size is known.
Maybe do this only if an option is given.
xz doesn't support copying extended attributes, access control
lists etc. from source to target file.
Multithreaded compression:
- Reduce memory usage of the current method.
- Implement threaded match finders.
- Implement pigz-style threading in LZMA2.
Buffer-to-buffer coding could use less RAM (especially when
decompressing LZMA1 or LZMA2).
I/O library is not implemented (similar to gzopen() in zlib).
It will be a separate library that supports uncompressed, .gz,
.bz2, .lzma, and .xz files.
Support changing lzma_options_lzma.mode with lzma_filters_update().
Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at
Block and Stream boundaries.
lzma_strerror() to convert lzma_ret to human readable form?
This is tricky, because the same error codes are used with
slightly different meanings, and this cannot be fixed anymore.
Make it possible to adjust LZMA2 options in the middle of a Block
so that the encoding speed vs. compression ratio can be optimized
when the compressed data is streamed over network.
Improved BCJ filters. The current filters are small but they aren't
so great when compressing binary packages that contain various file
types. Specifically, they make things worse if there are static
libraries or Linux kernel modules. The filtering could also be
more effective (without getting overly complex), for example,
streamable variant BCJ2 from 7-Zip could be implemented.
Filter that autodetects specific data types in the input stream
and applies appropriate filters for the corrects parts of the input.
Perhaps combine this with the BCJ filter improvement point above.
Long-range LZ77 method as a separate filter or as a new LZMA2
match finder.
Documentation
-------------
More tutorial programs are needed for liblzma.
Document the LZMA1 and LZMA2 algorithms.
Miscellaneous
------------
Try to get the media type for .xz registered at IANA.

601
config.h.in Normal file
View File

@ -0,0 +1,601 @@
/* config.h.in. Generated from configure.ac by autoheader. */
/* Define if building universal (internal helper macro) */
#undef AC_APPLE_UNIVERSAL_BUILD
/* How many MiB of RAM to assume if the real amount cannot be determined. */
#undef ASSUME_RAM
/* Define to 1 if translation of program messages to the user's native
language is requested. */
#undef ENABLE_NLS
/* Define to 1 if bswap_16 is available. */
#undef HAVE_BSWAP_16
/* Define to 1 if bswap_32 is available. */
#undef HAVE_BSWAP_32
/* Define to 1 if bswap_64 is available. */
#undef HAVE_BSWAP_64
/* Define to 1 if you have the <byteswap.h> header file. */
#undef HAVE_BYTESWAP_H
/* Define to 1 if Capsicum is available. */
#undef HAVE_CAPSICUM
/* Define to 1 if the system has the type `CC_SHA256_CTX'. */
#undef HAVE_CC_SHA256_CTX
/* Define to 1 if you have the `CC_SHA256_Init' function. */
#undef HAVE_CC_SHA256_INIT
/* Define to 1 if you have the Mac OS X function
CFLocaleCopyPreferredLanguages in the CoreFoundation framework. */
#undef HAVE_CFLOCALECOPYPREFERREDLANGUAGES
/* Define to 1 if you have the Mac OS X function CFPreferencesCopyAppValue in
the CoreFoundation framework. */
#undef HAVE_CFPREFERENCESCOPYAPPVALUE
/* Define to 1 if crc32 integrity check is enabled. */
#undef HAVE_CHECK_CRC32
/* Define to 1 if crc64 integrity check is enabled. */
#undef HAVE_CHECK_CRC64
/* Define to 1 if sha256 integrity check is enabled. */
#undef HAVE_CHECK_SHA256
/* Define to 1 if you have the `clock_gettime' function. */
#undef HAVE_CLOCK_GETTIME
/* Define to 1 if `CLOCK_MONOTONIC' is declared in <time.h>. */
#undef HAVE_CLOCK_MONOTONIC
/* Define to 1 if you have the <CommonCrypto/CommonDigest.h> header file. */
#undef HAVE_COMMONCRYPTO_COMMONDIGEST_H
/* Define to 1 if you have the <cpuid.h> header file. */
#undef HAVE_CPUID_H
/* Define if the GNU dcgettext() function is already present or preinstalled.
*/
#undef HAVE_DCGETTEXT
/* Define to 1 if any of HAVE_DECODER_foo have been defined. */
#undef HAVE_DECODERS
/* Define to 1 if arm decoder is enabled. */
#undef HAVE_DECODER_ARM
/* Define to 1 if arm64 decoder is enabled. */
#undef HAVE_DECODER_ARM64
/* Define to 1 if armthumb decoder is enabled. */
#undef HAVE_DECODER_ARMTHUMB
/* Define to 1 if delta decoder is enabled. */
#undef HAVE_DECODER_DELTA
/* Define to 1 if ia64 decoder is enabled. */
#undef HAVE_DECODER_IA64
/* Define to 1 if lzma1 decoder is enabled. */
#undef HAVE_DECODER_LZMA1
/* Define to 1 if lzma2 decoder is enabled. */
#undef HAVE_DECODER_LZMA2
/* Define to 1 if powerpc decoder is enabled. */
#undef HAVE_DECODER_POWERPC
/* Define to 1 if sparc decoder is enabled. */
#undef HAVE_DECODER_SPARC
/* Define to 1 if x86 decoder is enabled. */
#undef HAVE_DECODER_X86
/* Define to 1 if you have the <dlfcn.h> header file. */
#undef HAVE_DLFCN_H
/* Define to 1 if any of HAVE_ENCODER_foo have been defined. */
#undef HAVE_ENCODERS
/* Define to 1 if arm encoder is enabled. */
#undef HAVE_ENCODER_ARM
/* Define to 1 if arm64 encoder is enabled. */
#undef HAVE_ENCODER_ARM64
/* Define to 1 if armthumb encoder is enabled. */
#undef HAVE_ENCODER_ARMTHUMB
/* Define to 1 if delta encoder is enabled. */
#undef HAVE_ENCODER_DELTA
/* Define to 1 if ia64 encoder is enabled. */
#undef HAVE_ENCODER_IA64
/* Define to 1 if lzma1 encoder is enabled. */
#undef HAVE_ENCODER_LZMA1
/* Define to 1 if lzma2 encoder is enabled. */
#undef HAVE_ENCODER_LZMA2
/* Define to 1 if powerpc encoder is enabled. */
#undef HAVE_ENCODER_POWERPC
/* Define to 1 if sparc encoder is enabled. */
#undef HAVE_ENCODER_SPARC
/* Define to 1 if x86 encoder is enabled. */
#undef HAVE_ENCODER_X86
/* Define to 1 if you have the <fcntl.h> header file. */
#undef HAVE_FCNTL_H
/* Define to 1 if __attribute__((__constructor__)) is supported for functions.
*/
#undef HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR
/* Define to 1 if you have the `futimens' function. */
#undef HAVE_FUTIMENS
/* Define to 1 if you have the `futimes' function. */
#undef HAVE_FUTIMES
/* Define to 1 if you have the `futimesat' function. */
#undef HAVE_FUTIMESAT
/* Define to 1 if you have the <getopt.h> header file. */
#undef HAVE_GETOPT_H
/* Define to 1 if you have the `getopt_long' function. */
#undef HAVE_GETOPT_LONG
/* Define if the GNU gettext() function is already present or preinstalled. */
#undef HAVE_GETTEXT
/* Define if you have the iconv() function and it works. */
#undef HAVE_ICONV
/* Define to 1 if you have the <immintrin.h> header file. */
#undef HAVE_IMMINTRIN_H
/* Define to 1 if you have the <inttypes.h> header file. */
#undef HAVE_INTTYPES_H
/* Define to 1 if you have the <limits.h> header file. */
#undef HAVE_LIMITS_H
/* Define to 1 if .lz (lzip) decompression support is enabled. */
#undef HAVE_LZIP_DECODER
/* Define to 1 if mbrtowc and mbstate_t are properly declared. */
#undef HAVE_MBRTOWC
/* Define to 1 to enable bt2 match finder. */
#undef HAVE_MF_BT2
/* Define to 1 to enable bt3 match finder. */
#undef HAVE_MF_BT3
/* Define to 1 to enable bt4 match finder. */
#undef HAVE_MF_BT4
/* Define to 1 to enable hc3 match finder. */
#undef HAVE_MF_HC3
/* Define to 1 to enable hc4 match finder. */
#undef HAVE_MF_HC4
/* Define to 1 if you have the <minix/config.h> header file. */
#undef HAVE_MINIX_CONFIG_H
/* Define to 1 if getopt.h declares extern int optreset. */
#undef HAVE_OPTRESET
/* Define to 1 if you have the `pledge' function. */
#undef HAVE_PLEDGE
/* Define to 1 if you have the `posix_fadvise' function. */
#undef HAVE_POSIX_FADVISE
/* Define to 1 if `program_invocation_name' is declared in <errno.h>. */
#undef HAVE_PROGRAM_INVOCATION_NAME
/* Define to 1 if you have the `pthread_condattr_setclock' function. */
#undef HAVE_PTHREAD_CONDATTR_SETCLOCK
/* Have PTHREAD_PRIO_INHERIT. */
#undef HAVE_PTHREAD_PRIO_INHERIT
/* Define to 1 if you have the `SHA256Init' function. */
#undef HAVE_SHA256INIT
/* Define to 1 if the system has the type `SHA256_CTX'. */
#undef HAVE_SHA256_CTX
/* Define to 1 if you have the <sha256.h> header file. */
#undef HAVE_SHA256_H
/* Define to 1 if you have the `SHA256_Init' function. */
#undef HAVE_SHA256_INIT
/* Define to 1 if the system has the type `SHA2_CTX'. */
#undef HAVE_SHA2_CTX
/* Define to 1 if you have the <sha2.h> header file. */
#undef HAVE_SHA2_H
/* Define to 1 if optimizing for size. */
#undef HAVE_SMALL
/* Define to 1 if stdbool.h conforms to C99. */
#undef HAVE_STDBOOL_H
/* Define to 1 if you have the <stdint.h> header file. */
#undef HAVE_STDINT_H
/* Define to 1 if you have the <stdio.h> header file. */
#undef HAVE_STDIO_H
/* Define to 1 if you have the <stdlib.h> header file. */
#undef HAVE_STDLIB_H
/* Define to 1 if you have the <strings.h> header file. */
#undef HAVE_STRINGS_H
/* Define to 1 if you have the <string.h> header file. */
#undef HAVE_STRING_H
/* Define to 1 if `st_atimensec' is a member of `struct stat'. */
#undef HAVE_STRUCT_STAT_ST_ATIMENSEC
/* Define to 1 if `st_atimespec.tv_nsec' is a member of `struct stat'. */
#undef HAVE_STRUCT_STAT_ST_ATIMESPEC_TV_NSEC
/* Define to 1 if `st_atim.st__tim.tv_nsec' is a member of `struct stat'. */
#undef HAVE_STRUCT_STAT_ST_ATIM_ST__TIM_TV_NSEC
/* Define to 1 if `st_atim.tv_nsec' is a member of `struct stat'. */
#undef HAVE_STRUCT_STAT_ST_ATIM_TV_NSEC
/* Define to 1 if `st_uatime' is a member of `struct stat'. */
#undef HAVE_STRUCT_STAT_ST_UATIME
/* Define to 1 to if GNU/Linux-specific details are unconditionally wanted for
symbol versioning. Define to 2 to if these are wanted only if also PIC is
defined (allows building both shared and static liblzma at the same time
with Libtool if neither --with-pic nor --without-pic is used). This define
must be used together with liblzma_linux.map. */
#undef HAVE_SYMBOL_VERSIONS_LINUX
/* Define to 1 if you have the <sys/byteorder.h> header file. */
#undef HAVE_SYS_BYTEORDER_H
/* Define to 1 if you have the <sys/capsicum.h> header file. */
#undef HAVE_SYS_CAPSICUM_H
/* Define to 1 if you have the <sys/endian.h> header file. */
#undef HAVE_SYS_ENDIAN_H
/* Define to 1 if you have the <sys/param.h> header file. */
#undef HAVE_SYS_PARAM_H
/* Define to 1 if you have the <sys/stat.h> header file. */
#undef HAVE_SYS_STAT_H
/* Define to 1 if you have the <sys/time.h> header file. */
#undef HAVE_SYS_TIME_H
/* Define to 1 if you have the <sys/types.h> header file. */
#undef HAVE_SYS_TYPES_H
/* Define to 1 if the system has the type `uintptr_t'. */
#undef HAVE_UINTPTR_T
/* Define to 1 if you have the <unistd.h> header file. */
#undef HAVE_UNISTD_H
/* Define to 1 if _mm_set_epi64x and _mm_clmulepi64_si128 are usable. See
configure.ac for details. */
#undef HAVE_USABLE_CLMUL
/* Define to 1 if you have the `utime' function. */
#undef HAVE_UTIME
/* Define to 1 if you have the `utimes' function. */
#undef HAVE_UTIMES
/* Define to 1 or 0, depending whether the compiler supports simple visibility
declarations. */
#undef HAVE_VISIBILITY
/* Define to 1 if you have the <wchar.h> header file. */
#undef HAVE_WCHAR_H
/* Define to 1 if you have the `wcwidth' function. */
#undef HAVE_WCWIDTH
/* Define to 1 if the system has the type `_Bool'. */
#undef HAVE__BOOL
/* Define to 1 if you have the `_futime' function. */
#undef HAVE__FUTIME
/* Define to 1 if _mm_movemask_epi8 is available. */
#undef HAVE__MM_MOVEMASK_EPI8
/* Define to 1 if the GNU C extension __builtin_assume_aligned is supported.
*/
#undef HAVE___BUILTIN_ASSUME_ALIGNED
/* Define to 1 if the GNU C extensions __builtin_bswap16/32/64 are supported.
*/
#undef HAVE___BUILTIN_BSWAPXX
/* Define to the sub-directory where libtool stores uninstalled libraries. */
#undef LT_OBJDIR
/* Define to 1 when using POSIX threads (pthreads). */
#undef MYTHREAD_POSIX
/* Define to 1 when using Windows Vista compatible threads. This uses features
that are not available on Windows XP. */
#undef MYTHREAD_VISTA
/* Define to 1 when using Windows 95 (and thus XP) compatible threads. This
avoids use of features that were added in Windows Vista. */
#undef MYTHREAD_WIN95
/* Define to 1 to disable debugging code. */
#undef NDEBUG
/* Name of package */
#undef PACKAGE
/* Define to the address where bug reports for this package should be sent. */
#undef PACKAGE_BUGREPORT
/* Define to the full name of this package. */
#undef PACKAGE_NAME
/* Define to the full name and version of this package. */
#undef PACKAGE_STRING
/* Define to the one symbol short name of this package. */
#undef PACKAGE_TARNAME
/* Define to the home page for this package. */
#undef PACKAGE_URL
/* Define to the version of this package. */
#undef PACKAGE_VERSION
/* Define to necessary symbol if this constant uses a non-standard name on
your system. */
#undef PTHREAD_CREATE_JOINABLE
/* The size of `size_t', as computed by sizeof. */
#undef SIZEOF_SIZE_T
/* Define to 1 if all of the C90 standard headers exist (not just the ones
required in a freestanding environment). This macro is provided for
backward compatibility; new code need not use it. */
#undef STDC_HEADERS
/* Define to 1 if the number of available CPU cores can be detected with
cpuset(2). */
#undef TUKLIB_CPUCORES_CPUSET
/* Define to 1 if the number of available CPU cores can be detected with
pstat_getdynamic(). */
#undef TUKLIB_CPUCORES_PSTAT_GETDYNAMIC
/* Define to 1 if the number of available CPU cores can be detected with
sched_getaffinity() */
#undef TUKLIB_CPUCORES_SCHED_GETAFFINITY
/* Define to 1 if the number of available CPU cores can be detected with
sysconf(_SC_NPROCESSORS_ONLN) or sysconf(_SC_NPROC_ONLN). */
#undef TUKLIB_CPUCORES_SYSCONF
/* Define to 1 if the number of available CPU cores can be detected with
sysctl(). */
#undef TUKLIB_CPUCORES_SYSCTL
/* Define to 1 if the system supports fast unaligned access to 16-bit, 32-bit,
and 64-bit integers. */
#undef TUKLIB_FAST_UNALIGNED_ACCESS
/* Define to 1 if the amount of physical memory can be detected with
_system_configuration.physmem. */
#undef TUKLIB_PHYSMEM_AIX
/* Define to 1 if the amount of physical memory can be detected with
getinvent_r(). */
#undef TUKLIB_PHYSMEM_GETINVENT_R
/* Define to 1 if the amount of physical memory can be detected with
getsysinfo(). */
#undef TUKLIB_PHYSMEM_GETSYSINFO
/* Define to 1 if the amount of physical memory can be detected with
pstat_getstatic(). */
#undef TUKLIB_PHYSMEM_PSTAT_GETSTATIC
/* Define to 1 if the amount of physical memory can be detected with
sysconf(_SC_PAGESIZE) and sysconf(_SC_PHYS_PAGES). */
#undef TUKLIB_PHYSMEM_SYSCONF
/* Define to 1 if the amount of physical memory can be detected with sysctl().
*/
#undef TUKLIB_PHYSMEM_SYSCTL
/* Define to 1 if the amount of physical memory can be detected with Linux
sysinfo(). */
#undef TUKLIB_PHYSMEM_SYSINFO
/* Define to 1 to use unsafe type punning, e.g. char *x = ...; *(int *)x =
123; which violates strict aliasing rules and thus is undefined behavior
and might result in broken code. */
#undef TUKLIB_USE_UNSAFE_TYPE_PUNNING
/* Enable extensions on AIX 3, Interix. */
#ifndef _ALL_SOURCE
# undef _ALL_SOURCE
#endif
/* Enable general extensions on macOS. */
#ifndef _DARWIN_C_SOURCE
# undef _DARWIN_C_SOURCE
#endif
/* Enable general extensions on Solaris. */
#ifndef __EXTENSIONS__
# undef __EXTENSIONS__
#endif
/* Enable GNU extensions on systems that have them. */
#ifndef _GNU_SOURCE
# undef _GNU_SOURCE
#endif
/* Enable X/Open compliant socket functions that do not require linking
with -lxnet on HP-UX 11.11. */
#ifndef _HPUX_ALT_XOPEN_SOCKET_API
# undef _HPUX_ALT_XOPEN_SOCKET_API
#endif
/* Identify the host operating system as Minix.
This macro does not affect the system headers' behavior.
A future release of Autoconf may stop defining this macro. */
#ifndef _MINIX
# undef _MINIX
#endif
/* Enable general extensions on NetBSD.
Enable NetBSD compatibility extensions on Minix. */
#ifndef _NETBSD_SOURCE
# undef _NETBSD_SOURCE
#endif
/* Enable OpenBSD compatibility extensions on NetBSD.
Oddly enough, this does nothing on OpenBSD. */
#ifndef _OPENBSD_SOURCE
# undef _OPENBSD_SOURCE
#endif
/* Define to 1 if needed for POSIX-compatible behavior. */
#ifndef _POSIX_SOURCE
# undef _POSIX_SOURCE
#endif
/* Define to 2 if needed for POSIX-compatible behavior. */
#ifndef _POSIX_1_SOURCE
# undef _POSIX_1_SOURCE
#endif
/* Enable POSIX-compatible threading on Solaris. */
#ifndef _POSIX_PTHREAD_SEMANTICS
# undef _POSIX_PTHREAD_SEMANTICS
#endif
/* Enable extensions specified by ISO/IEC TS 18661-5:2014. */
#ifndef __STDC_WANT_IEC_60559_ATTRIBS_EXT__
# undef __STDC_WANT_IEC_60559_ATTRIBS_EXT__
#endif
/* Enable extensions specified by ISO/IEC TS 18661-1:2014. */
#ifndef __STDC_WANT_IEC_60559_BFP_EXT__
# undef __STDC_WANT_IEC_60559_BFP_EXT__
#endif
/* Enable extensions specified by ISO/IEC TS 18661-2:2015. */
#ifndef __STDC_WANT_IEC_60559_DFP_EXT__
# undef __STDC_WANT_IEC_60559_DFP_EXT__
#endif
/* Enable extensions specified by ISO/IEC TS 18661-4:2015. */
#ifndef __STDC_WANT_IEC_60559_FUNCS_EXT__
# undef __STDC_WANT_IEC_60559_FUNCS_EXT__
#endif
/* Enable extensions specified by ISO/IEC TS 18661-3:2015. */
#ifndef __STDC_WANT_IEC_60559_TYPES_EXT__
# undef __STDC_WANT_IEC_60559_TYPES_EXT__
#endif
/* Enable extensions specified by ISO/IEC TR 24731-2:2010. */
#ifndef __STDC_WANT_LIB_EXT2__
# undef __STDC_WANT_LIB_EXT2__
#endif
/* Enable extensions specified by ISO/IEC 24747:2009. */
#ifndef __STDC_WANT_MATH_SPEC_FUNCS__
# undef __STDC_WANT_MATH_SPEC_FUNCS__
#endif
/* Enable extensions on HP NonStop. */
#ifndef _TANDEM_SOURCE
# undef _TANDEM_SOURCE
#endif
/* Enable X/Open extensions. Define to 500 only if necessary
to make mbstate_t available. */
#ifndef _XOPEN_SOURCE
# undef _XOPEN_SOURCE
#endif
/* Version number of package */
#undef VERSION
/* Define WORDS_BIGENDIAN to 1 if your processor stores words with the most
significant byte first (like Motorola and SPARC, unlike Intel). */
#if defined AC_APPLE_UNIVERSAL_BUILD
# if defined __BIG_ENDIAN__
# define WORDS_BIGENDIAN 1
# endif
#else
# ifndef WORDS_BIGENDIAN
# undef WORDS_BIGENDIAN
# endif
#endif
/* Number of bits in a file offset, on hosts where this is settable. */
#undef _FILE_OFFSET_BITS
/* Define for large files, on AIX-style hosts. */
#undef _LARGE_FILES
/* Define for Solaris 2.5.1 so the uint32_t typedef from <sys/synch.h>,
<pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
#define below would cause a syntax error. */
#undef _UINT32_T
/* Define for Solaris 2.5.1 so the uint64_t typedef from <sys/synch.h>,
<pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
#define below would cause a syntax error. */
#undef _UINT64_T
/* Define for Solaris 2.5.1 so the uint8_t typedef from <sys/synch.h>,
<pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
#define below would cause a syntax error. */
#undef _UINT8_T
/* Define to rpl_ if the getopt replacement functions and variables should be
used. */
#undef __GETOPT_PREFIX
/* Define to the type of a signed integer type of width exactly 32 bits if
such a type exists and the standard includes do not define it. */
#undef int32_t
/* Define to the type of a signed integer type of width exactly 64 bits if
such a type exists and the standard includes do not define it. */
#undef int64_t
/* Define to the type of an unsigned integer type of width exactly 16 bits if
such a type exists and the standard includes do not define it. */
#undef uint16_t
/* Define to the type of an unsigned integer type of width exactly 32 bits if
such a type exists and the standard includes do not define it. */
#undef uint32_t
/* Define to the type of an unsigned integer type of width exactly 64 bits if
such a type exists and the standard includes do not define it. */
#undef uint64_t
/* Define to the type of an unsigned integer type of width exactly 8 bits if
such a type exists and the standard includes do not define it. */
#undef uint8_t
/* Define to the type of an unsigned integer type wide enough to hold a
pointer, if such a type exists, and if the system does not define it. */
#undef uintptr_t

17
debug/README Normal file
View File

@ -0,0 +1,17 @@
Debug tools
-----------
This directory contains a few tiny programs that may be helpful when
debugging XZ Utils.
These tools are not meant to be installed. Often one needs to edit
the source code a little to make the programs do the wanted things.
If you don't know how these programs could help you, it is likely
that they really are useless to you.
These aren't intended to be used as example programs. They take some
shortcuts here and there, which correct programs should not do. Many
possible errors (especially I/O errors) are ignored. Don't report
bugs or send patches to fix this kind of bugs.

39
debug/crc32.c Normal file
View File

@ -0,0 +1,39 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file crc32.c
/// \brief Primitive CRC32 calculation tool
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include "lzma.h"
#include <stdio.h>
int
main(void)
{
uint32_t crc = 0;
do {
uint8_t buf[BUFSIZ];
const size_t size = fread(buf, 1, sizeof(buf), stdin);
crc = lzma_crc32(buf, size, crc);
} while (!ferror(stdin) && !feof(stdin));
//printf("%08" PRIX32 "\n", crc);
// I want it little endian so it's easy to work with hex editor.
printf("%02" PRIX32 " ", crc & 0xFF);
printf("%02" PRIX32 " ", (crc >> 8) & 0xFF);
printf("%02" PRIX32 " ", (crc >> 16) & 0xFF);
printf("%02" PRIX32 " ", crc >> 24);
printf("\n");
return 0;
}

103
debug/full_flush.c Normal file
View File

@ -0,0 +1,103 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file full_flush.c
/// \brief Encode files using LZMA_FULL_FLUSH
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include "lzma.h"
#include <stdio.h>
#define CHUNK 64
static lzma_stream strm = LZMA_STREAM_INIT;
static FILE *file_in;
static void
encode(size_t size, lzma_action action)
{
uint8_t in[CHUNK];
uint8_t out[CHUNK];
lzma_ret ret;
do {
if (strm.avail_in == 0 && size > 0) {
const size_t amount = my_min(size, CHUNK);
strm.avail_in = fread(in, 1, amount, file_in);
strm.next_in = in;
size -= amount; // Intentionally not using avail_in.
}
strm.next_out = out;
strm.avail_out = CHUNK;
ret = lzma_code(&strm, size == 0 ? action : LZMA_RUN);
if (ret != LZMA_OK && ret != LZMA_STREAM_END) {
fprintf(stderr, "%s:%u: %s: ret == %d\n",
__FILE__, __LINE__, __func__, ret);
exit(1);
}
fwrite(out, 1, CHUNK - strm.avail_out, stdout);
} while (size > 0 || strm.avail_out == 0);
if ((action == LZMA_RUN && ret != LZMA_OK)
|| (action != LZMA_RUN && ret != LZMA_STREAM_END)) {
fprintf(stderr, "%s:%u: %s: ret == %d\n",
__FILE__, __LINE__, __func__, ret);
exit(1);
}
}
int
main(int argc, char **argv)
{
file_in = argc > 1 ? fopen(argv[1], "rb") : stdin;
// Config
lzma_options_lzma opt_lzma;
if (lzma_lzma_preset(&opt_lzma, 1)) {
fprintf(stderr, "preset failed\n");
exit(1);
}
lzma_filter filters[LZMA_FILTERS_MAX + 1];
filters[0].id = LZMA_FILTER_LZMA2;
filters[0].options = &opt_lzma;
filters[1].id = LZMA_VLI_UNKNOWN;
// Init
if (lzma_stream_encoder(&strm, filters, LZMA_CHECK_CRC32) != LZMA_OK) {
fprintf(stderr, "init failed\n");
exit(1);
}
// if (lzma_easy_encoder(&strm, 1)) {
// fprintf(stderr, "init failed\n");
// exit(1);
// }
// Encoding
encode(0, LZMA_FULL_FLUSH);
encode(6, LZMA_FULL_FLUSH);
encode(0, LZMA_FULL_FLUSH);
encode(7, LZMA_FULL_FLUSH);
encode(0, LZMA_FULL_FLUSH);
encode(0, LZMA_FINISH);
// Clean up
lzma_end(&strm);
return 0;
}

53
debug/hex2bin.c Normal file
View File

@ -0,0 +1,53 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file hex2bin.c
/// \brief Converts hexadecimal input strings to binary
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include <stdio.h>
#include <ctype.h>
static int
getbin(int x)
{
if (x >= '0' && x <= '9')
return x - '0';
if (x >= 'A' && x <= 'F')
return x - 'A' + 10;
return x - 'a' + 10;
}
int
main(void)
{
while (true) {
int byte = getchar();
if (byte == EOF)
return 0;
if (!isxdigit(byte))
continue;
const int digit = getchar();
if (digit == EOF || !isxdigit(digit)) {
fprintf(stderr, "Invalid input\n");
return 1;
}
byte = (getbin(byte) << 4) | getbin(digit);
if (putchar(byte) == EOF) {
perror(NULL);
return 1;
}
}
}

129
debug/known_sizes.c Normal file
View File

@ -0,0 +1,129 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file known_sizes.c
/// \brief Encodes .lzma Stream with sizes known in Block Header
///
/// The input file is encoded in RAM, and the known Compressed Size
/// and/or Uncompressed Size values are stored in the Block Header.
/// As of writing there's no such Stream encoder in liblzma.
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include "lzma.h"
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/unistd.h>
#include <stdio.h>
// Support file sizes up to 1 MiB. We use this for output space too, so files
// close to 1 MiB had better compress at least a little or we have a buffer
// overflow.
#define BUFFER_SIZE (1U << 20)
int
main(void)
{
// Allocate the buffers.
uint8_t *in = malloc(BUFFER_SIZE);
uint8_t *out = malloc(BUFFER_SIZE);
if (in == NULL || out == NULL)
return 1;
// Fill the input buffer.
const size_t in_size = fread(in, 1, BUFFER_SIZE, stdin);
// Filter setup
lzma_options_lzma opt_lzma;
if (lzma_lzma_preset(&opt_lzma, 1))
return 1;
lzma_filter filters[] = {
{
.id = LZMA_FILTER_LZMA2,
.options = &opt_lzma
},
{
.id = LZMA_VLI_UNKNOWN
}
};
lzma_block block = {
.check = LZMA_CHECK_CRC32,
.compressed_size = BUFFER_SIZE, // Worst case reserve
.uncompressed_size = in_size,
.filters = filters,
};
lzma_stream strm = LZMA_STREAM_INIT;
if (lzma_block_encoder(&strm, &block) != LZMA_OK)
return 1;
// Reserve space for Stream Header and Block Header. We need to
// calculate the size of the Block Header first.
if (lzma_block_header_size(&block) != LZMA_OK)
return 1;
size_t out_size = LZMA_STREAM_HEADER_SIZE + block.header_size;
strm.next_in = in;
strm.avail_in = in_size;
strm.next_out = out + out_size;
strm.avail_out = BUFFER_SIZE - out_size;
if (lzma_code(&strm, LZMA_FINISH) != LZMA_STREAM_END)
return 1;
out_size += strm.total_out;
if (lzma_block_header_encode(&block, out + LZMA_STREAM_HEADER_SIZE)
!= LZMA_OK)
return 1;
lzma_index *idx = lzma_index_init(NULL);
if (idx == NULL)
return 1;
if (lzma_index_append(idx, NULL, block.header_size + strm.total_out,
strm.total_in) != LZMA_OK)
return 1;
if (lzma_index_encoder(&strm, idx) != LZMA_OK)
return 1;
if (lzma_code(&strm, LZMA_RUN) != LZMA_STREAM_END)
return 1;
out_size += strm.total_out;
lzma_end(&strm);
lzma_index_end(idx, NULL);
// Encode the Stream Header and Stream Footer. backwards_size is
// needed only for the Stream Footer.
lzma_stream_flags sf = {
.backward_size = strm.total_out,
.check = block.check,
};
if (lzma_stream_header_encode(&sf, out) != LZMA_OK)
return 1;
if (lzma_stream_footer_encode(&sf, out + out_size) != LZMA_OK)
return 1;
out_size += LZMA_STREAM_HEADER_SIZE;
// Write out the file.
fwrite(out, 1, out_size, stdout);
return 0;
}

51
debug/memusage.c Normal file
View File

@ -0,0 +1,51 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file memusage.c
/// \brief Calculates memory usage using lzma_memory_usage()
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include "lzma.h"
#include <stdio.h>
int
main(void)
{
lzma_options_lzma lzma = {
.dict_size = (1U << 30) + (1U << 29),
.lc = 3,
.lp = 0,
.pb = 2,
.preset_dict = NULL,
.preset_dict_size = 0,
.mode = LZMA_MODE_NORMAL,
.nice_len = 48,
.mf = LZMA_MF_BT4,
.depth = 0,
};
/*
lzma_options_filter filters[] = {
{ LZMA_FILTER_LZMA1,
(lzma_options_lzma *)&lzma_preset_lzma[6 - 1] },
{ UINT64_MAX, NULL }
};
*/
lzma_filter filters[] = {
{ LZMA_FILTER_LZMA1, &lzma },
{ UINT64_MAX, NULL }
};
printf("Encoder: %10" PRIu64 " B\n",
lzma_raw_encoder_memusage(filters));
printf("Decoder: %10" PRIu64 " B\n",
lzma_raw_decoder_memusage(filters));
return 0;
}

36
debug/repeat.c Normal file
View File

@ -0,0 +1,36 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file repeat.c
/// \brief Repeats given string given times
///
/// This program can be useful when debugging run-length encoder in
/// the Subblock filter, especially the condition when repeat count
/// doesn't fit into 28-bit integer.
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include <stdio.h>
int
main(int argc, char **argv)
{
if (argc != 3) {
fprintf(stderr, "Usage: %s COUNT STRING\n", argv[0]);
exit(1);
}
unsigned long long count = strtoull(argv[1], NULL, 10);
const size_t size = strlen(argv[2]);
while (count-- != 0)
fwrite(argv[2], 1, size, stdout);
return !!(ferror(stdout) || fclose(stdout));
}

125
debug/sync_flush.c Normal file
View File

@ -0,0 +1,125 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file sync_flush.c
/// \brief Encode files using LZMA_SYNC_FLUSH
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include "sysdefs.h"
#include "lzma.h"
#include <stdio.h>
#define CHUNK 64
static lzma_stream strm = LZMA_STREAM_INIT;
static FILE *file_in;
static void
encode(size_t size, lzma_action action)
{
uint8_t in[CHUNK];
uint8_t out[CHUNK];
lzma_ret ret;
do {
if (strm.avail_in == 0 && size > 0) {
const size_t amount = my_min(size, CHUNK);
strm.avail_in = fread(in, 1, amount, file_in);
strm.next_in = in;
size -= amount; // Intentionally not using avail_in.
}
strm.next_out = out;
strm.avail_out = CHUNK;
ret = lzma_code(&strm, size == 0 ? action : LZMA_RUN);
if (ret != LZMA_OK && ret != LZMA_STREAM_END) {
fprintf(stderr, "%s:%u: %s: ret == %d\n",
__FILE__, __LINE__, __func__, ret);
exit(1);
}
fwrite(out, 1, CHUNK - strm.avail_out, stdout);
} while (size > 0 || strm.avail_out == 0);
if ((action == LZMA_RUN && ret != LZMA_OK)
|| (action != LZMA_RUN && ret != LZMA_STREAM_END)) {
fprintf(stderr, "%s:%u: %s: ret == %d\n",
__FILE__, __LINE__, __func__, ret);
exit(1);
}
}
int
main(int argc, char **argv)
{
file_in = argc > 1 ? fopen(argv[1], "rb") : stdin;
// Config
lzma_options_lzma opt_lzma = {
.dict_size = 1U << 16,
.lc = LZMA_LC_DEFAULT,
.lp = LZMA_LP_DEFAULT,
.pb = LZMA_PB_DEFAULT,
.preset_dict = NULL,
.mode = LZMA_MODE_NORMAL,
.nice_len = 32,
.mf = LZMA_MF_HC3,
.depth = 0,
};
lzma_options_delta opt_delta = {
.dist = 16
};
lzma_filter filters[LZMA_FILTERS_MAX + 1];
filters[0].id = LZMA_FILTER_LZMA2;
filters[0].options = &opt_lzma;
filters[1].id = LZMA_VLI_UNKNOWN;
// Init
if (lzma_stream_encoder(&strm, filters, LZMA_CHECK_CRC32) != LZMA_OK) {
fprintf(stderr, "init failed\n");
exit(1);
}
// Encoding
encode(0, LZMA_SYNC_FLUSH);
encode(6, LZMA_SYNC_FLUSH);
encode(0, LZMA_SYNC_FLUSH);
encode(7, LZMA_SYNC_FLUSH);
encode(0, LZMA_SYNC_FLUSH);
encode(0, LZMA_FINISH);
/*
encode(53, LZMA_SYNC_FLUSH);
opt_lzma.lc = 2;
opt_lzma.lp = 1;
opt_lzma.pb = 0;
if (lzma_filters_update(&strm, filters) != LZMA_OK) {
fprintf(stderr, "update failed\n");
exit(1);
}
encode(404, LZMA_FINISH);
*/
// Clean up
lzma_end(&strm);
return 0;
// Prevent useless warnings so we don't need to have special CFLAGS
// to disable -Werror.
(void)opt_lzma;
(void)opt_delta;
}

100
debug/translation.bash Normal file
View File

@ -0,0 +1,100 @@
#!/bin/bash
###############################################################################
#
# Script to check output of some translated messages
#
# This should be useful for translators to check that the translated strings
# look good. This doesn't make xz print all possible strings, but it should
# cover most of the cases where mistakes can easily happen.
#
# Give the path and filename of the xz executable as an argument. If no
# arguments are given, this script uses ../src/xz/xz (relative to the
# location of this script).
#
# You may want to pipe the output of this script to less -S to view the
# tables printed by xz --list on a 80-column terminal. On the other hand,
# viewing the other messages may be better without -S.
#
###############################################################################
#
# Author: Lasse Collin
#
# This file has been put into the public domain.
# You can do whatever you want with this file.
#
###############################################################################
set -e
# If an argument was given, use it to set the location of the xz executable.
unset XZ
if [ -n "$1" ]; then
XZ=$1
[ "x${XZ:0:1}" != "x/" ] && XZ="$PWD/$XZ"
fi
# Locate top_srcdir and go there.
top_srcdir="$(cd -- "$(dirname -- "$0")" && cd .. && pwd)"
cd -- "$top_srcdir"
# If XZ wasn't already set, use the default location.
XZ=${XZ-"$PWD/src/xz/xz"}
if [ "$(type -t "$XZ" || true)" != "file" ]; then
echo "Give the location of the xz executable as an argument" \
"to this script."
exit 1
fi
XZ=$(type -p -- "$XZ")
# Print the xz version and locale information.
echo "$XZ --version"
"$XZ" --version
echo
if [ -d .git ] && type git > /dev/null 2>&1; then
echo "Source code version in $PWD:"
git describe --abbrev=4
fi
echo
locale
echo
# Make the test files directory the current directory.
cd tests/files
# Put xz in PATH so that argv[0] stays short.
PATH=${XZ%/*}:$PATH
# Some of the test commands are error messages and thus don't
# return successfully.
set +e
for CMD in \
"xz --foobarbaz" \
"xz --memlimit=123abcd" \
"xz --memlimit=40MiB -6 /dev/null" \
"xz --memlimit=0 --info-memory" \
"xz --memlimit-compress=1234MiB --memlimit-decompress=50MiB --info-memory" \
"xz --verbose --verbose /dev/null | cat" \
"xz --lzma2=foobarbaz" \
"xz --lzma2=foobarbaz=abcd" \
"xz --lzma2=mf=abcd" \
"xz --lzma2=preset=foobarbaz" \
"xz --lzma2=mf=bt4,nice=2" \
"xz --lzma2=nice=50000" \
"xz --help" \
"xz --long-help" \
"xz --list good-*lzma2*" \
"xz --list good-1-check*" \
"xz --list --verbose good-*lzma2*" \
"xz --list --verbose good-1-check*" \
"xz --list --verbose --verbose good-*lzma2*" \
"xz --list --verbose --verbose good-1-check*" \
"xz --list --verbose --verbose unsupported-check.xz"
do
echo "-----------------------------------------------------------"
echo
echo "\$ $CMD"
eval "$CMD"
echo
done 2>&1

View File

@ -0,0 +1,31 @@
liblzma example programs
========================
Introduction
The examples are written so that the same comments aren't
repeated (much) in later files.
On POSIX systems, the examples should build by just typing "make".
The examples that use stdin or stdout don't set stdin and stdout
to binary mode. On systems where it matters (e.g. Windows) it is
possible that the examples won't work without modification.
List of examples
01_compress_easy.c Multi-call compression using
a compression preset
02_decompress.c Multi-call decompression
03_compress_custom.c Like 01_compress_easy.c but using
a custom filter chain
(x86 BCJ + LZMA2)
04_compress_easy_mt.c Multi-threaded multi-call
compression using a compression
preset

View File

@ -0,0 +1,297 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file 01_compress_easy.c
/// \brief Compress from stdin to stdout in multi-call mode
///
/// Usage: ./01_compress_easy PRESET < INFILE > OUTFILE
///
/// Example: ./01_compress_easy 6 < foo > foo.xz
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <lzma.h>
static void
show_usage_and_exit(const char *argv0)
{
fprintf(stderr, "Usage: %s PRESET < INFILE > OUTFILE\n"
"PRESET is a number 0-9 and can optionally be "
"followed by `e' to indicate extreme preset\n",
argv0);
exit(EXIT_FAILURE);
}
static uint32_t
get_preset(int argc, char **argv)
{
// One argument whose first char must be 0-9.
if (argc != 2 || argv[1][0] < '0' || argv[1][0] > '9')
show_usage_and_exit(argv[0]);
// Calculate the preste level 0-9.
uint32_t preset = argv[1][0] - '0';
// If there is a second char, it must be 'e'. It will set
// the LZMA_PRESET_EXTREME flag.
if (argv[1][1] != '\0') {
if (argv[1][1] != 'e' || argv[1][2] != '\0')
show_usage_and_exit(argv[0]);
preset |= LZMA_PRESET_EXTREME;
}
return preset;
}
static bool
init_encoder(lzma_stream *strm, uint32_t preset)
{
// Initialize the encoder using a preset. Set the integrity to check
// to CRC64, which is the default in the xz command line tool. If
// the .xz file needs to be decompressed with XZ Embedded, use
// LZMA_CHECK_CRC32 instead.
lzma_ret ret = lzma_easy_encoder(strm, preset, LZMA_CHECK_CRC64);
// Return successfully if the initialization went fine.
if (ret == LZMA_OK)
return true;
// Something went wrong. The possible errors are documented in
// lzma/container.h (src/liblzma/api/lzma/container.h in the source
// package or e.g. /usr/include/lzma/container.h depending on the
// install prefix).
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_OPTIONS_ERROR:
msg = "Specified preset is not supported";
break;
case LZMA_UNSUPPORTED_CHECK:
msg = "Specified integrity check is not supported";
break;
default:
// This is most likely LZMA_PROG_ERROR indicating a bug in
// this program or in liblzma. It is inconvenient to have a
// separate error message for errors that should be impossible
// to occur, but knowing the error code is important for
// debugging. That's why it is good to print the error code
// at least when there is no good error message to show.
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Error initializing the encoder: %s (error code %u)\n",
msg, ret);
return false;
}
static bool
compress(lzma_stream *strm, FILE *infile, FILE *outfile)
{
// This will be LZMA_RUN until the end of the input file is reached.
// This tells lzma_code() when there will be no more input.
lzma_action action = LZMA_RUN;
// Buffers to temporarily hold uncompressed input
// and compressed output.
uint8_t inbuf[BUFSIZ];
uint8_t outbuf[BUFSIZ];
// Initialize the input and output pointers. Initializing next_in
// and avail_in isn't really necessary when we are going to encode
// just one file since LZMA_STREAM_INIT takes care of initializing
// those already. But it doesn't hurt much and it will be needed
// if encoding more than one file like we will in 02_decompress.c.
//
// While we don't care about strm->total_in or strm->total_out in this
// example, it is worth noting that initializing the encoder will
// always reset total_in and total_out to zero. But the encoder
// initialization doesn't touch next_in, avail_in, next_out, or
// avail_out.
strm->next_in = NULL;
strm->avail_in = 0;
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
// Loop until the file has been successfully compressed or until
// an error occurs.
while (true) {
// Fill the input buffer if it is empty.
if (strm->avail_in == 0 && !feof(infile)) {
strm->next_in = inbuf;
strm->avail_in = fread(inbuf, 1, sizeof(inbuf),
infile);
if (ferror(infile)) {
fprintf(stderr, "Read error: %s\n",
strerror(errno));
return false;
}
// Once the end of the input file has been reached,
// we need to tell lzma_code() that no more input
// will be coming and that it should finish the
// encoding.
if (feof(infile))
action = LZMA_FINISH;
}
// Tell liblzma do the actual encoding.
//
// This reads up to strm->avail_in bytes of input starting
// from strm->next_in. avail_in will be decremented and
// next_in incremented by an equal amount to match the
// number of input bytes consumed.
//
// Up to strm->avail_out bytes of compressed output will be
// written starting from strm->next_out. avail_out and next_out
// will be incremented by an equal amount to match the number
// of output bytes written.
//
// The encoder has to do internal buffering, which means that
// it may take quite a bit of input before the same data is
// available in compressed form in the output buffer.
lzma_ret ret = lzma_code(strm, action);
// If the output buffer is full or if the compression finished
// successfully, write the data from the output buffer to
// the output file.
if (strm->avail_out == 0 || ret == LZMA_STREAM_END) {
// When lzma_code() has returned LZMA_STREAM_END,
// the output buffer is likely to be only partially
// full. Calculate how much new data there is to
// be written to the output file.
size_t write_size = sizeof(outbuf) - strm->avail_out;
if (fwrite(outbuf, 1, write_size, outfile)
!= write_size) {
fprintf(stderr, "Write error: %s\n",
strerror(errno));
return false;
}
// Reset next_out and avail_out.
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
}
// Normally the return value of lzma_code() will be LZMA_OK
// until everything has been encoded.
if (ret != LZMA_OK) {
// Once everything has been encoded successfully, the
// return value of lzma_code() will be LZMA_STREAM_END.
//
// It is important to check for LZMA_STREAM_END. Do not
// assume that getting ret != LZMA_OK would mean that
// everything has gone well.
if (ret == LZMA_STREAM_END)
return true;
// It's not LZMA_OK nor LZMA_STREAM_END,
// so it must be an error code. See lzma/base.h
// (src/liblzma/api/lzma/base.h in the source package
// or e.g. /usr/include/lzma/base.h depending on the
// install prefix) for the list and documentation of
// possible values. Most values listen in lzma_ret
// enumeration aren't possible in this example.
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_DATA_ERROR:
// This error is returned if the compressed
// or uncompressed size get near 8 EiB
// (2^63 bytes) because that's where the .xz
// file format size limits currently are.
// That is, the possibility of this error
// is mostly theoretical unless you are doing
// something very unusual.
//
// Note that strm->total_in and strm->total_out
// have nothing to do with this error. Changing
// those variables won't increase or decrease
// the chance of getting this error.
msg = "File size limits exceeded";
break;
default:
// This is most likely LZMA_PROG_ERROR, but
// if this program is buggy (or liblzma has
// a bug), it may be e.g. LZMA_BUF_ERROR or
// LZMA_OPTIONS_ERROR too.
//
// It is inconvenient to have a separate
// error message for errors that should be
// impossible to occur, but knowing the error
// code is important for debugging. That's why
// it is good to print the error code at least
// when there is no good error message to show.
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Encoder error: %s (error code %u)\n",
msg, ret);
return false;
}
}
}
extern int
main(int argc, char **argv)
{
// Get the preset number from the command line.
uint32_t preset = get_preset(argc, argv);
// Initialize a lzma_stream structure. When it is allocated on stack,
// it is simplest to use LZMA_STREAM_INIT macro like below. When it
// is allocated on heap, using memset(strmptr, 0, sizeof(*strmptr))
// works (as long as NULL pointers are represented with zero bits
// as they are on practically all computers today).
lzma_stream strm = LZMA_STREAM_INIT;
// Initialize the encoder. If it succeeds, compress from
// stdin to stdout.
bool success = init_encoder(&strm, preset);
if (success)
success = compress(&strm, stdin, stdout);
// Free the memory allocated for the encoder. If we were encoding
// multiple files, this would only need to be done after the last
// file. See 02_decompress.c for handling of multiple files.
//
// It is OK to call lzma_end() multiple times or when it hasn't been
// actually used except initialized with LZMA_STREAM_INIT.
lzma_end(&strm);
// Close stdout to catch possible write errors that can occur
// when pending data is flushed from the stdio buffers.
if (fclose(stdout)) {
fprintf(stderr, "Write error: %s\n", strerror(errno));
success = false;
}
return success ? EXIT_SUCCESS : EXIT_FAILURE;
}

View File

@ -0,0 +1,287 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file 02_decompress.c
/// \brief Decompress .xz files to stdout
///
/// Usage: ./02_decompress INPUT_FILES... > OUTFILE
///
/// Example: ./02_decompress foo.xz bar.xz > foobar
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <lzma.h>
static bool
init_decoder(lzma_stream *strm)
{
// Initialize a .xz decoder. The decoder supports a memory usage limit
// and a set of flags.
//
// The memory usage of the decompressor depends on the settings used
// to compress a .xz file. It can vary from less than a megabyte to
// a few gigabytes, but in practice (at least for now) it rarely
// exceeds 65 MiB because that's how much memory is required to
// decompress files created with "xz -9". Settings requiring more
// memory take extra effort to use and don't (at least for now)
// provide significantly better compression in most cases.
//
// Memory usage limit is useful if it is important that the
// decompressor won't consume gigabytes of memory. The need
// for limiting depends on the application. In this example,
// no memory usage limiting is used. This is done by setting
// the limit to UINT64_MAX.
//
// The .xz format allows concatenating compressed files as is:
//
// echo foo | xz > foobar.xz
// echo bar | xz >> foobar.xz
//
// When decompressing normal standalone .xz files, LZMA_CONCATENATED
// should always be used to support decompression of concatenated
// .xz files. If LZMA_CONCATENATED isn't used, the decoder will stop
// after the first .xz stream. This can be useful when .xz data has
// been embedded inside another file format.
//
// Flags other than LZMA_CONCATENATED are supported too, and can
// be combined with bitwise-or. See lzma/container.h
// (src/liblzma/api/lzma/container.h in the source package or e.g.
// /usr/include/lzma/container.h depending on the install prefix)
// for details.
lzma_ret ret = lzma_stream_decoder(
strm, UINT64_MAX, LZMA_CONCATENATED);
// Return successfully if the initialization went fine.
if (ret == LZMA_OK)
return true;
// Something went wrong. The possible errors are documented in
// lzma/container.h (src/liblzma/api/lzma/container.h in the source
// package or e.g. /usr/include/lzma/container.h depending on the
// install prefix).
//
// Note that LZMA_MEMLIMIT_ERROR is never possible here. If you
// specify a very tiny limit, the error will be delayed until
// the first headers have been parsed by a call to lzma_code().
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_OPTIONS_ERROR:
msg = "Unsupported decompressor flags";
break;
default:
// This is most likely LZMA_PROG_ERROR indicating a bug in
// this program or in liblzma. It is inconvenient to have a
// separate error message for errors that should be impossible
// to occur, but knowing the error code is important for
// debugging. That's why it is good to print the error code
// at least when there is no good error message to show.
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Error initializing the decoder: %s (error code %u)\n",
msg, ret);
return false;
}
static bool
decompress(lzma_stream *strm, const char *inname, FILE *infile, FILE *outfile)
{
// When LZMA_CONCATENATED flag was used when initializing the decoder,
// we need to tell lzma_code() when there will be no more input.
// This is done by setting action to LZMA_FINISH instead of LZMA_RUN
// in the same way as it is done when encoding.
//
// When LZMA_CONCATENATED isn't used, there is no need to use
// LZMA_FINISH to tell when all the input has been read, but it
// is still OK to use it if you want. When LZMA_CONCATENATED isn't
// used, the decoder will stop after the first .xz stream. In that
// case some unused data may be left in strm->next_in.
lzma_action action = LZMA_RUN;
uint8_t inbuf[BUFSIZ];
uint8_t outbuf[BUFSIZ];
strm->next_in = NULL;
strm->avail_in = 0;
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
while (true) {
if (strm->avail_in == 0 && !feof(infile)) {
strm->next_in = inbuf;
strm->avail_in = fread(inbuf, 1, sizeof(inbuf),
infile);
if (ferror(infile)) {
fprintf(stderr, "%s: Read error: %s\n",
inname, strerror(errno));
return false;
}
// Once the end of the input file has been reached,
// we need to tell lzma_code() that no more input
// will be coming. As said before, this isn't required
// if the LZMA_CONCATENATED flag isn't used when
// initializing the decoder.
if (feof(infile))
action = LZMA_FINISH;
}
lzma_ret ret = lzma_code(strm, action);
if (strm->avail_out == 0 || ret == LZMA_STREAM_END) {
size_t write_size = sizeof(outbuf) - strm->avail_out;
if (fwrite(outbuf, 1, write_size, outfile)
!= write_size) {
fprintf(stderr, "Write error: %s\n",
strerror(errno));
return false;
}
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
}
if (ret != LZMA_OK) {
// Once everything has been decoded successfully, the
// return value of lzma_code() will be LZMA_STREAM_END.
//
// It is important to check for LZMA_STREAM_END. Do not
// assume that getting ret != LZMA_OK would mean that
// everything has gone well or that when you aren't
// getting more output it must have successfully
// decoded everything.
if (ret == LZMA_STREAM_END)
return true;
// It's not LZMA_OK nor LZMA_STREAM_END,
// so it must be an error code. See lzma/base.h
// (src/liblzma/api/lzma/base.h in the source package
// or e.g. /usr/include/lzma/base.h depending on the
// install prefix) for the list and documentation of
// possible values. Many values listen in lzma_ret
// enumeration aren't possible in this example, but
// can be made possible by enabling memory usage limit
// or adding flags to the decoder initialization.
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_FORMAT_ERROR:
// .xz magic bytes weren't found.
msg = "The input is not in the .xz format";
break;
case LZMA_OPTIONS_ERROR:
// For example, the headers specify a filter
// that isn't supported by this liblzma
// version (or it hasn't been enabled when
// building liblzma, but no-one sane does
// that unless building liblzma for an
// embedded system). Upgrading to a newer
// liblzma might help.
//
// Note that it is unlikely that the file has
// accidentally became corrupt if you get this
// error. The integrity of the .xz headers is
// always verified with a CRC32, so
// unintentionally corrupt files can be
// distinguished from unsupported files.
msg = "Unsupported compression options";
break;
case LZMA_DATA_ERROR:
msg = "Compressed file is corrupt";
break;
case LZMA_BUF_ERROR:
// Typically this error means that a valid
// file has got truncated, but it might also
// be a damaged part in the file that makes
// the decoder think the file is truncated.
// If you prefer, you can use the same error
// message for this as for LZMA_DATA_ERROR.
msg = "Compressed file is truncated or "
"otherwise corrupt";
break;
default:
// This is most likely LZMA_PROG_ERROR.
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "%s: Decoder error: "
"%s (error code %u)\n",
inname, msg, ret);
return false;
}
}
}
extern int
main(int argc, char **argv)
{
if (argc <= 1) {
fprintf(stderr, "Usage: %s FILES...\n", argv[0]);
return EXIT_FAILURE;
}
lzma_stream strm = LZMA_STREAM_INIT;
bool success = true;
// Try to decompress all files.
for (int i = 1; i < argc; ++i) {
if (!init_decoder(&strm)) {
// Decoder initialization failed. There's no point
// to retry it so we need to exit.
success = false;
break;
}
FILE *infile = fopen(argv[i], "rb");
if (infile == NULL) {
fprintf(stderr, "%s: Error opening the "
"input file: %s\n",
argv[i], strerror(errno));
success = false;
} else {
success &= decompress(&strm, argv[i], infile, stdout);
fclose(infile);
}
}
// Free the memory allocated for the decoder. This only needs to be
// done after the last file.
lzma_end(&strm);
if (fclose(stdout)) {
fprintf(stderr, "Write error: %s\n", strerror(errno));
success = false;
}
return success ? EXIT_SUCCESS : EXIT_FAILURE;
}

View File

@ -0,0 +1,193 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file 03_compress_custom.c
/// \brief Compress in multi-call mode using x86 BCJ and LZMA2
///
/// Usage: ./03_compress_custom < INFILE > OUTFILE
///
/// Example: ./03_compress_custom < foo > foo.xz
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <lzma.h>
static bool
init_encoder(lzma_stream *strm)
{
// Use the default preset (6) for LZMA2.
//
// The lzma_options_lzma structure and the lzma_lzma_preset() function
// are declared in lzma/lzma12.h (src/liblzma/api/lzma/lzma12.h in the
// source package or e.g. /usr/include/lzma/lzma12.h depending on
// the install prefix).
lzma_options_lzma opt_lzma2;
if (lzma_lzma_preset(&opt_lzma2, LZMA_PRESET_DEFAULT)) {
// It should never fail because the default preset
// (and presets 0-9 optionally with LZMA_PRESET_EXTREME)
// are supported by all stable liblzma versions.
//
// (The encoder initialization later in this function may
// still fail due to unsupported preset *if* the features
// required by the preset have been disabled at build time,
// but no-one does such things except on embedded systems.)
fprintf(stderr, "Unsupported preset, possibly a bug\n");
return false;
}
// Now we could customize the LZMA2 options if we wanted. For example,
// we could set the the dictionary size (opt_lzma2.dict_size) to
// something else than the default (8 MiB) of the default preset.
// See lzma/lzma12.h for details of all LZMA2 options.
//
// The x86 BCJ filter will try to modify the x86 instruction stream so
// that LZMA2 can compress it better. The x86 BCJ filter doesn't need
// any options so it will be set to NULL below.
//
// Construct the filter chain. The uncompressed data goes first to
// the first filter in the array, in this case the x86 BCJ filter.
// The array is always terminated by setting .id = LZMA_VLI_UNKNOWN.
//
// See lzma/filter.h for more information about the lzma_filter
// structure.
lzma_filter filters[] = {
{ .id = LZMA_FILTER_X86, .options = NULL },
{ .id = LZMA_FILTER_LZMA2, .options = &opt_lzma2 },
{ .id = LZMA_VLI_UNKNOWN, .options = NULL },
};
// Initialize the encoder using the custom filter chain.
lzma_ret ret = lzma_stream_encoder(strm, filters, LZMA_CHECK_CRC64);
if (ret == LZMA_OK)
return true;
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_OPTIONS_ERROR:
// We are no longer using a plain preset so this error
// message has been edited accordingly compared to
// 01_compress_easy.c.
msg = "Specified filter chain is not supported";
break;
case LZMA_UNSUPPORTED_CHECK:
msg = "Specified integrity check is not supported";
break;
default:
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Error initializing the encoder: %s (error code %u)\n",
msg, ret);
return false;
}
// This function is identical to the one in 01_compress_easy.c.
static bool
compress(lzma_stream *strm, FILE *infile, FILE *outfile)
{
lzma_action action = LZMA_RUN;
uint8_t inbuf[BUFSIZ];
uint8_t outbuf[BUFSIZ];
strm->next_in = NULL;
strm->avail_in = 0;
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
while (true) {
if (strm->avail_in == 0 && !feof(infile)) {
strm->next_in = inbuf;
strm->avail_in = fread(inbuf, 1, sizeof(inbuf),
infile);
if (ferror(infile)) {
fprintf(stderr, "Read error: %s\n",
strerror(errno));
return false;
}
if (feof(infile))
action = LZMA_FINISH;
}
lzma_ret ret = lzma_code(strm, action);
if (strm->avail_out == 0 || ret == LZMA_STREAM_END) {
size_t write_size = sizeof(outbuf) - strm->avail_out;
if (fwrite(outbuf, 1, write_size, outfile)
!= write_size) {
fprintf(stderr, "Write error: %s\n",
strerror(errno));
return false;
}
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
}
if (ret != LZMA_OK) {
if (ret == LZMA_STREAM_END)
return true;
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_DATA_ERROR:
msg = "File size limits exceeded";
break;
default:
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Encoder error: %s (error code %u)\n",
msg, ret);
return false;
}
}
}
extern int
main(void)
{
lzma_stream strm = LZMA_STREAM_INIT;
bool success = init_encoder(&strm);
if (success)
success = compress(&strm, stdin, stdout);
lzma_end(&strm);
if (fclose(stdout)) {
fprintf(stderr, "Write error: %s\n", strerror(errno));
success = false;
}
return success ? EXIT_SUCCESS : EXIT_FAILURE;
}

View File

@ -0,0 +1,206 @@
///////////////////////////////////////////////////////////////////////////////
//
/// \file 04_compress_easy_mt.c
/// \brief Compress in multi-call mode using LZMA2 in multi-threaded mode
///
/// Usage: ./04_compress_easy_mt < INFILE > OUTFILE
///
/// Example: ./04_compress_easy_mt < foo > foo.xz
//
// Author: Lasse Collin
//
// This file has been put into the public domain.
// You can do whatever you want with this file.
//
///////////////////////////////////////////////////////////////////////////////
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <lzma.h>
static bool
init_encoder(lzma_stream *strm)
{
// The threaded encoder takes the options as pointer to
// a lzma_mt structure.
lzma_mt mt = {
// No flags are needed.
.flags = 0,
// Let liblzma determine a sane block size.
.block_size = 0,
// Use no timeout for lzma_code() calls by setting timeout
// to zero. That is, sometimes lzma_code() might block for
// a long time (from several seconds to even minutes).
// If this is not OK, for example due to progress indicator
// needing updates, specify a timeout in milliseconds here.
// See the documentation of lzma_mt in lzma/container.h for
// information how to choose a reasonable timeout.
.timeout = 0,
// Use the default preset (6) for LZMA2.
// To use a preset, filters must be set to NULL.
.preset = LZMA_PRESET_DEFAULT,
.filters = NULL,
// Use CRC64 for integrity checking. See also
// 01_compress_easy.c about choosing the integrity check.
.check = LZMA_CHECK_CRC64,
};
// Detect how many threads the CPU supports.
mt.threads = lzma_cputhreads();
// If the number of CPU cores/threads cannot be detected,
// use one thread. Note that this isn't the same as the normal
// single-threaded mode as this will still split the data into
// blocks and use more RAM than the normal single-threaded mode.
// You may want to consider using lzma_easy_encoder() or
// lzma_stream_encoder() instead of lzma_stream_encoder_mt() if
// lzma_cputhreads() returns 0 or 1.
if (mt.threads == 0)
mt.threads = 1;
// If the number of CPU cores/threads exceeds threads_max,
// limit the number of threads to keep memory usage lower.
// The number 8 is arbitrarily chosen and may be too low or
// high depending on the compression preset and the computer
// being used.
//
// FIXME: A better way could be to check the amount of RAM
// (or available RAM) and use lzma_stream_encoder_mt_memusage()
// to determine if the number of threads should be reduced.
const uint32_t threads_max = 8;
if (mt.threads > threads_max)
mt.threads = threads_max;
// Initialize the threaded encoder.
lzma_ret ret = lzma_stream_encoder_mt(strm, &mt);
if (ret == LZMA_OK)
return true;
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_OPTIONS_ERROR:
// We are no longer using a plain preset so this error
// message has been edited accordingly compared to
// 01_compress_easy.c.
msg = "Specified filter chain is not supported";
break;
case LZMA_UNSUPPORTED_CHECK:
msg = "Specified integrity check is not supported";
break;
default:
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Error initializing the encoder: %s (error code %u)\n",
msg, ret);
return false;
}
// This function is identical to the one in 01_compress_easy.c.
static bool
compress(lzma_stream *strm, FILE *infile, FILE *outfile)
{
lzma_action action = LZMA_RUN;
uint8_t inbuf[BUFSIZ];
uint8_t outbuf[BUFSIZ];
strm->next_in = NULL;
strm->avail_in = 0;
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
while (true) {
if (strm->avail_in == 0 && !feof(infile)) {
strm->next_in = inbuf;
strm->avail_in = fread(inbuf, 1, sizeof(inbuf),
infile);
if (ferror(infile)) {
fprintf(stderr, "Read error: %s\n",
strerror(errno));
return false;
}
if (feof(infile))
action = LZMA_FINISH;
}
lzma_ret ret = lzma_code(strm, action);
if (strm->avail_out == 0 || ret == LZMA_STREAM_END) {
size_t write_size = sizeof(outbuf) - strm->avail_out;
if (fwrite(outbuf, 1, write_size, outfile)
!= write_size) {
fprintf(stderr, "Write error: %s\n",
strerror(errno));
return false;
}
strm->next_out = outbuf;
strm->avail_out = sizeof(outbuf);
}
if (ret != LZMA_OK) {
if (ret == LZMA_STREAM_END)
return true;
const char *msg;
switch (ret) {
case LZMA_MEM_ERROR:
msg = "Memory allocation failed";
break;
case LZMA_DATA_ERROR:
msg = "File size limits exceeded";
break;
default:
msg = "Unknown error, possibly a bug";
break;
}
fprintf(stderr, "Encoder error: %s (error code %u)\n",
msg, ret);
return false;
}
}
}
extern int
main(void)
{
lzma_stream strm = LZMA_STREAM_INIT;
bool success = init_encoder(&strm);
if (success)
success = compress(&strm, stdin, stdout);
lzma_end(&strm);
if (fclose(stdout)) {
fprintf(stderr, "Write error: %s\n", strerror(errno));
success = false;
}
return success ? EXIT_SUCCESS : EXIT_FAILURE;
}

View File

@ -0,0 +1,127 @@
/*
* xz_pipe_comp.c
* A simple example of pipe-only xz compressor implementation.
* version: 2010-07-12 - by Daniel Mealha Cabrita
* Not copyrighted -- provided to the public domain.
*
* Compiling:
* Link with liblzma. GCC example:
* $ gcc -llzma xz_pipe_comp.c -o xz_pipe_comp
*
* Usage example:
* $ cat some_file | ./xz_pipe_comp > some_file.xz
*/
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdbool.h>
#include <lzma.h>
/* COMPRESSION SETTINGS */
/* analogous to xz CLI options: -0 to -9 */
#define COMPRESSION_LEVEL 6
/* boolean setting, analogous to xz CLI option: -e */
#define COMPRESSION_EXTREME true
/* see: /usr/include/lzma/check.h LZMA_CHECK_* */
#define INTEGRITY_CHECK LZMA_CHECK_CRC64
/* read/write buffer sizes */
#define IN_BUF_MAX 4096
#define OUT_BUF_MAX 4096
/* error codes */
#define RET_OK 0
#define RET_ERROR_INIT 1
#define RET_ERROR_INPUT 2
#define RET_ERROR_OUTPUT 3
#define RET_ERROR_COMPRESSION 4
/* note: in_file and out_file must be open already */
int xz_compress (FILE *in_file, FILE *out_file)
{
uint32_t preset = COMPRESSION_LEVEL | (COMPRESSION_EXTREME ? LZMA_PRESET_EXTREME : 0);
lzma_check check = INTEGRITY_CHECK;
lzma_stream strm = LZMA_STREAM_INIT; /* alloc and init lzma_stream struct */
uint8_t in_buf [IN_BUF_MAX];
uint8_t out_buf [OUT_BUF_MAX];
size_t in_len; /* length of useful data in in_buf */
size_t out_len; /* length of useful data in out_buf */
bool in_finished = false;
bool out_finished = false;
lzma_action action;
lzma_ret ret_xz;
int ret;
ret = RET_OK;
/* initialize xz encoder */
ret_xz = lzma_easy_encoder (&strm, preset, check);
if (ret_xz != LZMA_OK) {
fprintf (stderr, "lzma_easy_encoder error: %d\n", (int) ret_xz);
return RET_ERROR_INIT;
}
while ((! in_finished) && (! out_finished)) {
/* read incoming data */
in_len = fread (in_buf, 1, IN_BUF_MAX, in_file);
if (feof (in_file)) {
in_finished = true;
}
if (ferror (in_file)) {
in_finished = true;
ret = RET_ERROR_INPUT;
}
strm.next_in = in_buf;
strm.avail_in = in_len;
/* if no more data from in_buf, flushes the
internal xz buffers and closes the xz data
with LZMA_FINISH */
action = in_finished ? LZMA_FINISH : LZMA_RUN;
/* loop until there's no pending compressed output */
do {
/* out_buf is clean at this point */
strm.next_out = out_buf;
strm.avail_out = OUT_BUF_MAX;
/* compress data */
ret_xz = lzma_code (&strm, action);
if ((ret_xz != LZMA_OK) && (ret_xz != LZMA_STREAM_END)) {
fprintf (stderr, "lzma_code error: %d\n", (int) ret_xz);
out_finished = true;
ret = RET_ERROR_COMPRESSION;
} else {
/* write compressed data */
out_len = OUT_BUF_MAX - strm.avail_out;
fwrite (out_buf, 1, out_len, out_file);
if (ferror (out_file)) {
out_finished = true;
ret = RET_ERROR_OUTPUT;
}
}
} while (strm.avail_out == 0);
}
lzma_end (&strm);
return ret;
}
int main ()
{
int ret;
ret = xz_compress (stdin, stdout);
return ret;
}

View File

@ -0,0 +1,123 @@
/*
* xz_pipe_decomp.c
* A simple example of pipe-only xz decompressor implementation.
* version: 2012-06-14 - by Daniel Mealha Cabrita
* Not copyrighted -- provided to the public domain.
*
* Compiling:
* Link with liblzma. GCC example:
* $ gcc -llzma xz_pipe_decomp.c -o xz_pipe_decomp
*
* Usage example:
* $ cat some_file.xz | ./xz_pipe_decomp > some_file
*/
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdbool.h>
#include <lzma.h>
/* read/write buffer sizes */
#define IN_BUF_MAX 4096
#define OUT_BUF_MAX 4096
/* error codes */
#define RET_OK 0
#define RET_ERROR_INIT 1
#define RET_ERROR_INPUT 2
#define RET_ERROR_OUTPUT 3
#define RET_ERROR_DECOMPRESSION 4
/* note: in_file and out_file must be open already */
int xz_decompress (FILE *in_file, FILE *out_file)
{
lzma_stream strm = LZMA_STREAM_INIT; /* alloc and init lzma_stream struct */
const uint32_t flags = LZMA_TELL_UNSUPPORTED_CHECK | LZMA_CONCATENATED;
const uint64_t memory_limit = UINT64_MAX; /* no memory limit */
uint8_t in_buf [IN_BUF_MAX];
uint8_t out_buf [OUT_BUF_MAX];
size_t in_len; /* length of useful data in in_buf */
size_t out_len; /* length of useful data in out_buf */
bool in_finished = false;
bool out_finished = false;
lzma_action action;
lzma_ret ret_xz;
int ret;
ret = RET_OK;
/* initialize xz decoder */
ret_xz = lzma_stream_decoder (&strm, memory_limit, flags);
if (ret_xz != LZMA_OK) {
fprintf (stderr, "lzma_stream_decoder error: %d\n", (int) ret_xz);
return RET_ERROR_INIT;
}
while ((! in_finished) && (! out_finished)) {
/* read incoming data */
in_len = fread (in_buf, 1, IN_BUF_MAX, in_file);
if (feof (in_file)) {
in_finished = true;
}
if (ferror (in_file)) {
in_finished = true;
ret = RET_ERROR_INPUT;
}
strm.next_in = in_buf;
strm.avail_in = in_len;
/* if no more data from in_buf, flushes the
internal xz buffers and closes the decompressed data
with LZMA_FINISH */
action = in_finished ? LZMA_FINISH : LZMA_RUN;
/* loop until there's no pending decompressed output */
do {
/* out_buf is clean at this point */
strm.next_out = out_buf;
strm.avail_out = OUT_BUF_MAX;
/* decompress data */
ret_xz = lzma_code (&strm, action);
if ((ret_xz != LZMA_OK) && (ret_xz != LZMA_STREAM_END)) {
fprintf (stderr, "lzma_code error: %d\n", (int) ret_xz);
out_finished = true;
ret = RET_ERROR_DECOMPRESSION;
} else {
/* write decompressed data */
out_len = OUT_BUF_MAX - strm.avail_out;
fwrite (out_buf, 1, out_len, out_file);
if (ferror (out_file)) {
out_finished = true;
ret = RET_ERROR_OUTPUT;
}
}
} while (strm.avail_out == 0);
}
/* Bug fix (2012-06-14): If no errors were detected, check
that the last lzma_code() call returned LZMA_STREAM_END.
If not, the file is probably truncated. */
if ((ret == RET_OK) && (ret_xz != LZMA_STREAM_END)) {
fprintf (stderr, "Input truncated or corrupt\n");
ret = RET_ERROR_DECOMPRESSION;
}
lzma_end (&strm);
return ret;
}
int main ()
{
int ret;
ret = xz_decompress (stdin, stdout);
return ret;
}

244
doc/faq.txt Normal file
View File

@ -0,0 +1,244 @@
XZ Utils FAQ
============
Q: What do the letters XZ mean?
A: Nothing. They are just two letters, which come from the file format
suffix .xz. The .xz suffix was selected, because it seemed to be
pretty much unused. It has no deeper meaning.
Q: What are LZMA and LZMA2?
A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name
of the compression algorithm designed by Igor Pavlov for 7-Zip.
LZMA is based on LZ77 and range encoding.
LZMA2 is an updated version of the original LZMA to fix a couple of
practical issues. In context of XZ Utils, LZMA is called LZMA1 to
emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the
primary compression algorithm in the .xz file format.
Q: There are many LZMA related projects. How does XZ Utils relate to them?
A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly
a subset of the 7-Zip source tree.
p7zip is 7-Zip's command-line tools ported to POSIX-like systems.
LZMA Utils provide a gzip-like lzma tool for POSIX-like systems.
LZMA Utils are based on LZMA SDK. XZ Utils are the successor to
LZMA Utils.
There are several other projects using LZMA. Most are more or less
based on LZMA SDK. See <https://7-zip.org/links.html>.
Q: Why is liblzma named liblzma if its primary file format is .xz?
Shouldn't it be e.g. libxz?
A: When the designing of the .xz format began, the idea was to replace
the .lzma format and use the same .lzma suffix. It would have been
quite OK to reuse the suffix when there were very few .lzma files
around. However, the old .lzma format became popular before the
new format was finished. The new format was renamed to .xz but the
name of liblzma wasn't changed.
Q: Do XZ Utils support the .7z format?
A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z
files.
Q: I have many .tar.7z files. Can I convert them to .tar.xz without
spending hours recompressing the data?
A: In the "extra" directory, there is a script named 7z2lzma.bash which
is able to convert some .7z files to the .lzma format (not .xz). It
needs the 7za (or 7z) command from p7zip. The script may silently
produce corrupt output if certain assumptions are not met, so
decompress the resulting .lzma file and compare it against the
original before deleting the original file!
Q: I have many .lzma files. Can I quickly convert them to the .xz format?
A: For now, no. Since XZ Utils supports the .lzma format, it's usually
not too bad to keep the old files in the old format. If you want to
do the conversion anyway, you need to decompress the .lzma files and
then recompress to the .xz format.
Technically, there is a way to make the conversion relatively fast
(roughly twice the time that normal decompression takes). Writing
such a tool would take quite a bit of time though, and would probably
be useful to only a few people. If you really want such a conversion
tool, contact Lasse Collin and offer some money.
Q: I have installed xz, but my tar doesn't recognize .tar.xz files.
How can I extract .tar.xz files?
A: xz -dc foo.tar.xz | tar xf -
Q: Can I recover parts of a broken .xz file (e.g. a corrupted CD-R)?
A: It may be possible if the file consists of multiple blocks, which
typically is not the case if the file was created in single-threaded
mode. There is no recovery program yet.
Q: Is (some part of) XZ Utils patented?
A: Lasse Collin is not aware of any patents that could affect XZ Utils.
However, due to the nature of software patents, it's not possible to
guarantee that XZ Utils isn't affected by any third party patent(s).
Q: Where can I find documentation about the file format and algorithms?
A: The .xz format is documented in xz-file-format.txt. It is a container
format only, and doesn't include descriptions of any non-trivial
filters.
Documenting LZMA and LZMA2 is planned, but for now, there is no other
documentation than the source code. Before you begin, you should know
the basics of LZ77 and range-coding algorithms. LZMA is based on LZ77,
but LZMA is a lot more complex. Range coding is used to compress
the final bitstream like Huffman coding is used in Deflate.
Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma?
A: BCJ filter is called "x86" in liblzma. BCJ2 is not included,
because it requires using more than one encoded output stream.
Q: I need to use a script that runs "xz -9". On a system with 256 MiB
of RAM, xz says that it cannot allocate memory. Can I make the
script work without modifying it?
A: Set a default memory usage limit for compression. You can do it e.g.
in a shell initialization script such as ~/.bashrc or /etc/profile:
XZ_DEFAULTS=--memlimit-compress=150MiB
export XZ_DEFAULTS
xz will then scale the compression settings down so that the given
memory usage limit is not reached. This way xz shouldn't run out
of memory.
Check also that memory-related resource limits are high enough.
On most systems, "ulimit -a" will show the current resource limits.
Q: How do I create files that can be decompressed with XZ Embedded?
A: See the documentation in XZ Embedded. In short, something like
this is a good start:
xz --check=crc32 --lzma2=preset=6e,dict=64KiB
Or if a BCJ filter is needed too, e.g. if compressing
a kernel image for PowerPC:
xz --check=crc32 --powerpc --lzma2=preset=6e,dict=64KiB
Adjust the dictionary size to get a good compromise between
compression ratio and decompressor memory usage. Note that
in single-call decompression mode of XZ Embedded, a big
dictionary doesn't increase memory usage.
Q: How is multi-threaded compression implemented in XZ Utils?
A: The simplest method is splitting the uncompressed data into blocks
and compressing them in parallel independent from each other.
This is currently the only threading method supported in XZ Utils.
Since the blocks are compressed independently, they can also be
decompressed independently. Together with the index feature in .xz,
this allows using threads to create .xz files for random-access
reading. This also makes threaded decompression possible.
The independent blocks method has a couple of disadvantages too. It
will compress worse than a single-block method. Often the difference
is not too big (maybe 1-2 %) but sometimes it can be too big. Also,
the memory usage of the compressor increases linearly when adding
threads.
At least two other threading methods are possible but these haven't
been implemented in XZ Utils:
Match finder parallelization has been in 7-Zip for ages. It doesn't
affect compression ratio or memory usage significantly. Among the
three threading methods, only this is useful when compressing small
files (files that are not significantly bigger than the dictionary).
Unfortunately this method scales only to about two CPU cores.
The third method is pigz-style threading (I use that name, because
pigz <https://www.zlib.net/pigz/> uses that method). It doesn't
affect compression ratio significantly and scales to many cores.
The memory usage scales linearly when threads are added. This isn't
significant with pigz, because Deflate uses only a 32 KiB dictionary,
but with LZMA2 the memory usage will increase dramatically just like
with the independent-blocks method. There is also a constant
computational overhead, which may make pigz-method a bit dull on
dual-core compared to the parallel match finder method, but with more
cores the overhead is not a big deal anymore.
Combining the threading methods will be possible and also useful.
For example, combining match finder parallelization with pigz-style
threading or independent-blocks-threading can cut the memory usage
by 50 %.
Q: I told xz to use many threads but it is using only one or two
processor cores. What is wrong?
A: Since multi-threaded compression is done by splitting the data into
blocks that are compressed individually, if the input file is too
small for the block size, then many threads cannot be used. The
default block size increases when the compression level is
increased. For example, xz -6 uses 8 MiB LZMA2 dictionary and
24 MiB blocks, and xz -9 uses 64 MiB LZMA dictionary and 192 MiB
blocks. If the input file is 100 MiB, xz -6 can use five threads
of which one will finish quickly as it has only 4 MiB to compress.
However, for the same file, xz -9 can only use one thread.
One can adjust block size with --block-size=SIZE but making the
block size smaller than LZMA2 dictionary is waste of RAM: using
xz -9 with 6 MiB blocks isn't any better than using xz -6 with
6 MiB blocks. The default settings use a block size bigger than
the LZMA2 dictionary size because this was seen as a reasonable
compromise between RAM usage and compression ratio.
When decompressing, the ability to use threads depends on how the
file was created. If it was created in multi-threaded mode then
it can be decompressed in multi-threaded mode too if there are
multiple blocks in the file.
Q: How do I build a program that needs liblzmadec (lzmadec.h)?
A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no
liblzmadec. The code using liblzmadec should be ported to use
liblzma instead. If you cannot or don't want to do that, download
LZMA Utils from <https://tukaani.org/lzma/>.
Q: The default build of liblzma is too big. How can I make it smaller?
A: Give --enable-small to the configure script. Use also appropriate
--enable or --disable options to include only those filter encoders
and decoders and integrity checks that you actually need. Use
CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize
for size. See INSTALL for information about configure options.
If the result is still too big, take a look at XZ Embedded. It is
a separate project, which provides a limited but significantly
smaller XZ decoder implementation than XZ Utils. You can find it
at <https://tukaani.org/xz/embedded.html>.

150
doc/history.txt Normal file
View File

@ -0,0 +1,150 @@
History of LZMA Utils and XZ Utils
==================================
Tukaani distribution
In 2005, there was a small group working on the Tukaani distribution,
which was a Slackware fork. One of the project's goals was to fit the
distro on a single 700 MiB ISO-9660 image. Using LZMA instead of gzip
helped a lot. Roughly speaking, one could fit data that took 1000 MiB
in gzipped form into 700 MiB with LZMA. Naturally, the compression
ratio varied across packages, but this was what we got on average.
Slackware packages have traditionally had .tgz as the filename suffix,
which is an abbreviation of .tar.gz. A logical naming for LZMA
compressed packages was .tlz, being an abbreviation of .tar.lzma.
At the end of the year 2007, there was no distribution under the
Tukaani project anymore, but development of LZMA Utils was kept going.
Still, there were .tlz packages around, because at least Vector Linux
(a Slackware based distribution) used LZMA for its packages.
First versions of the modified pkgtools used the LZMA_Alone tool from
Igor Pavlov's LZMA SDK as is. It was fine, because users wouldn't need
to interact with LZMA_Alone directly. But people soon wanted to use
LZMA for other files too, and the interface of LZMA_Alone wasn't
comfortable for those used to gzip and bzip2.
First steps of LZMA Utils
The first version of LZMA Utils (4.22.0) included a shell script called
lzmash. It was a wrapper that had a gzip-like command-line interface. It
used the LZMA_Alone tool from LZMA SDK to do all the real work. zgrep,
zdiff, and related scripts from gzip were adapted to work with LZMA and
were part of the first LZMA Utils release too.
LZMA Utils 4.22.0 included also lzmadec, which was a small (less than
10 KiB) decoder-only command-line tool. It was written on top of the
decoder-only C code found from the LZMA SDK. lzmadec was convenient in
situations where LZMA_Alone (a few hundred KiB) would be too big.
lzmash and lzmadec were written by Lasse Collin.
Second generation
The lzmash script was an ugly and not very secure hack. The last
version of LZMA Utils to use lzmash was 4.27.1.
LZMA Utils 4.32.0beta1 introduced a new lzma command-line tool written
by Ville Koskinen. It was written in C++, and used the encoder and
decoder from C++ LZMA SDK with some little modifications. This tool
replaced both the lzmash script and the LZMA_Alone command-line tool
in LZMA Utils.
Introducing this new tool caused some temporary incompatibilities,
because the LZMA_Alone executable was simply named lzma like the new
command-line tool, but they had a completely different command-line
interface. The file format was still the same.
Lasse wrote liblzmadec, which was a small decoder-only library based
on the C code found from LZMA SDK. liblzmadec had an API similar to
zlib, although there were some significant differences, which made it
non-trivial to use it in some applications designed for zlib and
libbzip2.
The lzmadec command-line tool was converted to use liblzmadec.
Alexandre Sauvé helped converting the build system to use GNU
Autotools. This made it easier to test for certain less portable
features needed by the new command-line tool.
Since the new command-line tool never got completely finished (for
example, it didn't support the LZMA_OPT environment variable), the
intent was to not call 4.32.x stable. Similarly, liblzmadec wasn't
polished, but appeared to work well enough, so some people started
using it too.
Because the development of the third generation of LZMA Utils was
delayed considerably (3-4 years), the 4.32.x branch had to be kept
maintained. It got some bug fixes now and then, and finally it was
decided to call it stable, although most of the missing features were
never added.
File format problems
The file format used by LZMA_Alone was primitive. It was designed with
embedded systems in mind, and thus provided only a minimal set of
features. The two biggest problems for non-embedded use were the lack
of magic bytes and an integrity check.
Igor and Lasse started developing a new file format with some help
from Ville Koskinen. Also Mark Adler, Mikko Pouru, H. Peter Anvin,
and Lars Wirzenius helped with some minor things at some point of the
development. Designing the new format took quite a long time (actually,
too long a time would be a more appropriate expression). It was mostly
because Lasse was quite slow at getting things done due to personal
reasons.
Originally the new format was supposed to use the same .lzma suffix
that was already used by the old file format. Switching to the new
format wouldn't have caused much trouble when the old format wasn't
used by many people. But since the development of the new format took
such a long time, the old format got quite popular, and it was decided
that the new file format must use a different suffix.
It was decided to use .xz as the suffix of the new file format. The
first stable .xz file format specification was finally released in
December 2008. In addition to fixing the most obvious problems of
the old .lzma format, the .xz format added some new features like
support for multiple filters (compression algorithms), filter chaining
(like piping on the command line), and limited random-access reading.
Currently the primary compression algorithm used in .xz is LZMA2.
It is an extension on top of the original LZMA to fix some practical
problems: LZMA2 adds support for flushing the encoder, uncompressed
chunks, eases stateful decoder implementations, and improves support
for multithreading. Since LZMA2 is better than the original LZMA, the
original LZMA is not supported in .xz.
Transition to XZ Utils
The early versions of XZ Utils were called LZMA Utils. The first
releases were 4.42.0alphas. They dropped the rest of the C++ LZMA SDK.
The code was still directly based on LZMA SDK but ported to C and
converted from a callback API to a stateful API. Later, Igor Pavlov
made a C version of the LZMA encoder too; these ports from C++ to C
were independent in LZMA SDK and LZMA Utils.
The core of the new LZMA Utils was liblzma, a compression library with
a zlib-like API. liblzma supported both the old and new file format.
The gzip-like lzma command-line tool was rewritten to use liblzma.
The new LZMA Utils code base was renamed to XZ Utils when the name
of the new file format had been decided. The liblzma compression
library retained its name though, because changing it would have
caused unnecessary breakage in applications already using the early
liblzma snapshots.
The xz command-line tool can emulate the gzip-like lzma tool by
creating appropriate symlinks (e.g. lzma -> xz). Thus, practically
all scripts using the lzma tool from LZMA Utils will work as is with
XZ Utils (and will keep using the old .lzma format). Still, the .lzma
format is more or less deprecated. XZ Utils will keep supporting it,
but new applications should use the .xz format, and migrating old
applications to .xz is often a good idea too.

173
doc/lzma-file-format.txt Normal file
View File

@ -0,0 +1,173 @@
The .lzma File Format
=====================
0. Preface
0.1. Notices and Acknowledgements
0.2. Changes
1. File Format
1.1. Header
1.1.1. Properties
1.1.2. Dictionary Size
1.1.3. Uncompressed Size
1.2. LZMA Compressed Data
2. References
0. Preface
This document describes the .lzma file format, which is
sometimes also called LZMA_Alone format. It is a legacy file
format, which is being or has been replaced by the .xz format.
The MIME type of the .lzma format is `application/x-lzma'.
The most commonly used software to handle .lzma files are
LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This document
describes some of the differences between these implementations
and gives hints what subset of the .lzma format is the most
portable.
0.1. Notices and Acknowledgements
This file format was designed by Igor Pavlov for use in
LZMA SDK. This document was written by Lasse Collin
<lasse.collin@tukaani.org> using the documentation found
from the LZMA SDK.
This document has been put into the public domain.
0.2. Changes
Last modified: 2022-07-13 21:00+0300
Compared to the previous version (2011-04-12 11:55+0300)
the section 1.1.3 was modified to allow End of Payload Marker
with a known Uncompressed Size.
1. File Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
| Header | LZMA Compressed Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
The .lzma format file consist of 13-byte Header followed by
the LZMA Compressed Data.
Unlike the .gz, .bz2, and .xz formats, it is not possible to
concatenate multiple .lzma files as is and expect the
decompression tool to decode the resulting file as if it were
a single .lzma file.
For example, the command line tools from LZMA Utils and
LZMA SDK silently ignore all the data after the first .lzma
stream. In contrast, the command line tool from XZ Utils
considers the .lzma file to be corrupt if there is data after
the first .lzma stream.
1.1. Header
+------------+----+----+----+----+--+--+--+--+--+--+--+--+
| Properties | Dictionary Size | Uncompressed Size |
+------------+----+----+----+----+--+--+--+--+--+--+--+--+
1.1.1. Properties
The Properties field contains three properties. An abbreviation
is given in parentheses, followed by the value range of the
property. The field consists of
1) the number of literal context bits (lc, [0, 8]);
2) the number of literal position bits (lp, [0, 4]); and
3) the number of position bits (pb, [0, 4]).
The properties are encoded using the following formula:
Properties = (pb * 5 + lp) * 9 + lc
The following C code illustrates a straightforward way to
decode the Properties field:
uint8_t lc, lp, pb;
uint8_t prop = get_lzma_properties();
if (prop > (4 * 5 + 4) * 9 + 8)
return LZMA_PROPERTIES_ERROR;
pb = prop / (9 * 5);
prop -= pb * 9 * 5;
lp = prop / 9;
lc = prop - lp * 9;
XZ Utils has an additional requirement: lc + lp <= 4. Files
which don't follow this requirement cannot be decompressed
with XZ Utils. Usually this isn't a problem since the most
common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb
combination that the files created by LZMA Utils can have,
but LZMA Utils can decompress files with any lc/lp/pb.
1.1.2. Dictionary Size
Dictionary Size is stored as an unsigned 32-bit little endian
integer. Any 32-bit value is possible, but for maximum
portability, only sizes of 2^n and 2^n + 2^(n-1) should be
used.
LZMA Utils creates only files with dictionary size 2^n,
16 <= n <= 25. LZMA Utils can decompress files with any
dictionary size.
XZ Utils creates and decompresses .lzma files only with
dictionary sizes 2^n and 2^n + 2^(n-1). If some other
dictionary size is specified when compressing, the value
stored in the Dictionary Size field is a rounded up, but the
specified value is still used in the actual compression code.
1.1.3. Uncompressed Size
Uncompressed Size is stored as unsigned 64-bit little endian
integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates
that Uncompressed Size is unknown. End of Payload Marker (*)
is used if Uncompressed Size is unknown. End of Payload Marker
is allowed but rarely used if Uncompressed Size is known.
XZ Utils 5.2.5 and older don't support .lzma files that have
End of Payload Marker together with a known Uncompressed Size.
XZ Utils rejects files whose Uncompressed Size field specifies
a known size that is 256 GiB or more. This is to reject false
positives when trying to guess if the input file is in the
.lzma format. When Uncompressed Size is unknown, there is no
limit for the uncompressed size of the file.
(*) Some tools use the term End of Stream (EOS) marker
instead of End of Payload Marker.
1.2. LZMA Compressed Data
Detailed description of the format of this field is out of
scope of this document.
2. References
LZMA SDK - The original LZMA implementation
http://7-zip.org/sdk.html
7-Zip
http://7-zip.org/
LZMA Utils - LZMA adapted to POSIX-like systems
http://tukaani.org/lzma/
XZ Utils - The next generation of LZMA Utils
http://tukaani.org/xz/
The .xz file format - The successor of the .lzma format
http://tukaani.org/xz/xz-file-format.txt

Binary file not shown.

BIN
doc/man/pdf-a4/xz-a4.pdf Normal file

Binary file not shown.

BIN
doc/man/pdf-a4/xzdec-a4.pdf Normal file

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

40
doc/man/txt/lzmainfo.txt Normal file
View File

@ -0,0 +1,40 @@
LZMAINFO(1) XZ Utils LZMAINFO(1)
NAME
lzmainfo - show information stored in the .lzma file header
SYNOPSIS
lzmainfo [--help] [--version] [file...]
DESCRIPTION
lzmainfo shows information stored in the .lzma file header. It reads
the first 13 bytes from the specified file, decodes the header, and
prints it to standard output in human readable format. If no files are
given or file is -, standard input is read.
Usually the most interesting information is the uncompressed size and
the dictionary size. Uncompressed size can be shown only if the file
is in the non-streamed .lzma format variant. The amount of memory re-
quired to decompress the file is a few dozen kilobytes plus the dictio-
nary size.
lzmainfo is included in XZ Utils primarily for backward compatibility
with LZMA Utils.
EXIT STATUS
0 All is good.
1 An error occurred.
BUGS
lzmainfo uses MB while the correct suffix would be MiB (2^20 bytes).
This is to keep the output compatible with LZMA Utils.
SEE ALSO
xz(1)
Tukaani 2013-06-30 LZMAINFO(1)

1589
doc/man/txt/xz.txt Normal file

File diff suppressed because it is too large Load Diff

80
doc/man/txt/xzdec.txt Normal file
View File

@ -0,0 +1,80 @@
XZDEC(1) XZ Utils XZDEC(1)
NAME
xzdec, lzmadec - Small .xz and .lzma decompressors
SYNOPSIS
xzdec [option...] [file...]
lzmadec [option...] [file...]
DESCRIPTION
xzdec is a liblzma-based decompression-only tool for .xz (and only .xz)
files. xzdec is intended to work as a drop-in replacement for xz(1) in
the most common situations where a script has been written to use xz
--decompress --stdout (and possibly a few other commonly used options)
to decompress .xz files. lzmadec is identical to xzdec except that lz-
madec supports .lzma files instead of .xz files.
To reduce the size of the executable, xzdec doesn't support multi-
threading or localization, and doesn't read options from XZ_DEFAULTS
and XZ_OPT environment variables. xzdec doesn't support displaying in-
termediate progress information: sending SIGINFO to xzdec does nothing,
but sending SIGUSR1 terminates the process instead of displaying
progress information.
OPTIONS
-d, --decompress, --uncompress
Ignored for xz(1) compatibility. xzdec supports only decompres-
sion.
-k, --keep
Ignored for xz(1) compatibility. xzdec never creates or removes
any files.
-c, --stdout, --to-stdout
Ignored for xz(1) compatibility. xzdec always writes the decom-
pressed data to standard output.
-q, --quiet
Specifying this once does nothing since xzdec never displays any
warnings or notices. Specify this twice to suppress errors.
-Q, --no-warn
Ignored for xz(1) compatibility. xzdec never uses the exit sta-
tus 2.
-h, --help
Display a help message and exit successfully.
-V, --version
Display the version number of xzdec and liblzma.
EXIT STATUS
0 All was good.
1 An error occurred.
xzdec doesn't have any warning messages like xz(1) has, thus the exit
status 2 is not used by xzdec.
NOTES
Use xz(1) instead of xzdec or lzmadec for normal everyday use. xzdec
or lzmadec are meant only for situations where it is important to have
a smaller decompressor than the full-featured xz(1).
xzdec and lzmadec are not really that small. The size can be reduced
further by dropping features from liblzma at compile time, but that
shouldn't usually be done for executables distributed in typical non-
embedded operating system distributions. If you need a truly small .xz
decompressor, consider using XZ Embedded.
SEE ALSO
xz(1)
XZ Embedded: <https://tukaani.org/xz/embedded.html>
Tukaani 2017-04-19 XZDEC(1)

37
doc/man/txt/xzdiff.txt Normal file
View File

@ -0,0 +1,37 @@
XZDIFF(1) XZ Utils XZDIFF(1)
NAME
xzcmp, xzdiff, lzcmp, lzdiff - compare compressed files
SYNOPSIS
xzcmp [cmp_options] file1 [file2]
xzdiff [diff_options] file1 [file2]
lzcmp [cmp_options] file1 [file2]
lzdiff [diff_options] file1 [file2]
DESCRIPTION
xzcmp and xzdiff invoke cmp(1) or diff(1) on files compressed with
xz(1), lzma(1), gzip(1), bzip2(1), lzop(1), or zstd(1). All options
specified are passed directly to cmp(1) or diff(1). If only one file
is specified, then the files compared are file1 (which must have a suf-
fix of a supported compression format) and file1 from which the com-
pression format suffix has been stripped. If two files are specified,
then they are uncompressed if necessary and fed to cmp(1) or diff(1).
The exit status from cmp(1) or diff(1) is preserved unless a decompres-
sion error occurs; then exit status is 2.
The names lzcmp and lzdiff are provided for backward compatibility with
LZMA Utils.
SEE ALSO
cmp(1), diff(1), xz(1), gzip(1), bzip2(1), lzop(1), zstd(1), zdiff(1)
BUGS
Messages from the cmp(1) or diff(1) programs refer to temporary file-
names instead of those specified.
Tukaani 2021-06-04 XZDIFF(1)

49
doc/man/txt/xzgrep.txt Normal file
View File

@ -0,0 +1,49 @@
XZGREP(1) XZ Utils XZGREP(1)
NAME
xzgrep - search compressed files for a regular expression
SYNOPSIS
xzgrep [grep_options] [-e] pattern [file...]
xzegrep ...
xzfgrep ...
lzgrep ...
lzegrep ...
lzfgrep ...
DESCRIPTION
xzgrep invokes grep(1) on files which may be either uncompressed or
compressed with xz(1), lzma(1), gzip(1), bzip2(1), lzop(1), or zstd(1).
All options specified are passed directly to grep(1).
If no file is specified, then standard input is decompressed if neces-
sary and fed to grep(1). When reading from standard input, gzip(1),
bzip2(1), lzop(1), and zstd(1) compressed files are not supported.
If xzgrep is invoked as xzegrep or xzfgrep then grep -E or grep -F is
used instead of grep(1). The same applies to names lzgrep, lzegrep,
and lzfgrep, which are provided for backward compatibility with LZMA
Utils.
EXIT STATUS
0 At least one match was found from at least one of the input
files. No errors occurred.
1 No matches were found from any of the input files. No errors
occurred.
>1 One or more errors occurred. It is unknown if matches were
found.
ENVIRONMENT
GREP If the GREP environment variable is set, xzgrep uses it instead
of grep(1), grep -E, or grep -F.
SEE ALSO
grep(1), xz(1), gzip(1), bzip2(1), lzop(1), zstd(1), zgrep(1)
Tukaani 2022-07-19 XZGREP(1)

39
doc/man/txt/xzless.txt Normal file
View File

@ -0,0 +1,39 @@
XZLESS(1) XZ Utils XZLESS(1)
NAME
xzless, lzless - view xz or lzma compressed (text) files
SYNOPSIS
xzless [file...]
lzless [file...]
DESCRIPTION
xzless is a filter that displays text from compressed files to a termi-
nal. It works on files compressed with xz(1) or lzma(1). If no files
are given, xzless reads from standard input.
xzless uses less(1) to present its output. Unlike xzmore, its choice
of pager cannot be altered by setting an environment variable. Com-
mands are based on both more(1) and vi(1) and allow back and forth
movement and searching. See the less(1) manual for more information.
The command named lzless is provided for backward compatibility with
LZMA Utils.
ENVIRONMENT
LESSMETACHARS
A list of characters special to the shell. Set by xzless unless
it is already set in the environment.
LESSOPEN
Set to a command line to invoke the xz(1) decompressor for pre-
processing the input files to less(1).
SEE ALSO
less(1), xz(1), xzmore(1), zless(1)
Tukaani 2010-09-27 XZLESS(1)

34
doc/man/txt/xzmore.txt Normal file
View File

@ -0,0 +1,34 @@
XZMORE(1) XZ Utils XZMORE(1)
NAME
xzmore, lzmore - view xz or lzma compressed (text) files
SYNOPSIS
xzmore [file...]
lzmore [file...]
DESCRIPTION
xzmore is a filter which allows examination of xz(1) or lzma(1) com-
pressed text files one screenful at a time on a soft-copy terminal.
To use a pager other than the default more, set environment variable
PAGER to the name of the desired program. The name lzmore is provided
for backward compatibility with LZMA Utils.
e or q When the prompt --More--(Next file: file) is printed, this com-
mand causes xzmore to exit.
s When the prompt --More--(Next file: file) is printed, this com-
mand causes xzmore to skip the next file and continue.
For list of keyboard commands supported while actually viewing the con-
tent of a file, refer to manual of the pager you use, usually more(1).
SEE ALSO
more(1), xz(1), xzless(1), zmore(1)
Tukaani 2013-06-30 XZMORE(1)

1165
doc/xz-file-format.txt Normal file

File diff suppressed because it is too large Load Diff

115
extra/7z2lzma/7z2lzma.bash Normal file
View File

@ -0,0 +1,115 @@
#!/bin/bash
#
#############################################################################
#
# 7z2lzma.bash is very primitive .7z to .lzma converter. The input file must
# have exactly one LZMA compressed stream, which has been created with the
# default lc, lp, and pb values. The CRC32 in the .7z archive is not checked,
# and the script may seem to succeed while it actually created a corrupt .lzma
# file. You should always try uncompressing both the original .7z and the
# created .lzma and compare that the output is identical.
#
# This script requires basic GNU tools and 7z or 7za tool from p7zip.
#
# Last modified: 2009-01-15 14:25+0200
#
#############################################################################
#
# Author: Lasse Collin <lasse.collin@tukaani.org>
#
# This file has been put into the public domain.
# You can do whatever you want with this file.
#
#############################################################################
# You can use 7z or 7za, both will work.
SEVENZIP=7za
if [ $# != 2 -o -z "$1" -o -z "$2" ]; then
echo "Usage: $0 input.7z output.lzma"
exit 1
fi
# Converts an integer variable to little endian binary integer.
int2bin()
{
local LEN=$1
local NUM=$2
local HEX=(0 1 2 3 4 5 6 7 8 9 A B C D E F)
local I
for ((I=0; I < "$LEN"; ++I)); do
printf "\\x${HEX[(NUM >> 4) & 0x0F]}${HEX[NUM & 0x0F]}"
NUM=$((NUM >> 8))
done
}
# Make sure we get possible errors from pipes.
set -o pipefail
# Get information about the input file. At least older 7z and 7za versions
# may return with zero exit status even when an error occurred, so check
# if the output has any lines beginning with "Error".
INFO=$("$SEVENZIP" l -slt "$1")
if [ $? != 0 ] || printf '%s\n' "$INFO" | grep -q ^Error; then
printf '%s\n' "$INFO"
exit 1
fi
# Check if the input file has more than one compressed block.
if printf '%s\n' "$INFO" | grep -q '^Block = 1'; then
echo "Cannot convert, because the input file has more than"
echo "one compressed block."
exit 1
fi
# Get compressed, uncompressed, and dictionary size.
CSIZE=$(printf '%s\n' "$INFO" | sed -rn 's|^Packed Size = ([0-9]+$)|\1|p')
USIZE=$(printf '%s\n' "$INFO" | sed -rn 's|^Size = ([0-9]+$)|\1|p')
DICT=$(printf '%s\n' "$INFO" | sed -rn 's|^Method = LZMA:([0-9]+[bkm]?)$|\1|p')
if [ -z "$CSIZE" -o -z "$USIZE" -o -z "$DICT" ]; then
echo "Parsing output of $SEVENZIP failed. Maybe the file uses some"
echo "other compression method than plain LZMA."
exit 1
fi
# The following assumes that the default lc, lp, and pb settings were used.
# Otherwise the output will be corrupt.
printf '\x5D' > "$2"
# Dictionary size can be either was power of two, bytes, kibibytes, or
# mebibytes. We need to convert it to bytes.
case $DICT in
*b)
DICT=${DICT%b}
;;
*k)
DICT=${DICT%k}
DICT=$((DICT << 10))
;;
*m)
DICT=${DICT%m}
DICT=$((DICT << 20))
;;
*)
DICT=$((1 << DICT))
;;
esac
int2bin 4 "$DICT" >> "$2"
# Uncompressed size
int2bin 8 "$USIZE" >> "$2"
# Copy the actual compressed data. Using multiple dd commands to avoid
# copying large amount of data with one-byte block size, which would be
# annoyingly slow.
BS=8192
BIGSIZE=$((CSIZE / BS))
CSIZE=$((CSIZE % BS))
{
dd of=/dev/null bs=32 count=1 \
&& dd bs="$BS" count="$BIGSIZE" \
&& dd bs=1 count="$CSIZE"
} < "$1" >> "$2"
exit $?

88
extra/scanlzma/scanlzma.c Normal file
View File

@ -0,0 +1,88 @@
/*
scanlzma, scan for lzma compressed data in stdin and echo it to stdout.
Copyright (C) 2006 Timo Lindfors
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
*/
/* Usage example:
$ wget http://www.wifi-shop.cz/Files/produkty/wa2204/wa2204av1.4.1.zip
$ unzip wa2204av1.4.1.zip
$ gcc scanlzma.c -o scanlzma -Wall
$ ./scanlzma 0 < WA2204-FW1.4.1/linux-1.4.bin | lzma -c -d | strings | grep -i "copyright"
UpdateDD version 2.5, Copyright (C) 2005 Philipp Benner.
Copyright (C) 2005 Philipp Benner.
Copyright (C) 2005 Philipp Benner.
mawk 1.3%s%s %s, Copyright (C) Michael D. Brennan
# Copyright (C) 1998, 1999, 2001 Henry Spencer.
...
*/
/* LZMA compressed file format */
/* --------------------------- */
/* Offset Size Description */
/* 0 1 Special LZMA properties for compressed data */
/* 1 4 Dictionary size (little endian) */
/* 5 8 Uncompressed size (little endian). -1 means unknown size */
/* 13 Compressed data */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define BUFSIZE 4096
int find_lzma_header(unsigned char *buf) {
return (buf[0] < 0xE1
&& buf[0] == 0x5d
&& buf[4] < 0x20
&& (memcmp (buf + 10 , "\x00\x00\x00", 3) == 0
|| (memcmp (buf + 5, "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF", 8) == 0)));
}
int main(int argc, char *argv[]) {
unsigned char buf[BUFSIZE];
int ret, i, numlzma, blocks=0;
if (argc != 2) {
printf("usage: %s numlzma < infile | lzma -c -d > outfile\n"
"where numlzma is index of lzma file to extract, starting from zero.\n",
argv[0]);
exit(1);
}
numlzma = atoi(argv[1]);
for (;;) {
/* Read data. */
ret = fread(buf, BUFSIZE, 1, stdin);
if (ret != 1)
break;
/* Scan for signature. */
for (i = 0; i<BUFSIZE-23; i++) {
if (find_lzma_header(buf+i) && numlzma-- <= 0) {
fwrite(buf+i, (BUFSIZE-i), 1, stdout);
for (;;) {
int ch;
ch = getchar();
if (ch == EOF)
exit(0);
putchar(ch);
}
}
}
blocks++;
}
return 1;
}

1197
lib/getopt.c Normal file

File diff suppressed because it is too large Load Diff

226
lib/getopt.in.h Normal file
View File

@ -0,0 +1,226 @@
/* Declarations for getopt.
Copyright (C) 1989-1994,1996-1999,2001,2003,2004,2005,2006,2007
Free Software Foundation, Inc.
This file is part of the GNU C Library.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along
with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
#ifndef _GETOPT_H
#ifndef __need_getopt
# define _GETOPT_H 1
#endif
/* Standalone applications should #define __GETOPT_PREFIX to an
identifier that prefixes the external functions and variables
defined in this header. When this happens, include the
headers that might declare getopt so that they will not cause
confusion if included after this file. Then systematically rename
identifiers so that they do not collide with the system functions
and variables. Renaming avoids problems with some compilers and
linkers. */
#if defined __GETOPT_PREFIX && !defined __need_getopt
# include <stdlib.h>
# include <stdio.h>
# include <unistd.h>
# undef __need_getopt
# undef getopt
# undef getopt_long
# undef getopt_long_only
# undef optarg
# undef opterr
# undef optind
# undef optopt
# define __GETOPT_CONCAT(x, y) x ## y
# define __GETOPT_XCONCAT(x, y) __GETOPT_CONCAT (x, y)
# define __GETOPT_ID(y) __GETOPT_XCONCAT (__GETOPT_PREFIX, y)
# define getopt __GETOPT_ID (getopt)
# define getopt_long __GETOPT_ID (getopt_long)
# define getopt_long_only __GETOPT_ID (getopt_long_only)
# define optarg __GETOPT_ID (optarg)
# define opterr __GETOPT_ID (opterr)
# define optind __GETOPT_ID (optind)
# define optopt __GETOPT_ID (optopt)
#endif
/* Standalone applications get correct prototypes for getopt_long and
getopt_long_only; they declare "char **argv". libc uses prototypes
with "char *const *argv" that are incorrect because getopt_long and
getopt_long_only can permute argv; this is required for backward
compatibility (e.g., for LSB 2.0.1).
This used to be `#if defined __GETOPT_PREFIX && !defined __need_getopt',
but it caused redefinition warnings if both unistd.h and getopt.h were
included, since unistd.h includes getopt.h having previously defined
__need_getopt.
The only place where __getopt_argv_const is used is in definitions
of getopt_long and getopt_long_only below, but these are visible
only if __need_getopt is not defined, so it is quite safe to rewrite
the conditional as follows:
*/
#if !defined __need_getopt
# if defined __GETOPT_PREFIX
# define __getopt_argv_const /* empty */
# else
# define __getopt_argv_const const
# endif
#endif
/* If __GNU_LIBRARY__ is not already defined, either we are being used
standalone, or this is the first header included in the source file.
If we are being used with glibc, we need to include <features.h>, but
that does not exist if we are standalone. So: if __GNU_LIBRARY__ is
not defined, include <ctype.h>, which will pull in <features.h> for us
if it's from glibc. (Why ctype.h? It's guaranteed to exist and it
doesn't flood the namespace with stuff the way some other headers do.) */
#if !defined __GNU_LIBRARY__
# include <ctype.h>
#endif
#ifndef __THROW
# ifndef __GNUC_PREREQ
# define __GNUC_PREREQ(maj, min) (0)
# endif
# if defined __cplusplus && __GNUC_PREREQ (2,8)
# define __THROW throw ()
# else
# define __THROW
# endif
#endif
#ifdef __cplusplus
extern "C" {
#endif
/* For communication from `getopt' to the caller.
When `getopt' finds an option that takes an argument,
the argument value is returned here.
Also, when `ordering' is RETURN_IN_ORDER,
each non-option ARGV-element is returned here. */
extern char *optarg;
/* Index in ARGV of the next element to be scanned.
This is used for communication to and from the caller
and for communication between successive calls to `getopt'.
On entry to `getopt', zero means this is the first call; initialize.
When `getopt' returns -1, this is the index of the first of the
non-option elements that the caller should itself scan.
Otherwise, `optind' communicates from one call to the next
how much of ARGV has been scanned so far. */
extern int optind;
/* Callers store zero here to inhibit the error message `getopt' prints
for unrecognized options. */
extern int opterr;
/* Set to an option character which was unrecognized. */
extern int optopt;
#ifndef __need_getopt
/* Describe the long-named options requested by the application.
The LONG_OPTIONS argument to getopt_long or getopt_long_only is a vector
of `struct option' terminated by an element containing a name which is
zero.
The field `has_arg' is:
no_argument (or 0) if the option does not take an argument,
required_argument (or 1) if the option requires an argument,
optional_argument (or 2) if the option takes an optional argument.
If the field `flag' is not NULL, it points to a variable that is set
to the value given in the field `val' when the option is found, but
left unchanged if the option is not found.
To have a long-named option do something other than set an `int' to
a compiled-in constant, such as set a value from `optarg', set the
option's `flag' field to zero and its `val' field to a nonzero
value (the equivalent single-letter option character, if there is
one). For long options that have a zero `flag' field, `getopt'
returns the contents of the `val' field. */
struct option
{
const char *name;
/* has_arg can't be an enum because some compilers complain about
type mismatches in all the code that assumes it is an int. */
int has_arg;
int *flag;
int val;
};
/* Names for the values of the `has_arg' field of `struct option'. */
# define no_argument 0
# define required_argument 1
# define optional_argument 2
#endif /* need getopt */
/* Get definitions and prototypes for functions to process the
arguments in ARGV (ARGC of them, minus the program name) for
options given in OPTS.
Return the option character from OPTS just read. Return -1 when
there are no more options. For unrecognized options, or options
missing arguments, `optopt' is set to the option letter, and '?' is
returned.
The OPTS string is a list of characters which are recognized option
letters, optionally followed by colons, specifying that that letter
takes an argument, to be placed in `optarg'.
If a letter in OPTS is followed by two colons, its argument is
optional. This behavior is specific to the GNU `getopt'.
The argument `--' causes premature termination of argument
scanning, explicitly telling `getopt' that there are no more
options.
If OPTS begins with `-', then non-option arguments are treated as
arguments to the option '\1'. This behavior is specific to the GNU
`getopt'. If OPTS begins with `+', or POSIXLY_CORRECT is set in
the environment, then do not permute arguments. */
extern int getopt (int ___argc, char *const *___argv, const char *__shortopts)
__THROW;
#ifndef __need_getopt
extern int getopt_long (int ___argc, char *__getopt_argv_const *___argv,
const char *__shortopts,
const struct option *__longopts, int *__longind)
__THROW;
extern int getopt_long_only (int ___argc, char *__getopt_argv_const *___argv,
const char *__shortopts,
const struct option *__longopts, int *__longind)
__THROW;
#endif
#ifdef __cplusplus
}
#endif
/* Make sure we later can get all the definitions and declarations. */
#undef __need_getopt
#endif /* getopt.h */

171
lib/getopt1.c Normal file
View File

@ -0,0 +1,171 @@
/* getopt_long and getopt_long_only entry points for GNU getopt.
Copyright (C) 1987,88,89,90,91,92,93,94,96,97,98,2004,2006
Free Software Foundation, Inc.
This file is part of the GNU C Library.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along
with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
#ifdef _LIBC
# include <getopt.h>
#else
# include <config.h>
# include "getopt.h"
#endif
#include "getopt_int.h"
#include <stdio.h>
/* This needs to come after some library #include
to get __GNU_LIBRARY__ defined. */
#ifdef __GNU_LIBRARY__
#include <stdlib.h>
#endif
#ifndef NULL
#define NULL 0
#endif
int
getopt_long (int argc, char *__getopt_argv_const *argv, const char *options,
const struct option *long_options, int *opt_index)
{
return _getopt_internal (argc, (char **) argv, options, long_options,
opt_index, 0, 0);
}
int
_getopt_long_r (int argc, char **argv, const char *options,
const struct option *long_options, int *opt_index,
struct _getopt_data *d)
{
return _getopt_internal_r (argc, argv, options, long_options, opt_index,
0, 0, d);
}
/* Like getopt_long, but '-' as well as '--' can indicate a long option.
If an option that starts with '-' (not '--') doesn't match a long option,
but does match a short option, it is parsed as a short option
instead. */
int
getopt_long_only (int argc, char *__getopt_argv_const *argv,
const char *options,
const struct option *long_options, int *opt_index)
{
return _getopt_internal (argc, (char **) argv, options, long_options,
opt_index, 1, 0);
}
int
_getopt_long_only_r (int argc, char **argv, const char *options,
const struct option *long_options, int *opt_index,
struct _getopt_data *d)
{
return _getopt_internal_r (argc, argv, options, long_options, opt_index,
1, 0, d);
}
#ifdef TEST
#include <stdio.h>
int
main (int argc, char **argv)
{
int c;
int digit_optind = 0;
while (1)
{
int this_option_optind = optind ? optind : 1;
int option_index = 0;
static struct option long_options[] =
{
{"add", 1, 0, 0},
{"append", 0, 0, 0},
{"delete", 1, 0, 0},
{"verbose", 0, 0, 0},
{"create", 0, 0, 0},
{"file", 1, 0, 0},
{0, 0, 0, 0}
};
c = getopt_long (argc, argv, "abc:d:0123456789",
long_options, &option_index);
if (c == -1)
break;
switch (c)
{
case 0:
printf ("option %s", long_options[option_index].name);
if (optarg)
printf (" with arg %s", optarg);
printf ("\n");
break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
if (digit_optind != 0 && digit_optind != this_option_optind)
printf ("digits occur in two different argv-elements.\n");
digit_optind = this_option_optind;
printf ("option %c\n", c);
break;
case 'a':
printf ("option a\n");
break;
case 'b':
printf ("option b\n");
break;
case 'c':
printf ("option c with value `%s'\n", optarg);
break;
case 'd':
printf ("option d with value `%s'\n", optarg);
break;
case '?':
break;
default:
printf ("?? getopt returned character code 0%o ??\n", c);
}
}
if (optind < argc)
{
printf ("non-option ARGV-elements: ");
while (optind < argc)
printf ("%s ", argv[optind++]);
printf ("\n");
}
exit (0);
}
#endif /* TEST */

131
lib/getopt_int.h Normal file
View File

@ -0,0 +1,131 @@
/* Internal declarations for getopt.
Copyright (C) 1989-1994,1996-1999,2001,2003,2004
Free Software Foundation, Inc.
This file is part of the GNU C Library.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
#ifndef _GETOPT_INT_H
#define _GETOPT_INT_H 1
extern int _getopt_internal (int ___argc, char **___argv,
const char *__shortopts,
const struct option *__longopts, int *__longind,
int __long_only, int __posixly_correct);
/* Reentrant versions which can handle parsing multiple argument
vectors at the same time. */
/* Data type for reentrant functions. */
struct _getopt_data
{
/* These have exactly the same meaning as the corresponding global
variables, except that they are used for the reentrant
versions of getopt. */
int optind;
int opterr;
int optopt;
char *optarg;
/* Internal members. */
/* True if the internal members have been initialized. */
int __initialized;
/* The next char to be scanned in the option-element
in which the last option character we returned was found.
This allows us to pick up the scan where we left off.
If this is zero, or a null string, it means resume the scan
by advancing to the next ARGV-element. */
char *__nextchar;
/* Describe how to deal with options that follow non-option ARGV-elements.
If the caller did not specify anything,
the default is REQUIRE_ORDER if the environment variable
POSIXLY_CORRECT is defined, PERMUTE otherwise.
REQUIRE_ORDER means don't recognize them as options;
stop option processing when the first non-option is seen.
This is what Unix does.
This mode of operation is selected by either setting the environment
variable POSIXLY_CORRECT, or using `+' as the first character
of the list of option characters, or by calling getopt.
PERMUTE is the default. We permute the contents of ARGV as we
scan, so that eventually all the non-options are at the end.
This allows options to be given in any order, even with programs
that were not written to expect this.
RETURN_IN_ORDER is an option available to programs that were
written to expect options and other ARGV-elements in any order
and that care about the ordering of the two. We describe each
non-option ARGV-element as if it were the argument of an option
with character code 1. Using `-' as the first character of the
list of option characters selects this mode of operation.
The special argument `--' forces an end of option-scanning regardless
of the value of `ordering'. In the case of RETURN_IN_ORDER, only
`--' can cause `getopt' to return -1 with `optind' != ARGC. */
enum
{
REQUIRE_ORDER, PERMUTE, RETURN_IN_ORDER
} __ordering;
/* If the POSIXLY_CORRECT environment variable is set
or getopt was called. */
int __posixly_correct;
/* Handle permutation of arguments. */
/* Describe the part of ARGV that contains non-options that have
been skipped. `first_nonopt' is the index in ARGV of the first
of them; `last_nonopt' is the index after the last of them. */
int __first_nonopt;
int __last_nonopt;
#if defined _LIBC && defined USE_NONOPTION_FLAGS
int __nonoption_flags_max_len;
int __nonoption_flags_len;
# endif
};
/* The initializer is necessary to set OPTIND and OPTERR to their
default values and to clear the initialization flag. */
#define _GETOPT_DATA_INITIALIZER { 1, 1 }
extern int _getopt_internal_r (int ___argc, char **___argv,
const char *__shortopts,
const struct option *__longopts, int *__longind,
int __long_only, int __posixly_correct,
struct _getopt_data *__data);
extern int _getopt_long_r (int ___argc, char **___argv,
const char *__shortopts,
const struct option *__longopts, int *__longind,
struct _getopt_data *__data);
extern int _getopt_long_only_r (int ___argc, char **___argv,
const char *__shortopts,
const struct option *__longopts,
int *__longind,
struct _getopt_data *__data);
#endif /* getopt_int.h */

23
po/LINGUAS Normal file
View File

@ -0,0 +1,23 @@
ca
cs
da
de
eo
es
fi
fr
hr
hu
it
ko
pl
pt
pt_BR
ro
sr
sv
tr
uk
vi
zh_CN
zh_TW

46
po/Makevars Normal file
View File

@ -0,0 +1,46 @@
# Makefile variables for PO directory in any package using GNU gettext.
# Usually the message domain is the same as the package name.
DOMAIN = $(PACKAGE)
# These two variables depend on the location of this directory.
subdir = po
top_builddir = ..
# These options get passed to xgettext.
XGETTEXT_OPTIONS = --keyword=_ --keyword=N_
# This is the copyright holder that gets inserted into the header of the
# $(DOMAIN).pot file. Set this to the copyright holder of the surrounding
# package. (Note that the msgstr strings, extracted from the package's
# sources, belong to the copyright holder of the package.) Translators are
# expected to transfer the copyright for their translations to this person
# or entity, or to disclaim their copyright. The empty string stands for
# the public domain; in this case the translators are expected to disclaim
# their copyright.
COPYRIGHT_HOLDER =
# This is the email address or URL to which the translators shall report
# bugs in the untranslated strings:
# - Strings which are not entire sentences, see the maintainer guidelines
# in the GNU gettext documentation, section 'Preparing Strings'.
# - Strings which use unclear terms or require additional context to be
# understood.
# - Strings which make invalid assumptions about notation of date, time or
# money.
# - Pluralisation problems.
# - Incorrect English spelling.
# - Incorrect formatting.
# It can be your email address, or a mailing list address where translators
# can write to without being subscribed, or the URL of a web page through
# which the translators can contact you.
MSGID_BUGS_ADDRESS =
# This is the list of locale categories, beyond LC_MESSAGES, for which the
# message catalogs shall be used. It is usually empty.
EXTRA_LOCALE_CATEGORIES =
# Although you may need slightly wider terminal than 80 chars, it is
# much nicer to edit the output of --help when this is set.
XGETTEXT_OPTIONS += --no-wrap
MSGMERGE += --no-wrap

13
po/POTFILES.in Normal file
View File

@ -0,0 +1,13 @@
# List of source files which contain translatable strings.
src/xz/args.c
src/xz/coder.c
src/xz/file_io.c
src/xz/hardware.c
src/xz/list.c
src/xz/main.c
src/xz/message.c
src/xz/options.c
src/xz/signals.c
src/xz/suffix.c
src/xz/util.c
src/common/tuklib_exit.c

62
po/Rules-quot Normal file
View File

@ -0,0 +1,62 @@
# Special Makefile rules for English message catalogs with quotation marks.
#
# Copyright (C) 2001-2017 Free Software Foundation, Inc.
# This file, Rules-quot, and its auxiliary files (listed under
# DISTFILES.common.extra1) are free software; the Free Software Foundation
# gives unlimited permission to use, copy, distribute, and modify them.
DISTFILES.common.extra1 = quot.sed boldquot.sed en@quot.header en@boldquot.header insert-header.sin Rules-quot
.SUFFIXES: .insert-header .po-update-en
en@quot.po-create:
$(MAKE) en@quot.po-update
en@boldquot.po-create:
$(MAKE) en@boldquot.po-update
en@quot.po-update: en@quot.po-update-en
en@boldquot.po-update: en@boldquot.po-update-en
.insert-header.po-update-en:
@lang=`echo $@ | sed -e 's/\.po-update-en$$//'`; \
if test "$(PACKAGE)" = "gettext-tools" && test "$(CROSS_COMPILING)" != "yes"; then PATH=`pwd`/../src:$$PATH; GETTEXTLIBDIR=`cd $(top_srcdir)/src && pwd`; export GETTEXTLIBDIR; fi; \
tmpdir=`pwd`; \
echo "$$lang:"; \
ll=`echo $$lang | sed -e 's/@.*//'`; \
LC_ALL=C; export LC_ALL; \
cd $(srcdir); \
if $(MSGINIT) $(MSGINIT_OPTIONS) -i $(DOMAIN).pot --no-translator -l $$lang -o - 2>/dev/null \
| $(SED) -f $$tmpdir/$$lang.insert-header | $(MSGCONV) -t UTF-8 | \
{ case `$(MSGFILTER) --version | sed 1q | sed -e 's,^[^0-9]*,,'` in \
'' | 0.[0-9] | 0.[0-9].* | 0.1[0-8] | 0.1[0-8].*) \
$(MSGFILTER) $(SED) -f `echo $$lang | sed -e 's/.*@//'`.sed \
;; \
*) \
$(MSGFILTER) `echo $$lang | sed -e 's/.*@//'` \
;; \
esac } 2>/dev/null > $$tmpdir/$$lang.new.po \
; then \
if cmp $$lang.po $$tmpdir/$$lang.new.po >/dev/null 2>&1; then \
rm -f $$tmpdir/$$lang.new.po; \
else \
if mv -f $$tmpdir/$$lang.new.po $$lang.po; then \
:; \
else \
echo "creation of $$lang.po failed: cannot move $$tmpdir/$$lang.new.po to $$lang.po" 1>&2; \
exit 1; \
fi; \
fi; \
else \
echo "creation of $$lang.po failed!" 1>&2; \
rm -f $$tmpdir/$$lang.new.po; \
fi
en@quot.insert-header: insert-header.sin
sed -e '/^#/d' -e 's/HEADER/en@quot.header/g' $(srcdir)/insert-header.sin > en@quot.insert-header
en@boldquot.insert-header: insert-header.sin
sed -e '/^#/d' -e 's/HEADER/en@boldquot.header/g' $(srcdir)/insert-header.sin > en@boldquot.insert-header
mostlyclean: mostlyclean-quot
mostlyclean-quot:
rm -f *.insert-header

10
po/boldquot.sed Normal file
View File

@ -0,0 +1,10 @@
s/"\([^"]*\)"/“\1”/g
s/`\([^`']*\)'/\1/g
s/ '\([^`']*\)' / \1 /g
s/ '\([^`']*\)'$/ \1/g
s/^'\([^`']*\)' /\1 /g
s/“”/""/g
s///g
s//”/g
s///g
s///g

BIN
po/ca.gmo Normal file

Binary file not shown.

1031
po/ca.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/cs.gmo Normal file

Binary file not shown.

1138
po/cs.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/da.gmo Normal file

Binary file not shown.

1008
po/da.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/de.gmo Normal file

Binary file not shown.

1055
po/de.po Normal file

File diff suppressed because it is too large Load Diff

25
po/en@boldquot.header Normal file
View File

@ -0,0 +1,25 @@
# All this catalog "translates" are quotation characters.
# The msgids must be ASCII and therefore cannot contain real quotation
# characters, only substitutes like grave accent (0x60), apostrophe (0x27)
# and double quote (0x22). These substitutes look strange; see
# https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
#
# This catalog translates grave accent (0x60) and apostrophe (0x27) to
# left single quotation mark (U+2018) and right single quotation mark (U+2019).
# It also translates pairs of apostrophe (0x27) to
# left single quotation mark (U+2018) and right single quotation mark (U+2019)
# and pairs of quotation mark (0x22) to
# left double quotation mark (U+201C) and right double quotation mark (U+201D).
#
# When output to an UTF-8 terminal, the quotation characters appear perfectly.
# When output to an ISO-8859-1 terminal, the single quotation marks are
# transliterated to apostrophes (by iconv in glibc 2.2 or newer) or to
# grave/acute accent (by libiconv), and the double quotation marks are
# transliterated to 0x22.
# When output to an ASCII terminal, the single quotation marks are
# transliterated to apostrophes, and the double quotation marks are
# transliterated to 0x22.
#
# This catalog furthermore displays the text between the quotation marks in
# bold face, assuming the VT100/XTerm escape sequences.
#

22
po/en@quot.header Normal file
View File

@ -0,0 +1,22 @@
# All this catalog "translates" are quotation characters.
# The msgids must be ASCII and therefore cannot contain real quotation
# characters, only substitutes like grave accent (0x60), apostrophe (0x27)
# and double quote (0x22). These substitutes look strange; see
# https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
#
# This catalog translates grave accent (0x60) and apostrophe (0x27) to
# left single quotation mark (U+2018) and right single quotation mark (U+2019).
# It also translates pairs of apostrophe (0x27) to
# left single quotation mark (U+2018) and right single quotation mark (U+2019)
# and pairs of quotation mark (0x22) to
# left double quotation mark (U+201C) and right double quotation mark (U+201D).
#
# When output to an UTF-8 terminal, the quotation characters appear perfectly.
# When output to an ISO-8859-1 terminal, the single quotation marks are
# transliterated to apostrophes (by iconv in glibc 2.2 or newer) or to
# grave/acute accent (by libiconv), and the double quotation marks are
# transliterated to 0x22.
# When output to an ASCII terminal, the single quotation marks are
# transliterated to apostrophes, and the double quotation marks are
# transliterated to 0x22.
#

BIN
po/eo.gmo Normal file

Binary file not shown.

1028
po/eo.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/es.gmo Normal file

Binary file not shown.

1073
po/es.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/fi.gmo Normal file

Binary file not shown.

1059
po/fi.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/fr.gmo Normal file

Binary file not shown.

1135
po/fr.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/hr.gmo Normal file

Binary file not shown.

1069
po/hr.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/hu.gmo Normal file

Binary file not shown.

1073
po/hu.po Normal file

File diff suppressed because it is too large Load Diff

28
po/insert-header.sin Normal file
View File

@ -0,0 +1,28 @@
# Sed script that inserts the file called HEADER before the header entry.
#
# Copyright (C) 2001 Free Software Foundation, Inc.
# Written by Bruno Haible <bruno@clisp.org>, 2001.
# This file is free software; the Free Software Foundation gives
# unlimited permission to use, copy, distribute, and modify it.
#
# At each occurrence of a line starting with "msgid ", we execute the following
# commands. At the first occurrence, insert the file. At the following
# occurrences, do nothing. The distinction between the first and the following
# occurrences is achieved by looking at the hold space.
/^msgid /{
x
# Test if the hold space is empty.
s/m/m/
ta
# Yes it was empty. First occurrence. Read the file.
r HEADER
# Output the file's contents by reading the next line. But don't lose the
# current line while doing this.
g
N
bb
:a
# The hold space was nonempty. Following occurrences. Do nothing.
x
:b
}

BIN
po/it.gmo Normal file

Binary file not shown.

1125
po/it.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/ko.gmo Normal file

Binary file not shown.

1061
po/ko.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/pl.gmo Normal file

Binary file not shown.

1027
po/pl.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/pt.gmo Normal file

Binary file not shown.

1138
po/pt.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/pt_BR.gmo Normal file

Binary file not shown.

1136
po/pt_BR.po Normal file

File diff suppressed because it is too large Load Diff

6
po/quot.sed Normal file
View File

@ -0,0 +1,6 @@
s/"\([^"]*\)"/“\1”/g
s/`\([^`']*\)'/\1/g
s/ '\([^`']*\)' / \1 /g
s/ '\([^`']*\)'$/ \1/g
s/^'\([^`']*\)' /\1 /g
s/“”/""/g

25
po/remove-potcdate.sin Normal file
View File

@ -0,0 +1,25 @@
# Sed script that removes the POT-Creation-Date line in the header entry
# from a POT file.
#
# Copyright (C) 2002 Free Software Foundation, Inc.
# Copying and distribution of this file, with or without modification,
# are permitted in any medium without royalty provided the copyright
# notice and this notice are preserved. This file is offered as-is,
# without any warranty.
#
# The distinction between the first and the following occurrences of the
# pattern is achieved by looking at the hold space.
/^"POT-Creation-Date: .*"$/{
x
# Test if the hold space is empty.
s/P/P/
ta
# Yes it was empty. First occurrence. Remove the line.
g
d
bb
:a
# The hold space was nonempty. Following occurrences. Do nothing.
x
:b
}

BIN
po/ro.gmo Normal file

Binary file not shown.

1125
po/ro.po Normal file

File diff suppressed because it is too large Load Diff

BIN
po/sr.gmo Normal file

Binary file not shown.

Some files were not shown because too many files have changed in this diff Show More