bzip2-0.9.0c
parent 1eb67a9d8f
commit 977101ad5f
47
ALGORITHMS
@ -1,47 +0,0 @@
Bzip2 is not research work, in the sense that it doesn't present any
new ideas. Rather, it's an engineering exercise based on existing
ideas.

Four documents describe essentially all the ideas behind bzip2:

Michael Burrows and D. J. Wheeler:
"A block-sorting lossless data compression algorithm"
10th May 1994.
Digital SRC Research Report 124.
ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz

Daniel S. Hirschberg and Debra A. Lelewer
"Efficient Decoding of Prefix Codes"
Communications of the ACM, April 1990, Vol 33, Number 4.
You might be able to get an electronic copy of this
from the ACM Digital Library.

David J. Wheeler
Program bred3.c and accompanying document bred3.ps.
This contains the idea behind the multi-table Huffman
coding scheme.
ftp://ftp.cl.cam.ac.uk/pub/user/djw3/

Jon L. Bentley and Robert Sedgewick
"Fast Algorithms for Sorting and Searching Strings"
Available from Sedgewick's web page,
www.cs.princeton.edu/~rs

The following paper gives valuable additional insights into the
algorithm, but is not immediately the basis of any code
used in bzip2.

Peter Fenwick:
Block Sorting Text Compression
Proceedings of the 19th Australasian Computer Science Conference,
Melbourne, Australia. Jan 31 - Feb 2, 1996.
ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps

All of these are well written, and make fascinating reading. If you want
to modify bzip2 in any non-trivial way, I strongly suggest you obtain,
read and understand these papers.

I am much indebted to the various authors for their help, support and
advice.
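For readers new to the block-sorting idea in the Burrows and Wheeler report listed in ALGORITHMS, the transform can be sketched in a few lines of C. This is only an illustration and is not taken from bzip2: it sorts every rotation of the block with the standard qsort(), which is far slower than the string-sorting techniques the papers describe. The names bwt_naive, cmp_rotations, g_block and g_len are hypothetical and appear nowhere in the bzip2 sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Block currently being sorted. File-scope because a qsort() comparator
   receives no context argument, so this sketch is not reentrant. */
static const unsigned char *g_block;
static size_t g_len;

/* Compare two rotations of g_block, each identified by its start offset. */
static int cmp_rotations(const void *a, const void *b)
{
    size_t i = *(const size_t *)a;
    size_t j = *(const size_t *)b;
    for (size_t k = 0; k < g_len; k++) {
        unsigned char x = g_block[(i + k) % g_len];
        unsigned char y = g_block[(j + k) % g_len];
        if (x != y)
            return x < y ? -1 : 1;
    }
    return 0; /* identical rotations (periodic input) */
}

/*
 * Naive Burrows-Wheeler transform (hypothetical helper, not bzip2 code).
 * Writes the last column of the sorted rotation matrix of in[0..n-1]
 * to out[0..n-1] and returns the row at which the unrotated input
 * appears, or (size_t)-1 if memory could not be allocated.
 */
size_t bwt_naive(const unsigned char *in, size_t n, unsigned char *out)
{
    assert(in != NULL && out != NULL && n > 0);

    size_t *rot = malloc(n * sizeof *rot);
    if (rot == NULL)
        return (size_t)-1;
    for (size_t i = 0; i < n; i++)
        rot[i] = i;

    g_block = in;
    g_len = n;
    qsort(rot, n, sizeof *rot, cmp_rotations);

    size_t primary = 0;
    for (size_t i = 0; i < n; i++) {
        /* The last character of a rotation starting at rot[i]
           is the character just before that offset. */
        out[i] = in[(rot[i] + n - 1) % n];
        if (rot[i] == 0)
            primary = i;
    }
    free(rot);
    return primary;
}
```

Calling bwt_naive on the six bytes of "banana" groups equal characters together in the output, which is what makes the later coding stages effective. The returned row index is the extra piece of information a decoder needs to invert the transform.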
45
CHANGES
Normal file
@ -0,0 +1,45 @@
0.9.0
~~~~~
First version.


0.9.0a
~~~~~~
Removed 'ranlib' from Makefile, since most modern Unix-es
don't need it, or even know about it.


0.9.0b
~~~~~~
Fixed a problem with error reporting in bzip2.c. This does not affect
the library in any way. Problem is: versions 0.9.0 and 0.9.0a (of the
program proper) compress and decompress correctly, but give misleading
error messages (internal panics) when an I/O error occurs, instead of
reporting the problem correctly. This shouldn't give any data loss
(as far as I can see), but is confusing.

Made the inline declarations disappear for non-GCC compilers.


0.9.0c
~~~~~~
Fixed some problems in the library pertaining to some boundary cases.
This makes the library behave more correctly in those situations. The
fixes apply only to features (calls and parameters) not used by
bzip2.c, so the non-fixedness of them in previous versions has no
effect on reliability of bzip2.c.

In bzlib.c:
* made zero-length BZ_FLUSH work correctly in bzCompress().
* fixed bzWrite/bzRead to ignore zero-length requests.
* fixed bzRead to correctly handle read requests after EOF.
* wrong parameter order in call to bzDecompressInit in
  bzBuffToBuffDecompress. Fixed.

In compress.c:
* changed setting of nGroups in sendMTFValues() so as to
  do a bit better on small files. This _does_ affect
  bzip2.c.
360
LICENSE
@ -1,339 +1,39 @@
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
|
||||
675 Mass Ave, Cambridge, MA 02139, USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
This program, "bzip2" and associated library "libbzip2", are
|
||||
copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Preamble
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
License is intended to guarantee your freedom to share and change free
|
||||
software--to make sure the software is free for all its users. This
|
||||
General Public License applies to most of the Free Software
|
||||
Foundation's software and to any other program whose authors commit to
|
||||
using it. (Some other Free Software Foundation software is covered by
|
||||
the GNU Library General Public License instead.) You can apply it to
|
||||
your programs, too.
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
When we speak of free software, we are referring to freedom, not
|
||||
price. Our General Public Licenses are designed to make sure that you
|
||||
have the freedom to distribute copies of free software (and charge for
|
||||
this service if you wish), that you receive source code or can get it
|
||||
if you want it, that you can change the software or use pieces of it
|
||||
in new free programs; and that you know you can do these things.
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
anyone to deny you these rights or to ask you to surrender the rights.
|
||||
These restrictions translate to certain responsibilities for you if you
|
||||
distribute copies of the software, or if you modify it.
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
For example, if you distribute copies of such a program, whether
|
||||
gratis or for a fee, you must give the recipients all the rights that
|
||||
you have. You must make sure that they, too, receive or can get the
|
||||
source code. And you must show them these terms so they know their
|
||||
rights.
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
We protect your rights with two steps: (1) copyright the software, and
|
||||
(2) offer you this license which gives you legal permission to copy,
|
||||
distribute and/or modify the software.
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Also, for each author's protection and ours, we want to make certain
|
||||
that everyone understands that there is no warranty for this free
|
||||
software. If the software is modified by someone else and passed on, we
|
||||
want its recipients to know that what they have is not the original, so
|
||||
that any problems introduced by others will not reflect on the original
|
||||
authors' reputations.
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
Finally, any free program is threatened constantly by software
|
||||
patents. We wish to avoid the danger that redistributors of a free
|
||||
program will individually obtain patent licenses, in effect making the
|
||||
program proprietary. To prevent this, we have made it clear that any
|
||||
patent must be licensed for everyone's free use or not licensed at all.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow.
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License applies to any program or other work which contains
|
||||
a notice placed by the copyright holder saying it may be distributed
|
||||
under the terms of this General Public License. The "Program", below,
|
||||
refers to any such program or work, and a "work based on the Program"
|
||||
means either the Program or any derivative work under copyright law:
|
||||
that is to say, a work containing the Program or a portion of it,
|
||||
either verbatim or with modifications and/or translated into another
|
||||
language. (Hereinafter, translation is included without limitation in
|
||||
the term "modification".) Each licensee is addressed as "you".
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running the Program is not restricted, and the output from the Program
|
||||
is covered only if its contents constitute a work based on the
|
||||
Program (independent of having been made by running the Program).
|
||||
Whether that is true depends on what the Program does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Program's
|
||||
source code as you receive it, in any medium, provided that you
|
||||
conspicuously and appropriately publish on each copy an appropriate
|
||||
copyright notice and disclaimer of warranty; keep intact all the
|
||||
notices that refer to this License and to the absence of any warranty;
|
||||
and give any other recipients of the Program a copy of this License
|
||||
along with the Program.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy, and
|
||||
you may at your option offer warranty protection in exchange for a fee.
|
||||
|
||||
2. You may modify your copy or copies of the Program or any portion
|
||||
of it, thus forming a work based on the Program, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) You must cause the modified files to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
b) You must cause any work that you distribute or publish, that in
|
||||
whole or in part contains or is derived from the Program or any
|
||||
part thereof, to be licensed as a whole at no charge to all third
|
||||
parties under the terms of this License.
|
||||
|
||||
c) If the modified program normally reads commands interactively
|
||||
when run, you must cause it, when started running for such
|
||||
interactive use in the most ordinary way, to print or display an
|
||||
announcement including an appropriate copyright notice and a
|
||||
notice that there is no warranty (or else, saying that you provide
|
||||
a warranty) and that users may redistribute the program under
|
||||
these conditions, and telling the user how to view a copy of this
|
||||
License. (Exception: if the Program itself is interactive but
|
||||
does not normally print such an announcement, your work based on
|
||||
the Program is not required to print an announcement.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Program,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Program, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Program.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Program
|
||||
with the Program (or with a work based on the Program) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may copy and distribute the Program (or a work based on it,
|
||||
under Section 2) in object code or executable form under the terms of
|
||||
Sections 1 and 2 above provided that you also do one of the following:
|
||||
|
||||
a) Accompany it with the complete corresponding machine-readable
|
||||
source code, which must be distributed under the terms of Sections
|
||||
1 and 2 above on a medium customarily used for software interchange; or,
|
||||
|
||||
b) Accompany it with a written offer, valid for at least three
|
||||
years, to give any third party, for a charge no more than your
|
||||
cost of physically performing source distribution, a complete
|
||||
machine-readable copy of the corresponding source code, to be
|
||||
distributed under the terms of Sections 1 and 2 above on a medium
|
||||
customarily used for software interchange; or,
|
||||
|
||||
c) Accompany it with the information you received as to the offer
|
||||
to distribute corresponding source code. (This alternative is
|
||||
allowed only for noncommercial distribution and only if you
|
||||
received the program in object code or executable form with such
|
||||
an offer, in accord with Subsection b above.)
|
||||
|
||||
The source code for a work means the preferred form of the work for
|
||||
making modifications to it. For an executable work, complete source
|
||||
code means all the source code for all modules it contains, plus any
|
||||
associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable. However, as a
|
||||
special exception, the source code distributed need not include
|
||||
anything that is normally distributed (in either source or binary
|
||||
form) with the major components (compiler, kernel, and so on) of the
|
||||
operating system on which the executable runs, unless that component
|
||||
itself accompanies the executable.
|
||||
|
||||
If distribution of executable or object code is made by offering
|
||||
access to copy from a designated place, then offering equivalent
|
||||
access to copy the source code from the same place counts as
|
||||
distribution of the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
4. You may not copy, modify, sublicense, or distribute the Program
|
||||
except as expressly provided under this License. Any attempt
|
||||
otherwise to copy, modify, sublicense or distribute the Program is
|
||||
void, and will automatically terminate your rights under this License.
|
||||
However, parties who have received copies, or rights, from you under
|
||||
this License will not have their licenses terminated so long as such
|
||||
parties remain in full compliance.
|
||||
|
||||
5. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Program or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Program (or any work based on the
|
||||
Program), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Program or works based on it.
|
||||
|
||||
6. Each time you redistribute the Program (or any work based on the
|
||||
Program), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute or modify the Program subject to
|
||||
these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties to
|
||||
this License.
|
||||
|
||||
7. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Program at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Program by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Program.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under
|
||||
any particular circumstance, the balance of the section is intended to
|
||||
apply and the section as a whole is intended to apply in other
|
||||
circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system, which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
8. If the distribution and/or use of the Program is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Program under this License
|
||||
may add an explicit geographical distribution limitation excluding
|
||||
those countries, so that distribution is permitted only in or among
|
||||
countries not thus excluded. In such case, this License incorporates
|
||||
the limitation as if written in the body of this License.
|
||||
|
||||
9. The Free Software Foundation may publish revised and/or new versions
|
||||
of the General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Program
|
||||
specifies a version number of this License which applies to it and "any
|
||||
later version", you have the option of following the terms and conditions
|
||||
either of that version or of any later version published by the Free
|
||||
Software Foundation. If the Program does not specify a version number of
|
||||
this License, you may choose any version ever published by the Free Software
|
||||
Foundation.
|
||||
|
||||
10. If you wish to incorporate parts of the Program into other free
|
||||
programs whose distribution conditions are different, write to the author
|
||||
to ask for permission. For software which is copyrighted by the Free
|
||||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||||
make exceptions for this. Our decision will be guided by the two goals
|
||||
of preserving the free status of all derivatives of our free software and
|
||||
of promoting the sharing and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
||||
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
||||
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
||||
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
||||
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
||||
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
||||
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
||||
REPAIR OR CORRECTION.
|
||||
|
||||
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
||||
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
||||
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
|
||||
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
|
||||
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
|
||||
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
Appendix: How to Apply These Terms to Your New Programs
|
||||
|
||||
If you develop a new program, and you want it to be of the greatest
|
||||
possible use to the public, the best way to achieve this is to make it
|
||||
free software which everyone can redistribute and change under these terms.
|
||||
|
||||
To do so, attach the following notices to the program. It is safest
|
||||
to attach them to the start of each source file to most effectively
|
||||
convey the exclusion of warranty; and each file should have at least
|
||||
the "copyright" line and a pointer to where the full notice is found.
|
||||
|
||||
<one line to give the program's name and a brief idea of what it does.>
|
||||
Copyright (C) 19yy <name of author>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
Also add information on how to contact you by electronic and paper mail.
|
||||
|
||||
If the program is interactive, make it output a short notice like this
|
||||
when it starts in an interactive mode:
|
||||
|
||||
Gnomovision version 69, Copyright (C) 19yy name of author
|
||||
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
||||
This is free software, and you are welcome to redistribute it
|
||||
under certain conditions; type `show c' for details.
|
||||
|
||||
The hypothetical commands `show w' and `show c' should show the appropriate
|
||||
parts of the General Public License. Of course, the commands you use may
|
||||
be called something other than `show w' and `show c'; they could even be
|
||||
mouse-clicks or menu items--whatever suits your program.
|
||||
|
||||
You should also get your employer (if you work as a programmer) or your
|
||||
school, if any, to sign a "copyright disclaimer" for the program, if
|
||||
necessary. Here is a sample; alter the names:
|
||||
|
||||
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
|
||||
`Gnomovision' (which makes passes at compilers) written by James Hacker.
|
||||
|
||||
<signature of Ty Coon>, 1 April 1989
|
||||
Ty Coon, President of Vice
|
||||
|
||||
This General Public License does not permit incorporating your program into
|
||||
proprietary programs. If your program is a subroutine library, you may
|
||||
consider it more useful to permit linking proprietary applications with the
|
||||
library. If this is what you want to do, use the GNU Library General
|
||||
Public License instead of this License.
|
||||
|
48
Makefile
@ -1,30 +1,46 @@
|
||||
|
||||
CC = gcc
|
||||
SH = /bin/sh
|
||||
CC=gcc
|
||||
CFLAGS=-Wall -O2 -fomit-frame-pointer -fno-strength-reduce
|
||||
|
||||
CFLAGS = -O3 -fomit-frame-pointer -funroll-loops
|
||||
OBJS= blocksort.o \
|
||||
huffman.o \
|
||||
crctable.o \
|
||||
randtable.o \
|
||||
compress.o \
|
||||
decompress.o \
|
||||
bzlib.o
|
||||
|
||||
all: lib bzip2 test
|
||||
|
||||
|
||||
all:
|
||||
cat words0
|
||||
$(CC) $(CFLAGS) -o bzip2 bzip2.c
|
||||
bzip2: lib
|
||||
$(CC) $(CFLAGS) -c bzip2.c
|
||||
$(CC) $(CFLAGS) -o bzip2 bzip2.o -L. -lbz2
|
||||
$(CC) $(CFLAGS) -o bzip2recover bzip2recover.c
|
||||
rm -f bunzip2
|
||||
ln -s ./bzip2 ./bunzip2
|
||||
cat words1
|
||||
|
||||
lib: $(OBJS)
|
||||
rm -f libbz2.a
|
||||
ar clq libbz2.a $(OBJS)
|
||||
|
||||
test: bzip2
|
||||
@cat words1
|
||||
./bzip2 -1 < sample1.ref > sample1.rb2
|
||||
./bzip2 -2 < sample2.ref > sample2.rb2
|
||||
./bunzip2 < sample1.bz2 > sample1.tst
|
||||
./bunzip2 < sample2.bz2 > sample2.tst
|
||||
cat words2
|
||||
./bzip2 -d < sample1.bz2 > sample1.tst
|
||||
./bzip2 -d < sample2.bz2 > sample2.tst
|
||||
@cat words2
|
||||
cmp sample1.bz2 sample1.rb2
|
||||
cmp sample2.bz2 sample2.rb2
|
||||
cmp sample1.tst sample1.ref
|
||||
cmp sample2.tst sample2.ref
|
||||
cat words3
|
||||
@cat words3
|
||||
|
||||
|
||||
clean:
|
||||
rm -f bzip2 bunzip2 bzip2recover sample*.tst sample*.rb2
|
||||
clean:
|
||||
rm -f *.o libbz2.a bzip2 bzip2recover sample1.rb2 sample2.rb2 sample1.tst sample2.tst
|
||||
|
||||
.c.o: $*.o bzlib.h bzlib_private.h
|
||||
$(CC) $(CFLAGS) -c $*.c -o $*.o
|
||||
|
||||
tarfile:
|
||||
tar cvf interim.tar *.c *.h Makefile manual.texi manual.ps LICENSE bzip2.1 bzip2.1.preformatted bzip2.txt words1 words2 words3 sample1.ref sample2.ref sample1.bz2 sample2.bz2 *.html README CHANGES libbz2.def libbz2.dsp dlltest.dsp
|
||||
|
||||
|
230
README
@ -1,194 +1,61 @@
|
||||
|
||||
GREETINGS!
|
||||
|
||||
This is the README for bzip2, my block-sorting file compressor,
|
||||
version 0.1.
|
||||
|
||||
bzip2 is distributed under the GNU General Public License version 2;
|
||||
for details, see the file LICENSE. Pointers to the algorithms used
|
||||
are in ALGORITHMS. Instructions for use are in bzip2.1.preformatted.
|
||||
|
||||
Please read all of this file carefully.
|
||||
|
||||
|
||||
This is the README for bzip2, a block-sorting file compressor, version
|
||||
0.9.0. This version is fully compatible with the previous public
|
||||
release, bzip2-0.1pl2.
|
||||
|
||||
HOW TO BUILD
|
||||
bzip2-0.9.0 is distributed under a BSD-style license. For details,
|
||||
see the file LICENSE.
|
||||
|
||||
-- for UNIX:
|
||||
Complete documentation is available in PostScript form (manual.ps)
|
||||
or html (manual_toc.html). A plain-text version of the manual page is
|
||||
available as bzip2.txt.
|
||||
|
||||
Type `make'. (tough, huh? :-)
|
||||
|
||||
This creates binaries "bzip2", and "bunzip2",
|
||||
which is a symbolic link to "bzip2".
|
||||
HOW TO BUILD -- UNIX
|
||||
|
||||
It also runs four compress-decompress tests to make sure
|
||||
things are working properly. If all goes well, you should be up &
|
||||
running. Please be sure to read the output from `make'
|
||||
just to be sure that the tests went ok.
|
||||
Type `make'.
|
||||
|
||||
To install bzip2 properly:
|
||||
This creates binaries "bzip2" and "bzip2recover".
|
||||
|
||||
-- Copy the binary "bzip2" to a publicly visible place,
|
||||
possibly /usr/bin, /usr/common/bin or /usr/local/bin.
|
||||
It also runs four compress-decompress tests to make sure things are
|
||||
working properly. If all goes well, you should be up & running.
|
||||
Please be sure to read the output from `make' just to be sure that the
|
||||
tests went ok.
|
||||
|
||||
-- In that directory, make "bunzip2" be a symbolic link
|
||||
to "bzip2".
|
||||
To install bzip2 properly:
|
||||
|
||||
-- Copy the manual page, bzip2.1, to the relevant place.
|
||||
Probably the right place is /usr/man/man1/.
|
||||
|
||||
-- for Windows 95 and NT:
|
||||
* Copy the binaries "bzip2" and "bzip2recover" to a publicly visible
|
||||
place, possibly /usr/bin or /usr/local/bin.
|
||||
|
||||
For a start, do you *really* want to recompile bzip2?
|
||||
The standard distribution includes a pre-compiled version
|
||||
for Windows 95 and NT, `bzip2.exe'.
|
||||
* In that directory, make "bunzip2" and "bzcat" be symbolic links
|
||||
to "bzip2".
|
||||
|
||||
This executable was created with Jacob Navia's excellent
|
||||
port to Win32 of Chris Fraser & David Hanson's excellent
|
||||
ANSI C compiler, "lcc". You can get to it at the pages
|
||||
of the CS department of Princeton University,
|
||||
www.cs.princeton.edu.
|
||||
I have not tried to compile this version of bzip2 with
|
||||
a commercial C compiler such as MS Visual C, as I don't
|
||||
have one available.
|
||||
* Copy the manual page, bzip2.1, to the relevant place.
|
||||
Probably the right place is /usr/man/man1/.
|
||||
|
||||
Note that lcc is designed primarily to be portable and
|
||||
fast. Code quality is a secondary aim, so bzip2.exe
|
||||
runs perhaps 40% slower than it could if compiled with
|
||||
a good optimising compiler.
|
||||
If you want to program with the library, you'll need to copy libbz2.a
|
||||
and bzlib.h to /usr/lib and /usr/include respectively.
|
||||
|
||||
|
||||
I compiled a previous version of bzip (0.21) with Borland
|
||||
C 5.0, which worked fine, and with MS VC++ 2.0, which
|
||||
didn't. Here is a comment from the README for bzip-0.21.
|
||||
|
||||
MS VC++ 2.0's optimising compiler has a bug which, at
|
||||
maximum optimisation, gives an executable which produces
|
||||
garbage compressed files. Proceed with caution.
|
||||
I do not know whether or not this happens with later
|
||||
versions of VC++.
|
||||
|
||||
Edit the defines starting at line 86 of bzip.c to
|
||||
select your platform/compiler combination, and then compile.
|
||||
Then check that the resulting executable (assumed to be
|
||||
called bzip.exe) works correctly, using the SELFTEST.BAT file.
|
||||
Bearing in mind the previous paragraph, the self-test is
|
||||
important.
|
||||
|
||||
Note that the defines which bzip-0.21 had, to support
|
||||
compilation with VC 2.0 and BC 5.0, are gone. Windows
|
||||
is not my preferred operating system, and I am, for the
|
||||
moment, content with the modestly fast executable created
|
||||
by lcc-win32.
|
||||
|
||||
A manual page is supplied, unformatted (bzip2.1),
|
||||
preformatted (bzip2.1.preformatted), and preformatted
|
||||
and sanitised for MS-DOS (bzip2.txt).
|
||||
|
||||
|
||||
|
||||
COMPILATION NOTES
|
||||
|
||||
bzip2 should work on any 32- or 64-bit machine. It is known to work
|
||||
[meaning: it has compiled and passed self-tests] on the
|
||||
following platform-os combinations:
|
||||
|
||||
   Intel i386/i486 running Linux 2.0.21
   Sun Sparcs (various) running SunOS 4.1.4 and Solaris 2.5
   Intel i386/i486 running Windows 95 and NT
   DEC Alpha running Digital Unix 4.0
|
||||
|
||||
Following the release of bzip-0.21, many people mailed me
|
||||
from around the world to say they had made it work on all sorts
|
||||
of weird and wonderful machines. Chances are, if you have
|
||||
a reasonable ANSI C compiler and a 32-bit machine, you can
|
||||
get it to work.
|
||||
|
||||
The #defines starting at around line 82 of bzip2.c supply some
|
||||
degree of platform-independence. If you configure bzip2 for some
|
||||
new far-out platform which is not covered by the existing definitions,
|
||||
please send me the relevant definitions.
|
||||
|
||||
I recommend GNU C for compilation. The code is standard ANSI C,
|
||||
except for the Unix-specific file handling, so any ANSI C compiler
|
||||
should work. Note however that the many routines marked INLINE
|
||||
should be inlined by your compiler, else performance will be very
|
||||
poor. Asking your compiler to unroll loops gives some
|
||||
small improvement too; for gcc, the relevant flag is
|
||||
-funroll-loops.
|
||||
|
||||
On 386/486 machines, I'd recommend giving gcc the
|
||||
-fomit-frame-pointer flag; this liberates another register for
|
||||
allocation, which measurably improves performance.
|
||||
|
||||
I used the abovementioned lcc compiler to develop bzip2.
|
||||
I would highly recommend this compiler for day-to-day development;
|
||||
it is fast, reliable, lightweight, has an excellent profiler,
|
||||
and is generally excellent. And it's fun to retarget, if you're
|
||||
into that kind of thing.
|
||||
|
||||
If you compile bzip2 on a new platform or with a new compiler,
|
||||
please be sure to run the four compress-decompress tests, either
|
||||
using the Makefile, or with the test.bat (MSDOS) or test.cmd (OS/2)
|
||||
files. Some compilers have been seen to introduce subtle bugs
|
||||
when optimising, so this check is important. Ideally you should
|
||||
then go on to test bzip2 on a file several megabytes or even
|
||||
tens of megabytes long, just to be 110% sure. ``Professional
|
||||
programmers are paranoid programmers.'' (anon).
|
||||
HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
|
||||
|
||||
It's difficult for me to support compilation on all these platforms.
|
||||
My approach is to collect binaries for these platforms, and put them
|
||||
on my web page (http://www.muraroa.demon.co.uk). Look there.
|
||||
|
||||
|
||||
VALIDATION
|
||||
|
||||
Correct operation, in the sense that a compressed file can always be
|
||||
decompressed to reproduce the original, is obviously of paramount
|
||||
importance. To validate bzip2, I used a modified version of
|
||||
Mark Nelson's churn program. Churn is an automated test driver
|
||||
which recursively traverses a directory structure, using bzip2 to
|
||||
compress and then decompress each file it encounters, and checking
|
||||
that the decompressed data is the same as the original. As test
|
||||
material, I used several runs over several filesystems of differing
|
||||
sizes.
|
||||
Correct operation, in the sense that a compressed file can always be
|
||||
decompressed to reproduce the original, is obviously of paramount
|
||||
importance. To validate bzip2, I used a modified version of Mark
|
||||
Nelson's churn program. Churn is an automated test driver which
|
||||
recursively traverses a directory structure, using bzip2 to compress
|
||||
and then decompress each file it encounters, and checking that the
|
||||
decompressed data is the same as the original. There are more details
|
||||
in Section 4 of the user guide.
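The per-file check that churn performs can be sketched roughly as follows. This is only an illustration under stated assumptions: `codec_fn`, `rle_compress`, `rle_decompress` and `round_trip_ok` are hypothetical names, and the run-length codec is a stand-in so that the sketch does not depend on libbz2. It is not bzip2's algorithm and not churn's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical codec signature: returns the number of bytes written
   to dst, or -1 on malformed input or insufficient space. */
typedef long (*codec_fn)(const unsigned char *src, long srcLen,
                         unsigned char *dst, long dstCap);

/* Stand-in compressor (NOT bzip2): (count, value) pairs, count 1..255. */
static long rle_compress(const unsigned char *src, long srcLen,
                         unsigned char *dst, long dstCap)
{
    long in = 0, out = 0;
    while (in < srcLen) {
        unsigned char v = src[in];
        long run = 1;
        while (in + run < srcLen && src[in + run] == v && run < 255) run++;
        if (out + 2 > dstCap) return -1;
        dst[out++] = (unsigned char)run;
        dst[out++] = v;
        in += run;
    }
    return out;
}

/* Matching stand-in decompressor. */
static long rle_decompress(const unsigned char *src, long srcLen,
                           unsigned char *dst, long dstCap)
{
    long in = 0, out = 0;
    if (srcLen % 2 != 0) return -1;
    while (in < srcLen) {
        long run = src[in];
        unsigned char v = src[in + 1];
        if (run == 0 || out + run > dstCap) return -1;
        memset(dst + out, v, (size_t)run);
        out += run;
        in += 2;
    }
    return out;
}

/* Returns 1 if decompress(compress(data)) reproduces data exactly,
   0 otherwise. This is the property the validation runs check. */
static int round_trip_ok(codec_fn comp, codec_fn decomp,
                         const unsigned char *data, long len)
{
    long cap = 2 * len + 2;  /* worst case for the stand-in codec */
    unsigned char *packed = (unsigned char *)malloc((size_t)cap);
    unsigned char *unpacked = (unsigned char *)malloc((size_t)len + 1);
    int ok = 0;

    assert(len >= 0);
    if (packed != NULL && unpacked != NULL) {
        long plen = comp(data, len, packed, cap);
        if (plen >= 0) {
            long ulen = decomp(packed, plen, unpacked, len + 1);
            ok = (ulen == len) &&
                 (len == 0 || memcmp(data, unpacked, (size_t)len) == 0);
        }
    }
    free(packed);
    free(unpacked);
    return ok;
}
```

A driver in the style of churn would call something like `round_trip_ok` once per file while walking a directory tree, and would include boundary cases such as zero-length and highly repetitive inputs.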
|
||||
|
||||
One set of tests was done on my base Linux filesystem,
|
||||
410 megabytes in 23,000 files. There were several runs over
|
||||
this filesystem, in various configurations designed to break bzip2.
|
||||
That filesystem also contained some specially constructed test
|
||||
files designed to exercise boundary cases in the code.
|
||||
This included files of zero length, various long, highly repetitive
|
||||
files, and some files which generate blocks with all values the same.
|
||||
|
||||
The other set of tests was done just with the "normal" configuration,
|
||||
but on a much larger quantity of data.
|
||||
|
||||
Tests are:

   Linux FS, 410M, 23000 files

   As above, with --repetitive-fast

   As above, with -1

   Low level disk image of a disk containing
      Windows NT4.0; 420M in a single huge file

   Linux distribution, incl Slackware,
      all GNU sources. 1900M in 2300 files.

   Approx 100M compiler sources and related
      programming tools, running under Purify.

   About 500M of data in 120 files of around
      4 M each. This is raw data from a
      biomagnetometer (SQUID-based thing).

Overall, total volume of test data is about
3300 megabytes in 25000 files.
|
||||
|
||||
The distribution does four tests after building bzip. These tests
|
||||
include test decompressions of pre-supplied compressed files, so
|
||||
they not only test that bzip works correctly on the machine it was
|
||||
built on, but can also decompress files compressed on a different
|
||||
machine. This guards against unforeseen interoperability problems.
|
||||
|
||||
|
||||
Please read and be aware of the following:
|
||||
@ -234,14 +101,30 @@ PATENTS:
|
||||
End of legalities.
|
||||
|
||||
|
||||
WHAT'S NEW IN 0.9.0 (as compared to 0.1pl2) ?

* Approx 10% faster compression, 30% faster decompression
* -t (test mode) is a lot quicker
* Can decompress concatenated compressed files
* Programming interface, so programs can directly read/write .bz2 files
* Less restrictive (BSD-style) licensing
* Flag handling more compatible with GNU gzip
* Much more documentation, i.e., a proper user manual
* Hopefully, improved portability (at least of the library)
|
||||
|
||||
|
||||
I hope you find bzip2 useful. Feel free to contact me at
|
||||
jseward@acm.org
|
||||
if you have any suggestions or queries. Many people mailed me with
|
||||
comments, suggestions and patches after the releases of 0.15 and 0.21,
|
||||
and the changes in bzip2 are largely a result of this feedback.
|
||||
I thank you for your comments.
|
||||
comments, suggestions and patches after the releases of bzip-0.15,
|
||||
bzip-0.21 and bzip2-0.1pl2, and the changes in bzip2 are largely a
|
||||
result of this feedback. I thank you for your comments.
|
||||
|
||||
At least for the time being, bzip2's "home" is
|
||||
http://www.muraroa.demon.co.uk.
|
||||
|
||||
Julian Seward
|
||||
jseward@acm.org
|
||||
|
||||
Manchester, UK
|
||||
18 July 1996 (version 0.15)
|
||||
@ -250,4 +133,5 @@ Manchester, UK
|
||||
Guildford, Surrey, UK
|
||||
7 August 1997 (bzip2, version 0.1)
|
||||
29 August 1997 (bzip2, version 0.1pl2)
|
||||
23 August 1998 (bzip2, version 0.9.0)
|
||||
|
||||
|
16
README.DOS
@ -1,16 +0,0 @@
|
||||
|
||||
As of today (3 March 1998) I've removed the
|
||||
Win95/NT executables from this distribution, sorry.
|
||||
|
||||
You can still get an executable from
|
||||
http://www.muraroa.demon.co.uk, or (as a last
|
||||
resort) by mailing me at jseward@acm.org.
|
||||
|
||||
The reason for this change of packaging is that it
|
||||
makes it easier for me to fix problems with specific
|
||||
executables if they are not included in the main
|
||||
distribution.
|
||||
|
||||
J
|
||||
|
||||
|
709
blocksort.c
Normal file
@ -0,0 +1,709 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Block sorting machinery ---*/
|
||||
/*--- blocksort.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
/*---------------------------------------------*/
|
||||
/*--
|
||||
Compare two strings in block. We assume (see
|
||||
discussion above) that i1 and i2 have a max
|
||||
offset of 10 on entry, and that the first
|
||||
bytes of both block and quadrant have been
|
||||
copied into the "overshoot area", ie
|
||||
into the subscript range
|
||||
[nblock .. nblock+BZ_NUM_OVERSHOOT_BYTES-1].
|
||||
--*/
|
||||
static __inline__ Bool fullGtU ( UChar* block,
|
||||
UInt16* quadrant,
|
||||
UInt32 nblock,
|
||||
Int32* workDone,
|
||||
Int32 i1,
|
||||
Int32 i2
|
||||
)
|
||||
{
|
||||
Int32 k;
|
||||
UChar c1, c2;
|
||||
UInt16 s1, s2;
|
||||
|
||||
AssertD ( i1 != i2, "fullGtU(1)" );
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
i1++; i2++;
|
||||
|
||||
k = nblock;
|
||||
|
||||
do {
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
s1 = quadrant[i1];
|
||||
s2 = quadrant[i2];
|
||||
if (s1 != s2) return (s1 > s2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
s1 = quadrant[i1];
|
||||
s2 = quadrant[i2];
|
||||
if (s1 != s2) return (s1 > s2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
s1 = quadrant[i1];
|
||||
s2 = quadrant[i2];
|
||||
if (s1 != s2) return (s1 > s2);
|
||||
i1++; i2++;
|
||||
|
||||
c1 = block[i1];
|
||||
c2 = block[i2];
|
||||
if (c1 != c2) return (c1 > c2);
|
||||
s1 = quadrant[i1];
|
||||
s2 = quadrant[i2];
|
||||
if (s1 != s2) return (s1 > s2);
|
||||
i1++; i2++;
|
||||
|
||||
if (i1 >= nblock) i1 -= nblock;
|
||||
if (i2 >= nblock) i2 -= nblock;
|
||||
|
||||
k -= 4;
|
||||
(*workDone)++;
|
||||
}
|
||||
while (k >= 0);
|
||||
|
||||
return False;
|
||||
}
|
||||
|
||||
/*---------------------------------------------*/
|
||||
/*--
|
||||
Knuth's increments seem to work better
|
||||
than Incerpi-Sedgewick here. Possibly
|
||||
because the number of elems to sort is
|
||||
usually small, typically <= 20.
|
||||
--*/
|
||||
static Int32 incs[14] = { 1, 4, 13, 40, 121, 364, 1093, 3280,
|
||||
9841, 29524, 88573, 265720,
|
||||
797161, 2391484 };
|
||||
|
||||
static void simpleSort ( EState* s, Int32 lo, Int32 hi, Int32 d )
|
||||
{
|
||||
Int32 i, j, h, bigN, hp;
|
||||
Int32 v;
|
||||
|
||||
UChar* block = s->block;
|
||||
UInt32* zptr = s->zptr;
|
||||
UInt16* quadrant = s->quadrant;
|
||||
Int32* workDone = &(s->workDone);
|
||||
Int32 nblock = s->nblock;
|
||||
Int32 workLimit = s->workLimit;
|
||||
Bool firstAttempt = s->firstAttempt;
|
||||
|
||||
bigN = hi - lo + 1;
|
||||
if (bigN < 2) return;
|
||||
|
||||
hp = 0;
|
||||
while (incs[hp] < bigN) hp++;
|
||||
hp--;
|
||||
|
||||
for (; hp >= 0; hp--) {
|
||||
h = incs[hp];
|
||||
i = lo + h;
|
||||
while (True) {
|
||||
|
||||
/*-- copy 1 --*/
|
||||
if (i > hi) break;
|
||||
v = zptr[i];
|
||||
j = i;
|
||||
while ( fullGtU ( block, quadrant, nblock, workDone,
|
||||
zptr[j-h]+d, v+d ) ) {
|
||||
zptr[j] = zptr[j-h];
|
||||
j = j - h;
|
||||
if (j <= (lo + h - 1)) break;
|
||||
}
|
||||
zptr[j] = v;
|
||||
i++;
|
||||
|
||||
/*-- copy 2 --*/
|
||||
if (i > hi) break;
|
||||
v = zptr[i];
|
||||
j = i;
|
||||
while ( fullGtU ( block, quadrant, nblock, workDone,
|
||||
zptr[j-h]+d, v+d ) ) {
|
||||
zptr[j] = zptr[j-h];
|
||||
j = j - h;
|
||||
if (j <= (lo + h - 1)) break;
|
||||
}
|
||||
zptr[j] = v;
|
||||
i++;
|
||||
|
||||
/*-- copy 3 --*/
|
||||
if (i > hi) break;
|
||||
v = zptr[i];
|
||||
j = i;
|
||||
while ( fullGtU ( block, quadrant, nblock, workDone,
|
||||
zptr[j-h]+d, v+d ) ) {
|
||||
zptr[j] = zptr[j-h];
|
||||
j = j - h;
|
||||
if (j <= (lo + h - 1)) break;
|
||||
}
|
||||
zptr[j] = v;
|
||||
i++;
|
||||
|
||||
if (*workDone > workLimit && firstAttempt) return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------*/
|
||||
/*--
|
||||
The following is an implementation of
|
||||
an elegant 3-way quicksort for strings,
|
||||
described in a paper "Fast Algorithms for
|
||||
Sorting and Searching Strings", by Robert
|
||||
Sedgewick and Jon L. Bentley.
|
||||
--*/
|
||||
|
||||
#define swap(lv1, lv2) \
|
||||
{ Int32 tmp = lv1; lv1 = lv2; lv2 = tmp; }
|
||||
|
||||
static void vswap ( UInt32* zptr, Int32 p1, Int32 p2, Int32 n )
|
||||
{
|
||||
while (n > 0) {
|
||||
swap(zptr[p1], zptr[p2]);
|
||||
p1++; p2++; n--;
|
||||
}
|
||||
}
|
||||
|
||||
static UChar med3 ( UChar a, UChar b, UChar c )
|
||||
{
|
||||
UChar t;
|
||||
if (a > b) { t = a; a = b; b = t; };
|
||||
if (b > c) { t = b; b = c; c = t; };
|
||||
if (a > b) b = a;
|
||||
return b;
|
||||
}
|
||||
|
||||
|
||||
#define min(a,b) (((a) < (b)) ? (a) : (b))
|
||||
|
||||
typedef
|
||||
struct { Int32 ll; Int32 hh; Int32 dd; }
|
||||
StackElem;
|
||||
|
||||
#define push(lz,hz,dz) { stack[sp].ll = lz; \
|
||||
stack[sp].hh = hz; \
|
||||
stack[sp].dd = dz; \
|
||||
sp++; }
|
||||
|
||||
#define pop(lz,hz,dz) { sp--; \
|
||||
lz = stack[sp].ll; \
|
||||
hz = stack[sp].hh; \
|
||||
dz = stack[sp].dd; }
|
||||
|
||||
#define SMALL_THRESH 20
|
||||
#define DEPTH_THRESH 10
|
||||
|
||||
/*--
|
||||
If you are ever unlucky/improbable enough
|
||||
to get a stack overflow whilst sorting,
|
||||
increase the following constant and try
|
||||
again. In practice I have never seen the
|
||||
stack go above 27 elems, so the following
|
||||
limit seems very generous.
|
||||
--*/
|
||||
#define QSORT_STACK_SIZE 1000
|
||||
|
||||
|
||||
static void qSort3 ( EState* s, Int32 loSt, Int32 hiSt, Int32 dSt )
|
||||
{
|
||||
Int32 unLo, unHi, ltLo, gtHi, med, n, m;
|
||||
Int32 sp, lo, hi, d;
|
||||
StackElem stack[QSORT_STACK_SIZE];
|
||||
|
||||
UChar* block = s->block;
|
||||
UInt32* zptr = s->zptr;
|
||||
Int32* workDone = &(s->workDone);
|
||||
Int32 workLimit = s->workLimit;
|
||||
Bool firstAttempt = s->firstAttempt;
|
||||
|
||||
sp = 0;
|
||||
push ( loSt, hiSt, dSt );
|
||||
|
||||
while (sp > 0) {
|
||||
|
||||
AssertH ( sp < QSORT_STACK_SIZE, 1001 );
|
||||
|
||||
pop ( lo, hi, d );
|
||||
|
||||
if (hi - lo < SMALL_THRESH || d > DEPTH_THRESH) {
|
||||
simpleSort ( s, lo, hi, d );
|
||||
if (*workDone > workLimit && firstAttempt) return;
|
||||
continue;
|
||||
}
|
||||
|
||||
med = med3 ( block[zptr[ lo ]+d],
|
||||
block[zptr[ hi ]+d],
|
||||
block[zptr[ (lo+hi)>>1 ]+d] );
|
||||
|
||||
unLo = ltLo = lo;
|
||||
unHi = gtHi = hi;
|
||||
|
||||
while (True) {
|
||||
while (True) {
|
||||
if (unLo > unHi) break;
|
||||
n = ((Int32)block[zptr[unLo]+d]) - med;
|
||||
if (n == 0) { swap(zptr[unLo], zptr[ltLo]); ltLo++; unLo++; continue; };
|
||||
if (n > 0) break;
|
||||
unLo++;
|
||||
}
|
||||
while (True) {
|
||||
if (unLo > unHi) break;
|
||||
n = ((Int32)block[zptr[unHi]+d]) - med;
|
||||
if (n == 0) { swap(zptr[unHi], zptr[gtHi]); gtHi--; unHi--; continue; };
|
||||
if (n < 0) break;
|
||||
unHi--;
|
||||
}
|
||||
if (unLo > unHi) break;
|
||||
swap(zptr[unLo], zptr[unHi]); unLo++; unHi--;
|
||||
}
|
||||
|
||||
AssertD ( unHi == unLo-1, "bad termination in qSort3" );
|
||||
|
||||
if (gtHi < ltLo) {
|
||||
push(lo, hi, d+1 );
|
||||
continue;
|
||||
}
|
||||
|
||||
n = min(ltLo-lo, unLo-ltLo); vswap(zptr, lo, unLo-n, n);
|
||||
m = min(hi-gtHi, gtHi-unHi); vswap(zptr, unLo, hi-m+1, m);
|
||||
|
||||
n = lo + unLo - ltLo - 1;
|
||||
m = hi - (gtHi - unHi) + 1;
|
||||
|
||||
push ( lo, n, d );
|
||||
push ( n+1, m-1, d+1 );
|
||||
push ( m, hi, d );
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------*/
|
||||
|
||||
#define BIGFREQ(b) (ftab[((b)+1) << 8] - ftab[(b) << 8])
|
||||
|
||||
#define SETMASK (1 << 21)
|
||||
#define CLEARMASK (~(SETMASK))
|
||||
|
||||
static void sortMain ( EState* s )
|
||||
{
|
||||
Int32 i, j, k, ss, sb;
|
||||
Int32 runningOrder[256];
|
||||
Int32 copy[256];
|
||||
Bool bigDone[256];
|
||||
UChar c1, c2;
|
||||
Int32 numQSorted;
|
||||
|
||||
UChar* block = s->block;
|
||||
UInt32* zptr = s->zptr;
|
||||
UInt16* quadrant = s->quadrant;
|
||||
Int32* ftab = s->ftab;
|
||||
Int32* workDone = &(s->workDone);
|
||||
Int32 nblock = s->nblock;
|
||||
Int32 workLimit = s->workLimit;
|
||||
Bool firstAttempt = s->firstAttempt;
|
||||
|
||||
/*--
|
||||
In the various block-sized structures, live data runs
|
||||
from 0 to last+NUM_OVERSHOOT_BYTES inclusive. First,
|
||||
set up the overshoot area for block.
|
||||
--*/
|
||||
|
||||
if (s->verbosity >= 4)
|
||||
VPrintf0( " sort initialise ...\n" );
|
||||
|
||||
for (i = 0; i < BZ_NUM_OVERSHOOT_BYTES; i++)
|
||||
block[nblock+i] = block[i % nblock];
|
||||
for (i = 0; i < nblock+BZ_NUM_OVERSHOOT_BYTES; i++)
|
||||
quadrant[i] = 0;
|
||||
|
||||
|
||||
if (nblock <= 4000) {
|
||||
|
||||
/*--
|
||||
Use simpleSort(), since the full sorting mechanism
|
||||
has quite a large constant overhead.
|
||||
--*/
|
||||
if (s->verbosity >= 4) VPrintf0( " simpleSort ...\n" );
|
||||
for (i = 0; i < nblock; i++) zptr[i] = i;
|
||||
firstAttempt = False;
|
||||
*workDone = workLimit = 0;
|
||||
simpleSort ( s, 0, nblock-1, 0 );
|
||||
if (s->verbosity >= 4) VPrintf0( " simpleSort done.\n" );
|
||||
|
||||
} else {
|
||||
|
||||
numQSorted = 0;
|
||||
for (i = 0; i <= 255; i++) bigDone[i] = False;
|
||||
|
||||
if (s->verbosity >= 4) VPrintf0( " bucket sorting ...\n" );
|
||||
|
||||
for (i = 0; i <= 65536; i++) ftab[i] = 0;
|
||||
|
||||
c1 = block[nblock-1];
|
||||
for (i = 0; i < nblock; i++) {
|
||||
c2 = block[i];
|
||||
ftab[(c1 << 8) + c2]++;
|
||||
c1 = c2;
|
||||
}
|
||||
|
||||
for (i = 1; i <= 65536; i++) ftab[i] += ftab[i-1];
|
||||
|
||||
c1 = block[0];
|
||||
for (i = 0; i < nblock-1; i++) {
|
||||
c2 = block[i+1];
|
||||
j = (c1 << 8) + c2;
|
||||
c1 = c2;
|
||||
ftab[j]--;
|
||||
zptr[ftab[j]] = i;
|
||||
}
|
||||
j = (block[nblock-1] << 8) + block[0];
|
||||
ftab[j]--;
|
||||
zptr[ftab[j]] = nblock-1;
|
||||
|
||||
/*--
|
||||
Now ftab contains the first loc of every small bucket.
|
||||
Calculate the running order, from smallest to largest
|
||||
big bucket.
|
||||
--*/
|
||||
|
||||
for (i = 0; i <= 255; i++) runningOrder[i] = i;
|
||||
|
||||
{
|
||||
Int32 vv;
|
||||
Int32 h = 1;
|
||||
do h = 3 * h + 1; while (h <= 256);
|
||||
do {
|
||||
h = h / 3;
|
||||
for (i = h; i <= 255; i++) {
|
||||
vv = runningOrder[i];
|
||||
j = i;
|
||||
while ( BIGFREQ(runningOrder[j-h]) > BIGFREQ(vv) ) {
|
||||
runningOrder[j] = runningOrder[j-h];
|
||||
j = j - h;
|
||||
if (j <= (h - 1)) goto zero;
|
||||
}
|
||||
zero:
|
||||
runningOrder[j] = vv;
|
||||
}
|
||||
} while (h != 1);
|
||||
}
|
||||
|
||||
/*--
|
||||
The main sorting loop.
|
||||
--*/
|
||||
|
||||
for (i = 0; i <= 255; i++) {
|
||||
|
||||
/*--
|
||||
Process big buckets, starting with the least full.
|
||||
Basically this is a 4-step process in which we call
|
||||
qSort3 to sort the small buckets [ss, j], but
|
||||
also make a big effort to avoid the calls if we can.
|
||||
--*/
|
||||
ss = runningOrder[i];
|
||||
|
||||
/*--
|
||||
Step 1:
|
||||
Complete the big bucket [ss] by quicksorting
|
||||
any unsorted small buckets [ss, j], for j != ss.
|
||||
Hopefully previous pointer-scanning phases have already
|
||||
completed many of the small buckets [ss, j], so
|
||||
we don't have to sort them at all.
|
||||
--*/
|
||||
for (j = 0; j <= 255; j++) {
|
||||
if (j != ss) {
|
||||
sb = (ss << 8) + j;
|
||||
if ( ! (ftab[sb] & SETMASK) ) {
|
||||
Int32 lo = ftab[sb] & CLEARMASK;
|
||||
Int32 hi = (ftab[sb+1] & CLEARMASK) - 1;
|
||||
if (hi > lo) {
|
||||
if (s->verbosity >= 4)
|
||||
VPrintf4( " qsort [0x%x, 0x%x] done %d this %d\n",
|
||||
ss, j, numQSorted, hi - lo + 1 );
|
||||
qSort3 ( s, lo, hi, 2 );
|
||||
numQSorted += ( hi - lo + 1 );
|
||||
if (*workDone > workLimit && firstAttempt) return;
|
||||
}
|
||||
}
|
||||
ftab[sb] |= SETMASK;
|
||||
}
|
||||
}
|
||||
|
||||
/*--
|
||||
Step 2:
|
||||
Deal specially with case [ss, ss]. This establishes the
|
||||
sorted order for [ss, ss] without any comparisons.
|
||||
A clever trick, cryptically described as steps Q6b and Q6c
|
||||
in SRC-124 (aka BW94). This makes it entirely practical to
|
||||
not use a preliminary run-length coder, but unfortunately
|
||||
we are now stuck with the .bz2 file format.
|
||||
--*/
|
||||
{
|
||||
Int32 put0, get0, put1, get1;
|
||||
Int32 sbn = (ss << 8) + ss;
|
||||
Int32 lo = ftab[sbn] & CLEARMASK;
|
||||
Int32 hi = (ftab[sbn+1] & CLEARMASK) - 1;
|
||||
UChar ssc = (UChar)ss;
|
||||
put0 = lo;
|
||||
get0 = ftab[ss << 8] & CLEARMASK;
|
||||
put1 = hi;
|
||||
get1 = (ftab[(ss+1) << 8] & CLEARMASK) - 1;
|
||||
while (get0 < put0) {
|
||||
j = zptr[get0]-1; if (j < 0) j += nblock;
|
||||
c1 = block[j];
|
||||
if (c1 == ssc) { zptr[put0] = j; put0++; };
|
||||
get0++;
|
||||
}
|
||||
while (get1 > put1) {
|
||||
j = zptr[get1]-1; if (j < 0) j += nblock;
|
||||
c1 = block[j];
|
||||
if (c1 == ssc) { zptr[put1] = j; put1--; };
|
||||
get1--;
|
||||
}
|
||||
ftab[sbn] |= SETMASK;
|
||||
}
|
||||
|
||||
/*--
|
||||
Step 3:
|
||||
The [ss] big bucket is now done. Record this fact,
|
||||
and update the quadrant descriptors. Remember to
|
||||
update quadrants in the overshoot area too, if
|
||||
necessary. The "if (i < 255)" test merely skips
|
||||
this updating for the last bucket processed, since
|
||||
updating for the last bucket is pointless.
|
||||
|
||||
The quadrant array provides a way to incrementally
|
||||
cache sort orderings, as they appear, so as to
|
||||
make subsequent comparisons in fullGtU() complete
|
||||
faster. For repetitive blocks this makes a big
|
||||
difference (but not big enough to be able to avoid
|
||||
randomisation for very repetitive data.)
|
||||
|
||||
The precise meaning is: at all times:
|
||||
|
||||
for 0 <= i < nblock and 0 <= j <= nblock
|
||||
|
||||
if block[i] != block[j],
|
||||
|
||||
then the relative values of quadrant[i] and
|
||||
quadrant[j] are meaningless.
|
||||
|
||||
else {
|
||||
if quadrant[i] < quadrant[j]
|
||||
then the string starting at i lexicographically
|
||||
precedes the string starting at j
|
||||
|
||||
else if quadrant[i] > quadrant[j]
|
||||
then the string starting at j lexicographically
|
||||
precedes the string starting at i
|
||||
|
||||
else
|
||||
the relative ordering of the strings starting
|
||||
at i and j has not yet been determined.
|
||||
}
|
||||
--*/
|
||||
bigDone[ss] = True;
|
||||
|
||||
if (i < 255) {
|
||||
Int32 bbStart = ftab[ss << 8] & CLEARMASK;
|
||||
Int32 bbSize = (ftab[(ss+1) << 8] & CLEARMASK) - bbStart;
|
||||
Int32 shifts = 0;
|
||||
|
||||
while ((bbSize >> shifts) > 65534) shifts++;
|
||||
|
||||
for (j = 0; j < bbSize; j++) {
|
||||
Int32 a2update = zptr[bbStart + j];
|
||||
UInt16 qVal = (UInt16)(j >> shifts);
|
||||
quadrant[a2update] = qVal;
|
||||
if (a2update < BZ_NUM_OVERSHOOT_BYTES)
|
||||
quadrant[a2update + nblock] = qVal;
|
||||
}
|
||||
|
||||
AssertH ( ( ((bbSize-1) >> shifts) <= 65535 ), 1002 );
|
||||
}
|
||||
|
||||
/*--
|
||||
Step 4:
|
||||
Now scan this big bucket [ss] so as to synthesise the
|
||||
sorted order for small buckets [t, ss] for all t != ss.
|
||||
This will avoid doing Real Work in subsequent Step 1's.
|
||||
--*/
|
||||
for (j = 0; j <= 255; j++)
|
||||
copy[j] = ftab[(j << 8) + ss] & CLEARMASK;
|
||||
|
||||
for (j = ftab[ss << 8] & CLEARMASK;
|
||||
j < (ftab[(ss+1) << 8] & CLEARMASK);
|
||||
j++) {
|
||||
k = zptr[j]-1; if (k < 0) k += nblock;
|
||||
c1 = block[k];
|
||||
if ( ! bigDone[c1] ) {
|
||||
zptr[copy[c1]] = k;
|
||||
copy[c1] ++;
|
||||
}
|
||||
}
|
||||
|
||||
for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
|
||||
}
|
||||
if (s->verbosity >= 4)
|
||||
VPrintf3( " %d pointers, %d sorted, %d scanned\n",
|
||||
nblock, numQSorted, nblock - numQSorted );
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------*/
|
||||
static void randomiseBlock ( EState* s )
|
||||
{
|
||||
Int32 i;
|
||||
BZ_RAND_INIT_MASK;
|
||||
for (i = 0; i < 256; i++) s->inUse[i] = False;
|
||||
|
||||
for (i = 0; i < s->nblock; i++) {
|
||||
BZ_RAND_UPD_MASK;
|
||||
s->block[i] ^= BZ_RAND_MASK;
|
||||
s->inUse[s->block[i]] = True;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------*/
|
||||
void blockSort ( EState* s )
|
||||
{
|
||||
Int32 i;
|
||||
|
||||
s->workLimit = s->workFactor * (s->nblock - 1);
|
||||
s->workDone = 0;
|
||||
s->blockRandomised = False;
|
||||
s->firstAttempt = True;
|
||||
|
||||
sortMain ( s );
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf3( " %d work, %d block, ratio %5.2f\n",
|
||||
s->workDone, s->nblock-1,
|
||||
(float)(s->workDone) / (float)(s->nblock-1) );
|
||||
|
||||
if (s->workDone > s->workLimit && s->firstAttempt) {
|
||||
if (s->verbosity >= 2)
|
||||
VPrintf0( " sorting aborted; randomising block\n" );
|
||||
randomiseBlock ( s );
|
||||
s->workLimit = s->workDone = 0;
|
||||
s->blockRandomised = True;
|
||||
s->firstAttempt = False;
|
||||
sortMain ( s );
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf3( " %d work, %d block, ratio %f\n",
|
||||
s->workDone, s->nblock-1,
|
||||
(float)(s->workDone) / (float)(s->nblock-1) );
|
||||
}
|
||||
|
||||
s->origPtr = -1;
|
||||
for (i = 0; i < s->nblock; i++)
|
||||
if (s->zptr[i] == 0)
|
||||
{ s->origPtr = i; break; };
|
||||
|
||||
AssertH( s->origPtr != -1, 1003 );
|
||||
}
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end blocksort.c ---*/
|
||||
/*-------------------------------------------------------------*/
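As a reading aid for blocksort.c, here is a much-simplified sketch of what the block sort computes: the sorted order of all rotations of the block (`zptr`) and the position of the unrotated block in that order (`origPtr`). It assumes a tiny block and deliberately omits the bucket sort, the quadrant cache, the overshoot area and the work limit. It borrows only the cyclic comparison idea from fullGtU() and the Knuth increments from simpleSort(). The names `rot_gt`, `sort_rotations` and `block_sort_sketch` are hypothetical and do not appear in bzip2.

```c
#include <assert.h>

/* Cyclic comparison: nonzero if the rotation of block[0..n-1] starting
   at i1 is strictly greater than the rotation starting at i2. */
static int rot_gt(const unsigned char *block, int n, int i1, int i2)
{
    int k;
    for (k = 0; k < n; k++) {
        unsigned char c1 = block[(i1 + k) % n];
        unsigned char c2 = block[(i2 + k) % n];
        if (c1 != c2) return c1 > c2;
    }
    return 0;  /* the two rotations are identical */
}

/* Shell sort of the rotation start indices, using Knuth's increments
   h = 1, 4, 13, 40, ... (each one is 3*h + 1 of the previous). */
static void sort_rotations(const unsigned char *block, int n, int *zptr)
{
    int h = 1, i, j, v;
    for (i = 0; i < n; i++) zptr[i] = i;
    while (3 * h + 1 < n) h = 3 * h + 1;   /* largest increment used */
    for (; h >= 1; h /= 3) {               /* e.g. 13 -> 4 -> 1 -> stop */
        for (i = h; i < n; i++) {
            v = zptr[i];
            j = i;
            while (j >= h && rot_gt(block, n, zptr[j - h], v)) {
                zptr[j] = zptr[j - h];
                j -= h;
            }
            zptr[j] = v;
        }
    }
}

/* Sorts the rotations, writes the last byte of each sorted rotation to
   last[0..n-1], and returns the index at which rotation 0 ended up. */
static int block_sort_sketch(const unsigned char *block, int n,
                             int *zptr, unsigned char *last)
{
    int i, origPtr = -1;
    assert(n > 0);
    sort_rotations(block, n, zptr);
    for (i = 0; i < n; i++) {
        last[i] = block[(zptr[i] + n - 1) % n];
        if (zptr[i] == 0) origPtr = i;
    }
    assert(origPtr != -1);
    return origPtr;
}
```

For the 6-byte block "banana", this gives zptr = {5, 3, 1, 0, 4, 2}, last bytes "nnbaaa" and origPtr = 3. The real code reaches the same kind of result far faster by bucketing on the first two bytes and sorting only the small buckets it cannot synthesise.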
|
191
bzip2.1
@ -1,21 +1,29 @@
|
||||
.PU
|
||||
.TH bzip2 1
|
||||
.SH NAME
|
||||
bzip2, bunzip2 \- a block-sorting file compressor, v0.1
|
||||
bzip2, bunzip2 \- a block-sorting file compressor, v0.9.0
|
||||
.br
|
||||
bzcat \- decompresses files to stdout
|
||||
.br
|
||||
bzip2recover \- recovers data from damaged bzip2 files
|
||||
|
||||
.SH SYNOPSIS
|
||||
.ll +8
|
||||
.B bzip2
|
||||
.RB [ " \-cdfkstvVL123456789 " ]
|
||||
.RB [ " \-cdfkstvzVL123456789 " ]
|
||||
[
|
||||
.I "filenames \&..."
|
||||
]
|
||||
.ll -8
|
||||
.br
|
||||
.B bunzip2
|
||||
.RB [ " \-kvsVL " ]
|
||||
.RB [ " \-fkvsVL " ]
|
||||
[
|
||||
.I "filenames \&..."
|
||||
]
|
||||
.br
|
||||
.B bzcat
|
||||
.RB [ " \-s " ]
|
||||
[
|
||||
.I "filenames \&..."
|
||||
]
|
||||
@ -24,7 +32,7 @@ bzip2recover \- recovers data from damaged bzip2 files
|
||||
.I "filename"
|
||||
|
||||
.SH DESCRIPTION
|
||||
.I Bzip2
|
||||
.I bzip2
|
||||
compresses files using the Burrows-Wheeler block-sorting
|
||||
text compression algorithm, and Huffman coding.
|
||||
Compression is generally considerably
|
||||
@ -38,7 +46,7 @@ those of
|
||||
.I GNU Gzip,
|
||||
but they are not identical.
|
||||
|
||||
.I Bzip2
|
||||
.I bzip2
|
||||
expects a list of file names to accompany the command-line flags.
|
||||
Each file is replaced by a compressed version of itself,
|
||||
with the name "original_name.bz2".
|
||||
@ -50,11 +58,11 @@ original file names, permissions and dates in filesystems
|
||||
which lack these concepts, or have serious file name length
|
||||
restrictions, such as MS-DOS.
|
||||
|
||||
.I Bzip2
|
||||
.I bzip2
|
||||
and
|
||||
.I bunzip2
|
||||
will not overwrite existing files; if you want this to happen,
|
||||
you should delete them first.
|
||||
will by default not overwrite existing files;
|
||||
if you want this to happen, specify the \-f flag.
|
||||
|
||||
If no file names are specified,
|
||||
.I bzip2
|
||||
@ -64,7 +72,7 @@ In this case,
|
||||
will decline to write compressed output to a terminal, as
|
||||
this would be entirely incomprehensible and therefore pointless.
|
||||
|
||||
.I Bunzip2
|
||||
.I bunzip2
|
||||
(or
|
||||
.I bzip2 \-d
|
||||
) decompresses and restores all specified files whose names
|
||||
@ -73,12 +81,28 @@ Files without this suffix are ignored.
|
||||
Again, supplying no filenames
|
||||
causes decompression from standard input to standard output.
|
||||
|
||||
.I bunzip2
|
||||
will correctly decompress a file which is the concatenation
|
||||
of two or more compressed files. The result is the concatenation
|
||||
of the corresponding uncompressed files. Integrity testing
|
||||
(\-t) of concatenated compressed files is also supported.
|
||||
|
||||
You can also compress or decompress files to
|
||||
the standard output by giving the \-c flag.
|
||||
You can decompress multiple files like this, but you may
|
||||
only compress a single file this way, since it would otherwise
|
||||
be difficult to separate out the compressed representations of
|
||||
the original files.
|
||||
Multiple files may be compressed and decompressed like this.
|
||||
The resulting outputs are fed sequentially to stdout.
|
||||
Compression of multiple files in this manner generates
|
||||
a stream containing multiple compressed file representations.
|
||||
Such a stream can be decompressed correctly only by
|
||||
.I bzip2
|
||||
version 0.9.0 or later. Earlier versions of
|
||||
.I bzip2
|
||||
will stop after decompressing the first file in the stream.
|
||||
|
||||
.I bzcat
|
||||
(or
|
||||
.I bzip2 \-dc
|
||||
) decompresses all specified files to the standard output.
|
||||
|
||||
Compression is always performed, even if the compressed file is
|
||||
slightly larger than the original. Files of less than about
|
||||
@ -132,7 +156,7 @@ Compression and decompression requirements, in bytes, can be estimated as:
|
||||
|
||||
Compression: 400k + ( 7 x block size )
|
||||
|
||||
Decompression: 100k + ( 5 x block size ), or
|
||||
Decompression: 100k + ( 4 x block size ), or
|
||||
.br
|
||||
100k + ( 2.5 x block size )
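As a rough sketch, the estimates can be written as two small functions. The function names are hypothetical, "k" is taken to mean 1000 bytes, and the decompression factor of 4 is the figure given for version 0.9.0 in this change (the earlier text used 5).

```python
def compress_memory(block_size: int) -> int:
    """Estimated bytes needed to compress: 400k + 7 x block size."""
    return 400_000 + 7 * block_size


def decompress_memory(block_size: int, small: bool = False) -> float:
    """Estimated bytes needed to decompress.

    Normal mode uses 4 bytes per block byte; the low-memory (-s) mode
    uses 2.5 bytes per block byte. Both add a fixed 100k.
    """
    factor = 2.5 if small else 4
    return 100_000 + factor * block_size


# Estimates for the default 900k block size.
default_block = 900_000
estimate_compress = compress_memory(default_block)
estimate_decompress = decompress_memory(default_block)
estimate_small = decompress_memory(default_block, small=True)
```

For a 900k block these come to 6700k, 3700k and 2350k respectively.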
|
||||
|
||||
@ -147,7 +171,7 @@ choice of block size.
|
||||
|
||||
For files compressed with the default 900k block size,
|
||||
.I bunzip2
|
||||
will require about 4600 kbytes to decompress.
|
||||
will require about 3700 kbytes to decompress.
|
||||
To support decompression of any file on a 4 megabyte machine,
|
||||
.I bunzip2
|
||||
has an option to decompress using approximately half this
|
||||
@ -168,8 +192,8 @@ For example, compressing a file 20,000 bytes long with the flag
|
||||
\-9
|
||||
will cause the compressor to allocate around
|
||||
6700k of memory, but only touch 400k + 20000 * 7 = 540
|
||||
kbytes of it. Similarly, the decompressor will allocate 4600k but
|
||||
only touch 100k + 20000 * 5 = 200 kbytes.
|
||||
kbytes of it. Similarly, the decompressor will allocate 3700k but
|
||||
only touch 100k + 20000 * 4 = 180 kbytes.
|
||||
|
||||
Here is a table which summarises the maximum memory usage for
|
||||
different block sizes. Also recorded is the total compressed
|
||||
@ -182,71 +206,73 @@ Corpus is dominated by smaller files.
|
||||
Compress Decompress Decompress Corpus
|
||||
Flag usage usage -s usage Size
|
||||
|
||||
-1 1100k 600k 350k 914704
|
||||
-2 1800k 1100k 600k 877703
|
||||
-3 2500k 1600k 850k 860338
|
||||
-4 3200k 2100k 1100k 846899
|
||||
-5 3900k 2600k 1350k 845160
|
||||
-6 4600k 3100k 1600k 838626
|
||||
-7 5400k 3600k 1850k 834096
|
||||
-8 6000k 4100k 2100k 828642
|
||||
-9 6700k 4600k 2350k 828642
|
||||
-1 1100k 500k 350k 914704
|
||||
-2 1800k 900k 600k 877703
|
||||
-3 2500k 1300k 850k 860338
|
||||
-4 3200k 1700k 1100k 846899
|
||||
-5 3900k 2100k 1350k 845160
|
||||
-6 4600k 2500k 1600k 838626
|
||||
-7 5400k 2900k 1850k 834096
|
||||
-8 6000k 3300k 2100k 828642
|
||||
-9 6700k 3700k 2350k 828642
|
||||
|
||||
.SH OPTIONS
|
||||
.TP
|
||||
.B \-c --stdout
|
||||
.B \-c --stdout
|
||||
Compress or decompress to standard output. \-c will decompress
|
||||
multiple files to stdout, but will only compress a single file to
|
||||
stdout.
|
||||
.TP
|
||||
.B \-d --decompress
|
||||
Force decompression.
|
||||
.I Bzip2
|
||||
and
|
||||
.I bzip2,
|
||||
.I bunzip2
|
||||
are really the same program, and the decision about whether to
|
||||
compress or decompress is done on the basis of which name is
|
||||
and
|
||||
.I bzcat
|
||||
are really the same program, and the decision about what actions
|
||||
to take is done on the basis of which name is
|
||||
used. This flag overrides that mechanism, and forces
|
||||
.I bzip2
|
||||
to decompress.
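The name-based dispatch can be pictured with a hypothetical sketch. `choose_mode` and its return strings are invented for illustration and are not taken from the bzip2 source; the only behaviour assumed is what the text states: the invocation name selects the action, and \-d or \-z overrides it.

```python
import os


def choose_mode(argv0: str, flags=()) -> str:
    """Pick an action from the invocation name, then let flags override it."""
    name = os.path.basename(argv0).lower()

    # Default action implied by the name the program was started under.
    if "bunzip2" in name:
        mode = "decompress"
    elif "bzcat" in name:
        mode = "decompress-to-stdout"
    else:
        mode = "compress"

    # Explicit flags take precedence over the name.
    if "-d" in flags:
        mode = "decompress"
    elif "-z" in flags:
        mode = "compress"
    return mode
```

With this sketch, `choose_mode("/usr/bin/bunzip2")` selects decompression, while `choose_mode("bunzip2", ["-z"])` is forced back to compression.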
|
||||
.TP
|
||||
.B \-f --compress
|
||||
.B \-z --compress
|
||||
The complement to \-d: forces compression, regardless of the invocation
|
||||
name.
|
||||
.TP
|
||||
.B \-t --test
|
||||
Check integrity of the specified file(s), but don't decompress them.
|
||||
This really performs a trial decompression and throws away the result,
|
||||
using the low-memory decompression algorithm (see \-s).
|
||||
This really performs a trial decompression and throws away the result.
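A trial decompression of this kind can be sketched as follows, assuming Python's standard `bz2` module in place of the `bzip2` program. `is_intact` is a hypothetical helper name, and the payload is made up.

```python
import bz2


def is_intact(compressed: bytes) -> bool:
    """Decompress and discard the output; report whether that succeeded."""
    try:
        bz2.decompress(compressed)
    except (OSError, ValueError, EOFError):
        # Invalid data or a stream that ends early counts as a failed test.
        return False
    return True


good = bz2.compress(b"some payload " * 50)

# Dropping the last few bytes removes part of the end-of-stream trailer.
truncated = good[:-5]
```

Here `is_intact(good)` succeeds and `is_intact(truncated)` fails, without either result being written anywhere.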
|
||||
.TP
|
||||
.B \-f --force
|
||||
Force overwrite of output files. Normally,
|
||||
.I bzip2
|
||||
will not overwrite existing output files.
|
||||
.TP
|
||||
.B \-k --keep
|
||||
Keep (don't delete) input files during compression or decompression.
|
||||
.TP
|
||||
.B \-s --small
|
||||
Reduce memory usage, both for compression and decompression.
|
||||
Files are decompressed using a modified algorithm which only
|
||||
Reduce memory usage, for compression, decompression and
|
||||
testing.
|
||||
Files are decompressed and tested using a modified algorithm which only
|
||||
requires 2.5 bytes per block byte. This means any file can be
|
||||
decompressed in 2300k of memory, albeit somewhat more slowly than
|
||||
usual.
|
||||
decompressed in 2300k of memory, albeit at about half the normal
|
||||
speed.
|
||||
|
||||
During compression, -s selects a block size of 200k, which limits
|
||||
memory use to around the same figure, at the expense of your
|
||||
compression ratio. In short, if your machine is low on memory
|
||||
(8 megabytes or less), use -s for everything. See
|
||||
MEMORY MANAGEMENT above.
|
||||
|
||||
.TP
|
||||
.B \-v --verbose
|
||||
Verbose mode -- show the compression ratio for each file processed.
|
||||
Further \-v's increase the verbosity level, spewing out lots of
|
||||
information which is primarily of interest for diagnostic purposes.
|
||||
.TP
|
||||
.B \-L --license
|
||||
.B \-L --license -V --version
|
||||
Display the software version, license terms and conditions.
|
||||
.TP
|
||||
.B \-V --version
|
||||
Same as \-L.
|
||||
.TP
|
||||
.B \-1 to \-9
|
||||
Set the block size to 100 k, 200 k .. 900 k when
|
||||
compressing. Has no effect when decompressing.
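The reason these flags have no effect on decompression is that the block size travels with the compressed data. A minimal sketch, assuming Python's standard `bz2` module, whose `compresslevel` argument (1 to 9) plays the role of \-1 to \-9; the input data is illustrative.

```python
import bz2

data = b"example data " * 1000

# compresslevel 1 and 9 stand in for the -1 and -9 flags.
small_blocks = bz2.compress(data, compresslevel=1)
large_blocks = bz2.compress(data, compresslevel=9)

# The stream header is "BZh" followed by the block-size digit, so a
# decompressor can read the block size from the file itself.
header_small = small_blocks[:4]
header_large = large_blocks[:4]

# No level is supplied when decompressing.
roundtrip = bz2.decompress(small_blocks)
```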
|
||||
@ -329,10 +355,6 @@ to compress the latter.
|
||||
If you do get a file which causes severe slowness in compression,
|
||||
try making the block size as small as possible, with flag \-1.
|
||||
|
||||
Incompressible or virtually-incompressible data may decompress
|
||||
rather more slowly than one would hope. This is due to
|
||||
a naive implementation of the move-to-front coder.
|
||||
|
||||
.I bzip2
|
||||
usually allocates several megabytes of memory to operate in,
|
||||
and then charges all over it in a fairly random fashion. This
|
||||
@ -346,28 +368,19 @@ I imagine
|
||||
.I bzip2
|
||||
will perform best on machines with very large caches.
|
||||
|
||||
Test mode (\-t) uses the low-memory decompression algorithm
|
||||
(\-s). This means test mode does not run as fast as it could;
|
||||
it could run as fast as the normal decompression machinery.
|
||||
This could easily be fixed at the cost of some code bloat.
|
||||
|
||||
.SH CAVEATS
|
||||
I/O error messages are not as helpful as they could be.
|
||||
.I Bzip2
|
||||
tries hard to detect I/O errors and exit cleanly, but the
|
||||
details of what the problem is sometimes seem rather misleading.
|
||||
|
||||
This manual page pertains to version 0.1 of
|
||||
This manual page pertains to version 0.9.0 of
|
||||
.I bzip2.
|
||||
It may well happen that some future version will
|
||||
use a different compressed file format. If you try to
|
||||
decompress, using 0.1, a .bz2 file created with some
|
||||
future version which uses a different compressed file format,
|
||||
0.1 will complain that your file "is not a bzip2 file".
|
||||
If that happens, you should obtain a more recent version
|
||||
of
|
||||
.I bzip2
|
||||
and use that to decompress the file.
|
||||
Compressed data created by this version is entirely forwards and
|
||||
backwards compatible with the previous public release, version 0.1pl2,
|
||||
but with the following exception: 0.9.0 can correctly decompress
|
||||
multiple concatenated compressed files. 0.1pl2 cannot do this; it
|
||||
will stop after decompressing just the first file in the stream.
|
||||
|
||||
Wildcard expansion for Windows 95 and NT
|
||||
is flaky.
|
||||
@ -377,63 +390,25 @@ uses 32-bit integers to represent bit positions in
|
||||
compressed files, so it cannot handle compressed files
|
||||
more than 512 megabytes long. This could easily be fixed.
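The 512 megabyte figure follows from the width of the counter. A short worked calculation, assuming an unsigned 32-bit bit position and a megabyte of 2^20 bytes:

```python
# Number of distinct bit positions an unsigned 32-bit integer can hold.
BITS_ADDRESSABLE = 2 ** 32

# Eight bits per byte gives the largest file whose bits can all be indexed.
BYTES_ADDRESSABLE = BITS_ADDRESSABLE // 8

# Express the limit in megabytes (2**20 bytes each).
MEGABYTES = BYTES_ADDRESSABLE // (1024 * 1024)
```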
|
||||
|
||||
.I bzip2recover
|
||||
sometimes reports a very small, incomplete final block.
|
||||
This is spurious and can be safely ignored.
|
||||
|
||||
.SH RELATIONSHIP TO bzip-0.21
|
||||
This program is a descendant of the
|
||||
.I bzip
|
||||
program, version 0.21, which I released in August 1996.
|
||||
The primary difference of
|
||||
.I bzip2
|
||||
is its avoidance of the possibly patented algorithms
|
||||
which were used in 0.21.
|
||||
.I bzip2
|
||||
also brings various useful refinements (\-s, \-t),
|
||||
uses less memory, decompresses significantly faster, and
|
||||
has support for recovering data from damaged files.
|
||||
|
||||
Because
|
||||
.I bzip2
|
||||
uses Huffman coding to construct the compressed bitstream,
|
||||
rather than the arithmetic coding used in 0.21,
|
||||
the compressed representations generated by the two programs
|
||||
are incompatible, and they will not interoperate. The change
|
||||
in suffix from .bz to .bz2 reflects this. It would have been
|
||||
helpful to at least allow
|
||||
.I bzip2
|
||||
to decompress files created by 0.21, but this would
|
||||
defeat the primary aim of having a patent-free compressor.
|
||||
|
||||
For a more precise statement about patent issues in
|
||||
bzip2, please see the README file in the distribution.
|
||||
|
||||
Huffman coding necessarily involves some coding inefficiency
|
||||
compared to arithmetic coding. This means that
|
||||
.I bzip2
|
||||
compresses about 1% worse than 0.21, an unfortunate but
|
||||
unavoidable fact-of-life. On the other hand, decompression
|
||||
is approximately 50% faster for the same reason, and the
|
||||
change in file format gave an opportunity to add data-recovery
|
||||
features. So it is not all bad.
|
||||
|
||||
.SH AUTHOR
|
||||
Julian Seward, jseward@acm.org.
|
||||
|
||||
http://www.muraroa.demon.co.uk
|
||||
|
||||
The ideas embodied in
|
||||
.I bzip
|
||||
and
|
||||
.I bzip2
|
||||
are due to (at least) the following people:
|
||||
Michael Burrows and David Wheeler (for the block sorting
|
||||
transformation), David Wheeler (again, for the Huffman coder),
|
||||
Peter Fenwick (for the structured coding model in 0.21,
|
||||
Peter Fenwick (for the structured coding model in the original
|
||||
.I bzip,
|
||||
and many refinements),
|
||||
and
|
||||
Alistair Moffat, Radford Neal and Ian Witten (for the arithmetic
|
||||
coder in 0.21). I am much indebted for their help, support and advice.
|
||||
See the file ALGORITHMS in the source distribution for pointers to
|
||||
coder in the original
|
||||
.I bzip).
|
||||
I am much indebted for their help, support and advice.
|
||||
See the manual in the source distribution for pointers to
|
||||
sources of documentation.
|
||||
Christian von Roques encouraged me to look for faster
|
||||
sorting algorithms, so as to speed up compression.
|
||||
|
@ -5,18 +5,20 @@ bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
NAME
|
||||
bzip2, bunzip2 - a block-sorting file compressor, v0.1
|
||||
bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
|
||||
bzcat - decompresses files to stdout
|
||||
bzip2recover - recovers data from damaged bzip2 files
|
||||
|
||||
|
||||
SYNOPSIS
|
||||
bzip2 [ -cdfkstvVL123456789 ] [ filenames ... ]
|
||||
bunzip2 [ -kvsVL ] [ filenames ... ]
|
||||
bzip2 [ -cdfkstvzVL123456789 ] [ filenames ... ]
|
||||
bunzip2 [ -fkvsVL ] [ filenames ... ]
|
||||
bzcat [ -s ] [ filenames ... ]
|
||||
bzip2recover filename
|
||||
|
||||
|
||||
DESCRIPTION
|
||||
Bzip2 compresses files using the Burrows-Wheeler block-
|
||||
bzip2 compresses files using the Burrows-Wheeler block-
|
||||
sorting text compression algorithm, and Huffman coding.
|
||||
Compression is generally considerably better than that
|
||||
achieved by more conventional LZ77/LZ78-based compressors,
|
||||
@ -26,7 +28,7 @@ DESCRIPTION
|
||||
The command-line options are deliberately very similar to
|
||||
those of GNU Gzip, but they are not identical.
|
||||
|
||||
Bzip2 expects a list of file names to accompany the
|
||||
bzip2 expects a list of file names to accompany the
|
||||
command-line flags. Each file is replaced by a compressed
|
||||
version of itself, with the name "original_name.bz2".
|
||||
Each compressed file has the same modification date and
|
||||
@ -38,8 +40,8 @@ DESCRIPTION
|
||||
cepts, or have serious file name length restrictions, such
|
||||
as MS-DOS.
|
||||
|
||||
Bzip2 and bunzip2 will not overwrite existing files; if
|
||||
you want this to happen, you should delete them first.
|
||||
bzip2 and bunzip2 will by default not overwrite existing
|
||||
files; if you want this to happen, specify the -f flag.
|
||||
|
||||
If no file names are specified, bzip2 compresses from
|
||||
standard input to standard output. In this case, bzip2
|
||||
@ -47,17 +49,15 @@ DESCRIPTION
|
||||
this would be entirely incomprehensible and therefore
|
||||
pointless.
|
||||
|
||||
Bunzip2 (or bzip2 -d ) decompresses and restores all
|
||||
bunzip2 (or bzip2 -d ) decompresses and restores all
|
||||
specified files whose names end in ".bz2". Files without this
|
||||
suffix are ignored. Again, supplying no filenames causes
|
||||
decompression from standard input to standard output.
|
||||
|
||||
You can also compress or decompress files to the standard
|
||||
output by giving the -c flag. You can decompress multiple
|
||||
files like this, but you may only compress a single file
|
||||
this way, since it would otherwise be difficult to sepa-
|
||||
rate out the compressed representations of the original
|
||||
files.
|
||||
bunzip2 will correctly decompress a file which is the
|
||||
concatenation of two or more compressed files. The result is
|
||||
the concatenation of the corresponding uncompressed files.
|
||||
Integrity testing (-t) of concatenated compressed files is
|
||||
|
||||
|
||||
|
||||
@ -70,6 +70,21 @@ DESCRIPTION
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
also supported.
|
||||
|
||||
You can also compress or decompress files to the standard
|
||||
output by giving the -c flag. Multiple files may be com-
|
||||
pressed and decompressed like this. The resulting outputs
|
||||
are fed sequentially to stdout. Compression of multiple
|
||||
files in this manner generates a stream containing multi-
|
||||
ple compressed file representations. Such a stream can be
|
||||
decompressed correctly only by bzip2 version 0.9.0 or
|
||||
later. Earlier versions of bzip2 will stop after
|
||||
decompressing the first file in the stream.
|
||||
|
||||
bzcat (or bzip2 -dc ) decompresses all specified files to
|
||||
the standard output.
|
||||
|
||||
Compression is always performed, even if the compressed
|
||||
file is slightly larger than the original. Files of less
|
||||
than about one hundred bytes tend to get larger, since the
|
||||
@ -108,22 +123,7 @@ MEMORY MANAGEMENT
|
||||
file, and bunzip2 then allocates itself just enough memory
|
||||
to decompress the file. Since block sizes are stored in
|
||||
compressed files, it follows that the flags -1 to -9 are
|
||||
irrelevant to and so ignored during decompression. Com-
|
||||
pression and decompression requirements, in bytes, can be
|
||||
estimated as:
|
||||
|
||||
Compression: 400k + ( 7 x block size )
|
||||
|
||||
Decompression: 100k + ( 5 x block size ), or
|
||||
100k + ( 2.5 x block size )
|
||||
|
||||
Larger block sizes give rapidly diminishing marginal
|
||||
returns; most of the compression comes from the first two
|
||||
or three hundred k of block size, a fact worth bearing in
|
||||
mind when using bzip2 on small machines. It is also
|
||||
important to appreciate that the decompression memory
|
||||
requirement is set at compression-time by the choice of
|
||||
block size.
|
||||
irrelevant to and so ignored during decompression.
|
||||
|
||||
|
||||
|
||||
@ -136,8 +136,24 @@ MEMORY MANAGEMENT
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
Compression and decompression requirements, in bytes, can
|
||||
be estimated as:
|
||||
|
||||
Compression: 400k + ( 7 x block size )
|
||||
|
||||
Decompression: 100k + ( 4 x block size ), or
|
||||
100k + ( 2.5 x block size )
|
||||
|
||||
Larger block sizes give rapidly diminishing marginal
|
||||
returns; most of the compression comes from the first two
|
||||
or three hundred k of block size, a fact worth bearing in
|
||||
mind when using bzip2 on small machines. It is also
|
||||
important to appreciate that the decompression memory
|
||||
requirement is set at compression-time by the choice of
|
||||
block size.
|
||||
|
||||
For files compressed with the default 900k block size,
|
||||
bunzip2 will require about 4600 kbytes to decompress. To
|
||||
bunzip2 will require about 3700 kbytes to decompress. To
|
||||
support decompression of any file on a 4 megabyte machine,
|
||||
bunzip2 has an option to decompress using approximately
|
||||
half this amount of memory, about 2300 kbytes. Decompres-
|
||||
@ -157,8 +173,8 @@ bzip2(1) bzip2(1)
|
||||
file 20,000 bytes long with the flag -9 will cause the
|
||||
compressor to allocate around 6700k of memory, but only
|
||||
touch 400k + 20000 * 7 = 540 kbytes of it. Similarly, the
|
||||
decompressor will allocate 4600k but only touch 100k +
|
||||
20000 * 5 = 200 kbytes.
|
||||
decompressor will allocate 3700k but only touch 100k +
|
||||
20000 * 4 = 180 kbytes.
|
||||
|
||||
Here is a table which summarises the maximum memory usage
|
||||
for different block sizes. Also recorded is the total
|
||||
@ -172,24 +188,8 @@ bzip2(1) bzip2(1)
|
||||
Compress Decompress Decompress Corpus
|
||||
Flag usage usage -s usage Size
|
||||
|
||||
-1 1100k 600k 350k 914704
|
||||
-2 1800k 1100k 600k 877703
|
||||
-3 2500k 1600k 850k 860338
|
||||
-4 3200k 2100k 1100k 846899
|
||||
-5 3900k 2600k 1350k 845160
|
||||
-6 4600k 3100k 1600k 838626
|
||||
-7 5400k 3600k 1850k 834096
|
||||
-8 6000k 4100k 2100k 828642
|
||||
-9 6700k 4600k 2350k 828642
|
||||
|
||||
|
||||
OPTIONS
|
||||
-c --stdout
|
||||
Compress or decompress to standard output. -c will
|
||||
decompress multiple files to stdout, but will only
|
||||
compress a single file to stdout.
|
||||
|
||||
|
||||
-1 1100k 500k 350k 914704
|
||||
-2 1800k 900k 600k 877703
|
||||
|
||||
|
||||
|
||||
@ -202,34 +202,52 @@ OPTIONS
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
-d --decompress
|
||||
Force decompression. Bzip2 and bunzip2 are really
|
||||
the same program, and the decision about whether to
|
||||
compress or decompress is done on the basis of
|
||||
which name is used. This flag overrides that
|
||||
mechanism, and forces bzip2 to decompress.
|
||||
-3 2500k 1300k 850k 860338
|
||||
-4 3200k 1700k 1100k 846899
|
||||
-5 3900k 2100k 1350k 845160
|
||||
-6 4600k 2500k 1600k 838626
|
||||
-7 5400k 2900k 1850k 834096
|
||||
-8 6000k 3300k 2100k 828642
|
||||
-9 6700k 3700k 2350k 828642
|
||||
|
||||
-f --compress
|
||||
|
||||
OPTIONS
|
||||
-c --stdout
|
||||
Compress or decompress to standard output. -c will
|
||||
decompress multiple files to stdout, but will only
|
||||
compress a single file to stdout.
|
||||
|
||||
-d --decompress
|
||||
Force decompression. bzip2, bunzip2 and bzcat are
|
||||
really the same program, and the decision about
|
||||
what actions to take is done on the basis of which
|
||||
name is used. This flag overrides that mechanism,
|
||||
and forces bzip2 to decompress.
|
||||
|
||||
-z --compress
|
||||
The complement to -d: forces compression,
|
||||
regardless of the invocation name.
|
||||
|
||||
-t --test
|
||||
Check integrity of the specified file(s), but don't
|
||||
decompress them. This really performs a trial
|
||||
decompression and throws away the result, using the
|
||||
low-memory decompression algorithm (see -s).
|
||||
decompression and throws away the result.
|
||||
|
||||
-f --force
|
||||
Force overwrite of output files. Normally, bzip2
|
||||
will not overwrite existing output files.
|
||||
|
||||
-k --keep
|
||||
Keep (don't delete) input files during compression
|
||||
or decompression.
|
||||
|
||||
-s --small
|
||||
Reduce memory usage, both for compression and
|
||||
decompression. Files are decompressed using a mod-
|
||||
ified algorithm which only requires 2.5 bytes per
|
||||
block byte. This means any file can be decom-
|
||||
pressed in 2300k of memory, albeit somewhat more
|
||||
slowly than usual.
|
||||
Reduce memory usage, for compression, decompression
|
||||
and testing. Files are decompressed and tested
|
||||
using a modified algorithm which only requires 2.5
|
||||
bytes per block byte. This means any file can be
|
||||
decompressed in 2300k of memory, albeit at about
|
||||
half the normal speed.
|
||||
|
||||
During compression, -s selects a block size of
|
||||
200k, which limits memory use to around the same
|
||||
@ -239,24 +257,6 @@ bzip2(1) bzip2(1)
|
||||
MEMORY MANAGEMENT above.
|
||||
|
||||
|
||||
-v --verbose
|
||||
Verbose mode -- show the compression ratio for each
|
||||
file processed. Further -v's increase the ver-
|
||||
bosity level, spewing out lots of information which
|
||||
is primarily of interest for diagnostic purposes.
|
||||
|
||||
-L --license
|
||||
Display the software version, license terms and
|
||||
conditions.
|
||||
|
||||
-V --version
|
||||
Same as -L.
|
||||
|
||||
-1 to -9
|
||||
Set the block size to 100 k, 200 k .. 900 k when
|
||||
compressing. Has no effect when decompressing.
|
||||
See MEMORY MANAGEMENT above.
|
||||
|
||||
|
||||
|
||||
4
|
||||
@ -268,6 +268,21 @@ bzip2(1) bzip2(1)
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
-v --verbose
|
||||
Verbose mode -- show the compression ratio for each
|
||||
file processed. Further -v's increase the ver-
|
||||
bosity level, spewing out lots of information which
|
||||
is primarily of interest for diagnostic purposes.
|
||||
|
||||
-L --license -V --version
|
||||
Display the software version, license terms and
|
||||
conditions.
|
||||
|
||||
-1 to -9
|
||||
Set the block size to 100 k, 200 k .. 900 k when
|
||||
compressing. Has no effect when decompressing.
|
||||
See MEMORY MANAGEMENT above.
|
||||
|
||||
--repetitive-fast
|
||||
bzip2 injects some small pseudo-random variations
|
||||
into very repetitive blocks to limit worst-case
|
||||
@ -306,22 +321,7 @@ RECOVERING DATA FROM DAMAGED F
|
||||
bzip2recover takes a single argument, the name of the
|
||||
damaged file, and writes a number of files "rec0001file.bz2",
|
||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
||||
The output filenames are designed so that the use of wild-
|
||||
cards in subsequent processing -- for example, "bzip2 -dc
|
||||
rec*file.bz2 > recovered_data" -- lists the files in the
|
||||
"right" order.
|
||||
|
||||
bzip2recover should be of most use dealing with large .bz2
|
||||
files, as these will contain many blocks. It is clearly
|
||||
futile to use it on damaged single-block files, since a
|
||||
damaged block cannot be recovered. If you wish to min-
|
||||
imise any potential data loss through media or transmis-
|
||||
sion errors, you might consider compressing with a smaller
|
||||
block size.
|
||||
|
||||
|
||||
PERFORMANCE NOTES
|
||||
The sorting phase of compression gathers together similar
|
||||
The output filenames are designed so that the use of
|
||||
|
||||
|
||||
|
||||
@ -334,6 +334,21 @@ PERFORMANCE NOTES
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
wildcards in subsequent processing -- for example, "bzip2
|
||||
-dc rec*file.bz2 > recovered_data" -- lists the files in
|
||||
the "right" order.
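The reason a wildcard lists the recovered blocks in the right order is the fixed-width, zero-padded counter in each name. A small sketch, with `recovered_name` as a hypothetical helper that mimics the "rec0001file.bz2" pattern quoted in the text:

```python
def recovered_name(index: int, original: str = "file.bz2") -> str:
    """Build a name like rec0001file.bz2 with a four-digit, zero-padded index."""
    return f"rec{index:04d}{original}"


# Block numbers deliberately out of order.
names = [recovered_name(i) for i in (10, 2, 1, 100)]

# Plain lexicographic sorting, which is how shells expand rec*file.bz2.
ordered = sorted(names)

# Without padding, the same sort puts block 10 ahead of block 2.
unpadded = sorted(f"rec{i}file.bz2" for i in (10, 2))
```

Because every index has the same width, sorting the names as text agrees with sorting them by block number.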
|
||||
|
||||
bzip2recover should be of most use dealing with large .bz2
|
||||
files, as these will contain many blocks. It is clearly
|
||||
futile to use it on damaged single-block files, since a
|
||||
damaged block cannot be recovered. If you wish to min-
|
||||
imise any potential data loss through media or transmis-
|
||||
sion errors, you might consider compressing with a smaller
|
||||
block size.
|
||||
|
||||
|
||||
PERFORMANCE NOTES
|
||||
The sorting phase of compression gathers together similar
|
||||
strings in the file. Because of this, files containing
|
||||
very long runs of repeated symbols, like "aabaabaabaab
|
||||
..." (repeated several hundred times) may compress
|
||||
@ -348,10 +363,6 @@ bzip2(1) bzip2(1)
|
||||
severe slowness in compression, try making the block size
|
||||
as small as possible, with flag -1.
|
||||
|
||||
Incompressible or virtually-incompressible data may decom-
|
||||
press rather more slowly than one would hope. This is due
|
||||
to a naive implementation of the move-to-front coder.
|
||||
|
||||
bzip2 usually allocates several megabytes of memory to
|
||||
operate in, and then charges all over it in a fairly ran-
|
||||
dom fashion. This means that performance, both for com-
|
||||
@ -362,12 +373,6 @@ bzip2(1) bzip2(1)
|
||||
large performance improvements. I imagine bzip2 will
|
||||
perform best on machines with very large caches.
|
||||
|
||||
Test mode (-t) uses the low-memory decompression algorithm
|
||||
(-s). This means test mode does not run as fast as it
|
||||
could; it could run as fast as the normal decompression
|
||||
machinery. This could easily be fixed at the cost of some
|
||||
code bloat.
|
||||
|
||||
|
||||
CAVEATS
|
||||
I/O error messages are not as helpful as they could be.
|
||||
@ -375,19 +380,14 @@ CAVEATS
|
||||
but the details of what the problem is sometimes seem
|
||||
rather misleading.
|
||||
|
||||
This manual page pertains to version 0.1 of bzip2. It may
|
||||
well happen that some future version will use a different
|
||||
compressed file format. If you try to decompress, using
|
||||
0.1, a .bz2 file created with some future version which
|
||||
uses a different compressed file format, 0.1 will complain
|
||||
that your file "is not a bzip2 file". If that happens,
|
||||
you should obtain a more recent version of bzip2 and use
|
||||
that to decompress the file.
|
||||
This manual page pertains to version 0.9.0 of bzip2.
|
||||
Compressed data created by this version is entirely forwards
|
||||
and backwards compatible with the previous public release,
|
||||
version 0.1pl2, but with the following exception: 0.9.0
|
||||
can correctly decompress multiple concatenated compressed
|
||||
files. 0.1pl2 cannot do this; it will stop after decom-
|
||||
pressing just the first file in the stream.
|
||||
|
||||
Wildcard expansion for Windows 95 and NT is flaky.
|
||||
|
||||
bzip2recover uses 32-bit integers to represent bit
|
||||
positions in compressed files, so it cannot handle compressed
|
||||
|
||||
|
||||
|
||||
@ -400,59 +400,31 @@ CAVEATS
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
files more than 512 megabytes long. This could easily be
|
||||
Wildcard expansion for Windows 95 and NT is flaky.
|
||||
|
||||
bzip2recover uses 32-bit integers to represent bit
|
||||
positions in compressed files, so it cannot handle compressed
|
||||
files more than 512 megabytes long. This could easily be
|
||||
fixed.
|
||||
|
||||
bzip2recover sometimes reports a very small, incomplete
|
||||
final block. This is spurious and can be safely ignored.
|
||||
|
||||
|
||||
RELATIONSHIP TO bzip-0.21
|
||||
This program is a descendant of the bzip program, version
|
||||
0.21, which I released in August 1996. The primary
|
||||
difference of bzip2 is its avoidance of the possibly patented
|
||||
algorithms which were used in 0.21. bzip2 also brings
|
||||
various useful refinements (-s, -t), uses less memory,
|
||||
decompresses significantly faster, and has support for
|
||||
recovering data from damaged files.
|
||||
|
||||
Because bzip2 uses Huffman coding to construct the
|
||||
compressed bitstream, rather than the arithmetic coding used
|
||||
in 0.21, the compressed representations generated by the
|
||||
two programs are incompatible, and they will not interop-
|
||||
erate. The change in suffix from .bz to .bz2 reflects
|
||||
this. It would have been helpful to at least allow bzip2
|
||||
to decompress files created by 0.21, but this would defeat
|
||||
the primary aim of having a patent-free compressor.
|
||||
|
||||
For a more precise statement about patent issues in bzip2,
|
||||
please see the README file in the distribution.
|
||||
|
||||
Huffman coding necessarily involves some coding ineffi-
|
||||
ciency compared to arithmetic coding. This means that
|
||||
bzip2 compresses about 1% worse than 0.21, an unfortunate
|
||||
but unavoidable fact-of-life. On the other hand, decom-
|
||||
pression is approximately 50% faster for the same reason,
|
||||
and the change in file format gave an opportunity to add
|
||||
data-recovery features. So it is not all bad.
|
||||
|
||||
|
||||
AUTHOR
|
||||
Julian Seward, jseward@acm.org.
|
||||
http://www.muraroa.demon.co.uk
|
||||
|
||||
The ideas embodied in bzip and bzip2 are due to (at least)
|
||||
the following people: Michael Burrows and David Wheeler
|
||||
(for the block sorting transformation), David Wheeler
|
||||
(again, for the Huffman coder), Peter Fenwick (for the
|
||||
structured coding model in 0.21, and many refinements),
|
||||
and Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||
arithmetic coder in 0.21). I am much indebted for their
|
||||
help, support and advice. See the file ALGORITHMS in the
|
||||
source distribution for pointers to sources of documenta-
|
||||
tion. Christian von Roques encouraged me to look for
|
||||
faster sorting algorithms, so as to speed up compression.
|
||||
Bela Lubkin encouraged me to improve the worst-case com-
|
||||
pression performance. Many people sent patches, helped
|
||||
The ideas embodied in bzip2 are due to (at least) the
|
||||
following people: Michael Burrows and David Wheeler (for the
|
||||
block sorting transformation), David Wheeler (again, for
|
||||
the Huffman coder), Peter Fenwick (for the structured
|
||||
coding model in the original bzip, and many refinements), and
|
||||
Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||
arithmetic coder in the original bzip). I am much
|
||||
indebted for their help, support and advice. See the man-
|
||||
ual in the source distribution for pointers to sources of
|
||||
documentation. Christian von Roques encouraged me to look
|
||||
for faster sorting algorithms, so as to speed up compres-
|
||||
sion. Bela Lubkin encouraged me to improve the worst-case
|
||||
compression performance. Many people sent patches, helped
|
||||
with portability problems, lent machines, gave advice and
|
||||
were generally helpful.
|
||||
|
||||
@ -460,6 +432,32 @@ AUTHOR
7
|
||||
|
||||
|
||||
|
288
bzip2.txt
@ -1,22 +1,22 @@
|
||||
|
||||
|
||||
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
NAME
|
||||
bzip2, bunzip2 - a block-sorting file compressor, v0.1
|
||||
bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
|
||||
bzcat - decompresses files to stdout
|
||||
bzip2recover - recovers data from damaged bzip2 files
|
||||
|
||||
|
||||
SYNOPSIS
|
||||
bzip2 [ -cdfkstvVL123456789 ] [ filenames ... ]
|
||||
bunzip2 [ -kvsVL ] [ filenames ... ]
|
||||
bzip2 [ -cdfkstvzVL123456789 ] [ filenames ... ]
|
||||
bunzip2 [ -fkvsVL ] [ filenames ... ]
|
||||
bzcat [ -s ] [ filenames ... ]
|
||||
bzip2recover filename
|
||||
|
||||
|
||||
DESCRIPTION
|
||||
Bzip2 compresses files using the Burrows-Wheeler block-
|
||||
bzip2 compresses files using the Burrows-Wheeler block-
|
||||
sorting text compression algorithm, and Huffman coding.
|
||||
Compression is generally considerably better than that
|
||||
achieved by more conventional LZ77/LZ78-based compressors,
|
||||
@ -26,7 +26,7 @@ DESCRIPTION
|
||||
The command-line options are deliberately very similar to
|
||||
those of GNU Gzip, but they are not identical.
|
||||
|
||||
Bzip2 expects a list of file names to accompany the com-
|
||||
bzip2 expects a list of file names to accompany the com-
|
||||
mand-line flags. Each file is replaced by a compressed
|
||||
version of itself, with the name "original_name.bz2".
|
||||
Each compressed file has the same modification date and
|
||||
@ -38,8 +38,8 @@ DESCRIPTION
|
||||
cepts, or have serious file name length restrictions, such
|
||||
as MS-DOS.
|
||||
|
||||
Bzip2 and bunzip2 will not overwrite existing files; if
|
||||
you want this to happen, you should delete them first.
|
||||
bzip2 and bunzip2 will by default not overwrite existing
|
||||
files; if you want this to happen, specify the -f flag.
|
||||
|
||||
If no file names are specified, bzip2 compresses from
|
||||
standard input to standard output. In this case, bzip2
|
||||
@ -47,28 +47,29 @@ DESCRIPTION
|
||||
this would be entirely incomprehensible and therefore
|
||||
pointless.
|
||||
|
||||
Bunzip2 (or bzip2 -d ) decompresses and restores all spec-
|
||||
bunzip2 (or bzip2 -d ) decompresses and restores all spec-
|
||||
ified files whose names end in ".bz2". Files without this
|
||||
suffix are ignored. Again, supplying no filenames causes
|
||||
decompression from standard input to standard output.
|
||||
|
||||
bunzip2 will correctly decompress a file which is the con-
|
||||
catenation of two or more compressed files. The result is
|
||||
the concatenation of the corresponding uncompressed files.
|
||||
Integrity testing (-t) of concatenated compressed files is
|
||||
also supported.
|
||||
|
||||
You can also compress or decompress files to the standard
|
||||
output by giving the -c flag. You can decompress multiple
|
||||
files like this, but you may only compress a single file
|
||||
this way, since it would otherwise be difficult to sepa-
|
||||
rate out the compressed representations of the original
|
||||
files.
|
||||
output by giving the -c flag. Multiple files may be com-
|
||||
pressed and decompressed like this. The resulting outputs
|
||||
are fed sequentially to stdout. Compression of multiple
|
||||
files in this manner generates a stream containing multi-
|
||||
ple compressed file representations. Such a stream can be
|
||||
decompressed correctly only by bzip2 version 0.9.0 or
|
||||
later. Earlier versions of bzip2 will stop after decom-
|
||||
pressing the first file in the stream.
|
||||
|
||||
bzcat (or bzip2 -dc ) decompresses all specified files to
|
||||
the standard output.
|
||||
|
||||
Compression is always performed, even if the compressed
|
||||
file is slightly larger than the original. Files of less
|
||||
@ -108,13 +109,14 @@ MEMORY MANAGEMENT
|
||||
file, and bunzip2 then allocates itself just enough memory
|
||||
to decompress the file. Since block sizes are stored in
|
||||
compressed files, it follows that the flags -1 to -9 are
|
||||
irrelevant to and so ignored during decompression. Com-
|
||||
pression and decompression requirements, in bytes, can be
|
||||
estimated as:
|
||||
irrelevant to and so ignored during decompression.
|
||||
|
||||
Compression and decompression requirements, in bytes, can
|
||||
be estimated as:
|
||||
|
||||
Compression: 400k + ( 7 x block size )
|
||||
|
||||
Decompression: 100k + ( 5 x block size ), or
|
||||
Decompression: 100k + ( 4 x block size ), or
|
||||
100k + ( 2.5 x block size )
|
||||
|
||||
Larger block sizes give rapidly diminishing marginal
|
||||
@ -125,19 +127,8 @@ MEMORY MANAGEMENT
|
||||
requirement is set at compression-time by the choice of
|
||||
block size.
|
||||
|
||||
For files compressed with the default 900k block size,
|
||||
bunzip2 will require about 4600 kbytes to decompress. To
|
||||
bunzip2 will require about 3700 kbytes to decompress. To
|
||||
support decompression of any file on a 4 megabyte machine,
|
||||
bunzip2 has an option to decompress using approximately
|
||||
half this amount of memory, about 2300 kbytes. Decompres-
|
||||
@ -157,8 +148,8 @@ bzip2(1) bzip2(1)
|
||||
file 20,000 bytes long with the flag -9 will cause the
|
||||
compressor to allocate around 6700k of memory, but only
|
||||
touch 400k + 20000 * 7 = 540 kbytes of it. Similarly, the
|
||||
decompressor will allocate 4600k but only touch 100k +
|
||||
20000 * 5 = 200 kbytes.
|
||||
decompressor will allocate 3700k but only touch 100k +
|
||||
20000 * 4 = 180 kbytes.
|
||||
|
||||
Here is a table which summarises the maximum memory usage
|
||||
for different block sizes. Also recorded is the total
|
||||
@ -172,15 +163,15 @@ bzip2(1) bzip2(1)
|
||||
Compress Decompress Decompress Corpus
|
||||
Flag usage usage -s usage Size
|
||||
|
||||
-1 1100k 600k 350k 914704
|
||||
-2 1800k 1100k 600k 877703
|
||||
-3 2500k 1600k 850k 860338
|
||||
-4 3200k 2100k 1100k 846899
|
||||
-5 3900k 2600k 1350k 845160
|
||||
-6 4600k 3100k 1600k 838626
|
||||
-7 5400k 3600k 1850k 834096
|
||||
-8 6000k 4100k 2100k 828642
|
||||
-9 6700k 4600k 2350k 828642
|
||||
-1 1100k 500k 350k 914704
|
||||
-2 1800k 900k 600k 877703
|
||||
-3 2500k 1300k 850k 860338
|
||||
-4 3200k 1700k 1100k 846899
|
||||
-5 3900k 2100k 1350k 845160
|
||||
-6 4600k 2500k 1600k 838626
|
||||
-7 5400k 2900k 1850k 834096
|
||||
-8 6000k 3300k 2100k 828642
|
||||
-9 6700k 3700k 2350k 828642
|
||||
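The table rows can be reproduced from the three formulas given earlier in MEMORY MANAGEMENT. The following is a minimal sketch, not code from bzip2: it takes "k" to mean 1000 bytes and the block size to be 100k times the flag digit, which is how the man page's own worked example (400k + 20000 * 7 = 540 kbytes) reads. The names MemEstimate, estimate_memory and print_memory_table are made up for illustration. The results agree with the 0.9.0 rows except for the -7 compression figure, where the formula gives 5300k and the table says 5400k.

```c
#include <stdio.h>
#include <stddef.h>

/* Estimated memory use in units of k (assumed: 1k = 1000 bytes). */
typedef struct {
    unsigned compress_k;          /* 400k + 7   x block size */
    unsigned decompress_k;        /* 100k + 4   x block size */
    unsigned decompress_small_k;  /* 100k + 2.5 x block size (-s) */
} MemEstimate;

/* level is the digit of the -1 .. -9 flag.
   Returns 0 on success, -1 if level is out of range or out is NULL. */
int estimate_memory(int level, MemEstimate *out)
{
    unsigned block_k;

    if (out == NULL || level < 1 || level > 9) return -1;

    block_k = (unsigned)level * 100u;       /* block size in k */
    out->compress_k         = 400u + 7u * block_k;
    out->decompress_k       = 100u + 4u * block_k;
    out->decompress_small_k = 100u + (5u * block_k) / 2u;
    return 0;
}

/* Print one row per flag: compress, decompress, decompress -s. */
void print_memory_table(FILE *fp)
{
    int level;
    MemEstimate m;

    for (level = 1; level <= 9; level++) {
        if (estimate_memory(level, &m) == 0) {
            fprintf(fp, "-%d  %5uk  %5uk  %5uk\n", level,
                    m.compress_k, m.decompress_k, m.decompress_small_k);
        }
    }
}
```

Calling print_memory_table(stdout) from a small main() prints the nine rows in the same column order as the table, without the corpus sizes, which are measurements and cannot be derived.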
|
||||
|
||||
OPTIONS
|
||||
@ -189,47 +180,37 @@ OPTIONS
|
||||
decompress multiple files to stdout, but will only
|
||||
compress a single file to stdout.
|
||||
|
||||
-d --decompress
|
||||
Force decompression. Bzip2 and bunzip2 are really
|
||||
the same program, and the decision about whether to
|
||||
compress or decompress is done on the basis of
|
||||
which name is used. This flag overrides that mech-
|
||||
anism, and forces bzip2 to decompress.
|
||||
Force decompression. bzip2, bunzip2 and bzcat are
|
||||
really the same program, and the decision about
|
||||
what actions to take is done on the basis of which
|
||||
name is used. This flag overrides that mechanism,
|
||||
and forces bzip2 to decompress.
|
||||
|
||||
-f --compress
|
||||
-z --compress
|
||||
The complement to -d: forces compression, regardless
|
||||
of the invocation name.
|
||||
|
||||
-t --test
|
||||
Check integrity of the specified file(s), but don't
|
||||
decompress them. This really performs a trial
|
||||
decompression and throws away the result, using the
|
||||
low-memory decompression algorithm (see -s).
|
||||
decompression and throws away the result.
|
||||
|
||||
-f --force
|
||||
Force overwrite of output files. Normally, bzip2
|
||||
will not overwrite existing output files.
|
||||
|
||||
-k --keep
|
||||
Keep (don't delete) input files during compression
|
||||
or decompression.
|
||||
|
||||
-s --small
|
||||
Reduce memory usage, both for compression and
|
||||
decompression. Files are decompressed using a mod-
|
||||
ified algorithm which only requires 2.5 bytes per
|
||||
block byte. This means any file can be decom-
|
||||
pressed in 2300k of memory, albeit somewhat more
|
||||
slowly than usual.
|
||||
Reduce memory usage, for compression, decompression
|
||||
and testing. Files are decompressed and tested
|
||||
using a modified algorithm which only requires 2.5
|
||||
bytes per block byte. This means any file can be
|
||||
decompressed in 2300k of memory, albeit at about
|
||||
half the normal speed.
|
||||
|
||||
During compression, -s selects a block size of
|
||||
200k, which limits memory use to around the same
|
||||
@ -238,36 +219,21 @@ bzip2(1) bzip2(1)
|
||||
megabytes or less), use -s for everything. See
|
||||
MEMORY MANAGEMENT above.
|
||||
|
||||
|
||||
-v --verbose
|
||||
Verbose mode -- show the compression ratio for each
|
||||
file processed. Further -v's increase the ver-
|
||||
bosity level, spewing out lots of information which
|
||||
is primarily of interest for diagnostic purposes.
|
||||
|
||||
-L --license
|
||||
-L --license -V --version
|
||||
Display the software version, license terms and
|
||||
conditions.
|
||||
|
||||
-V --version
|
||||
Same as -L.
|
||||
|
||||
-1 to -9
|
||||
Set the block size to 100 k, 200 k .. 900 k when
|
||||
compressing. Has no effect when decompressing.
|
||||
See MEMORY MANAGEMENT above.
|
||||
|
||||
--repetitive-fast
|
||||
bzip2 injects some small pseudo-random variations
|
||||
into very repetitive blocks to limit worst-case
|
||||
@ -278,7 +244,6 @@ bzip2(1) bzip2(1)
|
||||
would take before resorting to randomisation. This
|
||||
flag makes it give up much sooner.
|
||||
|
||||
|
||||
--repetitive-best
|
||||
Opposite of --repetitive-fast; try a lot harder
|
||||
before resorting to randomisation.
|
||||
@ -306,10 +271,10 @@ RECOVERING DATA FROM DAMAGED FILES
|
||||
bzip2recover takes a single argument, the name of the dam-
|
||||
aged file, and writes a number of files "rec0001file.bz2",
|
||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
||||
The output filenames are designed so that the use of wild-
|
||||
cards in subsequent processing -- for example, "bzip2 -dc
|
||||
rec*file.bz2 > recovered_data" -- lists the files in the
|
||||
"right" order.
|
||||
The output filenames are designed so that the use of
|
||||
wildcards in subsequent processing -- for example, "bzip2
|
||||
-dc rec*file.bz2 > recovered_data" -- lists the files in
|
||||
the "right" order.
|
||||
|
||||
bzip2recover should be of most use dealing with large .bz2
|
||||
files, as these will contain many blocks. It is clearly
|
||||
@ -322,18 +287,6 @@ RECOVERING DATA FROM DAMAGED FILES
|
||||
|
||||
PERFORMANCE NOTES
|
||||
The sorting phase of compression gathers together similar
|
||||
strings in the file. Because of this, files containing
|
||||
very long runs of repeated symbols, like "aabaabaabaab
|
||||
..." (repeated several hundred times) may compress
|
||||
@ -348,10 +301,6 @@ bzip2(1) bzip2(1)
|
||||
severe slowness in compression, try making the block size
|
||||
as small as possible, with flag -1.
|
||||
|
||||
Incompressible or virtually-incompressible data may decom-
|
||||
press rather more slowly than one would hope. This is due
|
||||
to a naive implementation of the move-to-front coder.
|
||||
|
||||
bzip2 usually allocates several megabytes of memory to
|
||||
operate in, and then charges all over it in a fairly ran-
|
||||
dom fashion. This means that performance, both for com-
|
||||
@ -362,12 +311,6 @@ bzip2(1) bzip2(1)
|
||||
large performance improvements. I imagine bzip2 will per-
|
||||
form best on machines with very large caches.
|
||||
|
||||
Test mode (-t) uses the low-memory decompression algorithm
|
||||
(-s). This means test mode does not run as fast as it
|
||||
could; it could run as fast as the normal decompression
|
||||
machinery. This could easily be fixed at the cost of some
|
||||
code bloat.
|
||||
|
||||
|
||||
CAVEATS
|
||||
I/O error messages are not as helpful as they could be.
|
||||
@ -375,91 +318,38 @@ CAVEATS
|
||||
but the details of what the problem is sometimes seem
|
||||
rather misleading.
|
||||
|
||||
This manual page pertains to version 0.1 of bzip2. It may
|
||||
well happen that some future version will use a different
|
||||
compressed file format. If you try to decompress, using
|
||||
0.1, a .bz2 file created with some future version which
|
||||
uses a different compressed file format, 0.1 will complain
|
||||
that your file "is not a bzip2 file". If that happens,
|
||||
you should obtain a more recent version of bzip2 and use
|
||||
that to decompress the file.
|
||||
This manual page pertains to version 0.9.0 of bzip2. Com-
|
||||
pressed data created by this version is entirely forwards
|
||||
and backwards compatible with the previous public release,
|
||||
version 0.1pl2, but with the following exception: 0.9.0
|
||||
can correctly decompress multiple concatenated compressed
|
||||
files. 0.1pl2 cannot do this; it will stop after decom-
|
||||
pressing just the first file in the stream.
|
||||
|
||||
Wildcard expansion for Windows 95 and NT is flaky.
|
||||
|
||||
bzip2recover uses 32-bit integers to represent bit posi-
|
||||
tions in compressed files, so it cannot handle compressed
|
||||
files more than 512 megabytes long. This could easily be
|
||||
bzip2recover uses 32-bit integers to represent bit posi-
|
||||
tions in compressed files, so it cannot handle compressed
|
||||
files more than 512 megabytes long. This could easily be
|
||||
fixed.
|
||||
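The 512 megabyte limit follows from the counter width: a 32-bit unsigned counter can distinguish 2^32 bit positions, and 2^32 bits is 2^29 bytes. A minimal sketch of that arithmetic, assuming 8-bit bytes and a megabyte of 2^20 bytes; max_file_bytes is a hypothetical helper, not something found in bzip2recover.c:

```c
#include <stdint.h>

/* Number of whole bytes covered by the bit positions that an unsigned
   counter of the given width (1..63 bits) can represent.
   Returns 0 for widths outside that range. */
uint64_t max_file_bytes(unsigned counter_bits)
{
    if (counter_bits < 1 || counter_bits > 63) return 0;
    return ((uint64_t)1 << counter_bits) / 8u;   /* bits -> bytes */
}
```

For a width of 32 this returns 536870912, which is 512 x 2^20.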
|
||||
bzip2recover sometimes reports a very small, incomplete
|
||||
final block. This is spurious and can be safely ignored.
|
||||
|
||||
|
||||
RELATIONSHIP TO bzip-0.21
|
||||
This program is a descendant of the bzip program, version
|
||||
0.21, which I released in August 1996. The primary dif-
|
||||
ference of bzip2 is its avoidance of the possibly patented
|
||||
algorithms which were used in 0.21. bzip2 also brings
|
||||
various useful refinements (-s, -t), uses less memory,
|
||||
decompresses significantly faster, and has support for
|
||||
recovering data from damaged files.
|
||||
|
||||
Because bzip2 uses Huffman coding to construct the com-
|
||||
pressed bitstream, rather than the arithmetic coding used
|
||||
in 0.21, the compressed representations generated by the
|
||||
two programs are incompatible, and they will not interop-
|
||||
erate. The change in suffix from .bz to .bz2 reflects
|
||||
this. It would have been helpful to at least allow bzip2
|
||||
to decompress files created by 0.21, but this would defeat
|
||||
the primary aim of having a patent-free compressor.
|
||||
|
||||
For a more precise statement about patent issues in bzip2,
|
||||
please see the README file in the distribution.
|
||||
|
||||
Huffman coding necessarily involves some coding ineffi-
|
||||
ciency compared to arithmetic coding. This means that
|
||||
bzip2 compresses about 1% worse than 0.21, an unfortunate
|
||||
but unavoidable fact-of-life. On the other hand, decom-
|
||||
pression is approximately 50% faster for the same reason,
|
||||
and the change in file format gave an opportunity to add
|
||||
data-recovery features. So it is not all bad.
|
||||
|
||||
|
||||
AUTHOR
|
||||
Julian Seward, jseward@acm.org.
|
||||
http://www.muraroa.demon.co.uk
|
||||
|
||||
The ideas embodied in bzip and bzip2 are due to (at least)
|
||||
the following people: Michael Burrows and David Wheeler
|
||||
(for the block sorting transformation), David Wheeler
|
||||
(again, for the Huffman coder), Peter Fenwick (for the
|
||||
structured coding model in 0.21, and many refinements),
|
||||
and Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||
arithmetic coder in 0.21). I am much indebted for their
|
||||
help, support and advice. See the file ALGORITHMS in the
|
||||
source distribution for pointers to sources of documenta-
|
||||
tion. Christian von Roques encouraged me to look for
|
||||
faster sorting algorithms, so as to speed up compression.
|
||||
Bela Lubkin encouraged me to improve the worst-case com-
|
||||
pression performance. Many people sent patches, helped
|
||||
The ideas embodied in bzip2 are due to (at least) the fol-
|
||||
lowing people: Michael Burrows and David Wheeler (for the
|
||||
block sorting transformation), David Wheeler (again, for
|
||||
the Huffman coder), Peter Fenwick (for the structured cod-
|
||||
ing model in the original bzip, and many refinements), and
|
||||
Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||
arithmetic coder in the original bzip). I am much
|
||||
indebted for their help, support and advice. See the man-
|
||||
ual in the source distribution for pointers to sources of
|
||||
documentation. Christian von Roques encouraged me to look
|
||||
for faster sorting algorithms, so as to speed up compres-
|
||||
sion. Bela Lubkin encouraged me to improve the worst-case
|
||||
compression performance. Many people sent patches, helped
|
||||
with portability problems, lent machines, gave advice and
|
||||
were generally helpful.
|
||||
|
115
bzip2recover.c
@ -7,43 +7,63 @@
|
||||
/*--
|
||||
This program is bzip2recover, a program to attempt data
|
||||
salvage from damaged files created by the accompanying
|
||||
bzip2-0.1 program.
|
||||
bzip2-0.9.0c program.
|
||||
|
||||
Copyright (C) 1996, 1997 by Julian Seward.
|
||||
Guildford, Surrey, UK
|
||||
email: jseward@acm.org
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
The GNU General Public License is contained in the file LICENSE.
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
--*/
|
||||
|
||||
/*--
|
||||
This program is a complete hack and should be rewritten
|
||||
properly. It isn't very complicated.
|
||||
--*/
|
||||
|
||||
#include <stdio.h>
|
||||
#include <errno.h>
|
||||
#include <malloc.h>
|
||||
#include <stdlib.h>
|
||||
#include <strings.h> /*-- or try string.h --*/
|
||||
#include <string.h>
|
||||
|
||||
#define UInt32 unsigned int
|
||||
#define Int32 int
|
||||
#define UChar unsigned char
|
||||
#define Char char
|
||||
#define Bool unsigned char
|
||||
#define True 1
|
||||
#define False 0
|
||||
typedef unsigned int UInt32;
|
||||
typedef int Int32;
|
||||
typedef unsigned char UChar;
|
||||
typedef char Char;
|
||||
typedef unsigned char Bool;
|
||||
#define True ((Bool)1)
|
||||
#define False ((Bool)0)
|
||||
|
||||
|
||||
Char inFileName[2000];
|
||||
@ -191,8 +211,9 @@ void bsClose ( BitStream* bs )
|
||||
if (retVal == EOF) writeError();
|
||||
}
|
||||
retVal = fclose ( bs->handle );
|
||||
if (retVal == EOF)
|
||||
if (retVal == EOF) {
|
||||
if (bs->mode == 'w') writeError(); else readError();
|
||||
}
|
||||
free ( bs );
|
||||
}
|
||||
|
||||
@ -248,13 +269,19 @@ Int32 main ( Int32 argc, Char** argv )
|
||||
UInt32 bitsRead;
|
||||
UInt32 bStart[20000];
|
||||
UInt32 bEnd[20000];
|
||||
|
||||
UInt32 rbStart[20000];
|
||||
UInt32 rbEnd[20000];
|
||||
Int32 rbCtr;
|
||||
|
||||
|
||||
UInt32 buffHi, buffLo, blockCRC;
|
||||
Char* p;
|
||||
|
||||
strcpy ( progName, argv[0] );
|
||||
inFileName[0] = outFileName[0] = 0;
|
||||
|
||||
fprintf ( stderr, "bzip2recover: extracts blocks from damaged .bz2 files.\n" );
|
||||
fprintf ( stderr, "bzip2recover v0.9.0c: extracts blocks from damaged .bz2 files.\n" );
|
||||
|
||||
if (argc != 2) {
|
||||
fprintf ( stderr, "%s: usage is `%s damaged_file_name'.\n",
|
||||
@ -278,6 +305,8 @@ Int32 main ( Int32 argc, Char** argv )
|
||||
currBlock = 0;
|
||||
bStart[currBlock] = 0;
|
||||
|
||||
rbCtr = 0;
|
||||
|
||||
while (True) {
|
||||
b = bsGetBit ( bsIn );
|
||||
bitsRead++;
|
||||
@ -303,19 +332,25 @@ Int32 main ( Int32 argc, Char** argv )
|
||||
if (bitsRead > 49)
|
||||
bEnd[currBlock] = bitsRead-49; else
|
||||
bEnd[currBlock] = 0;
|
||||
if (currBlock > 0)
|
||||
if (currBlock > 0 &&
|
||||
(bEnd[currBlock] - bStart[currBlock]) >= 130) {
|
||||
fprintf ( stderr, " block %d runs from %d to %d\n",
|
||||
currBlock, bStart[currBlock], bEnd[currBlock] );
|
||||
rbCtr+1, bStart[currBlock], bEnd[currBlock] );
|
||||
rbStart[rbCtr] = bStart[currBlock];
|
||||
rbEnd[rbCtr] = bEnd[currBlock];
|
||||
rbCtr++;
|
||||
}
|
||||
currBlock++;
|
||||
|
||||
bStart[currBlock] = bitsRead;
|
||||
}
|
||||
}
|
||||
|
||||
bsClose ( bsIn );
|
||||
|
||||
/*-- identified blocks run from 1 to currBlock inclusive. --*/
|
||||
/*-- identified blocks run from 1 to rbCtr inclusive. --*/
|
||||
|
||||
if (currBlock < 1) {
|
||||
if (rbCtr < 1) {
|
||||
fprintf ( stderr,
|
||||
"%s: sorry, I couldn't find any block boundaries.\n",
|
||||
progName );
|
||||
@ -336,23 +371,23 @@ Int32 main ( Int32 argc, Char** argv )
|
||||
|
||||
bitsRead = 0;
|
||||
outFile = NULL;
|
||||
wrBlock = 1;
|
||||
wrBlock = 0;
|
||||
while (True) {
|
||||
b = bsGetBit(bsIn);
|
||||
if (b == 2) break;
|
||||
buffHi = (buffHi << 1) | (buffLo >> 31);
|
||||
buffLo = (buffLo << 1) | (b & 1);
|
||||
if (bitsRead == 47+bStart[wrBlock])
|
||||
if (bitsRead == 47+rbStart[wrBlock])
|
||||
blockCRC = (buffHi << 16) | (buffLo >> 16);
|
||||
|
||||
if (outFile != NULL && bitsRead >= bStart[wrBlock]
|
||||
&& bitsRead <= bEnd[wrBlock]) {
|
||||
if (outFile != NULL && bitsRead >= rbStart[wrBlock]
|
||||
&& bitsRead <= rbEnd[wrBlock]) {
|
||||
bsPutBit ( bsWr, b );
|
||||
}
|
||||
|
||||
bitsRead++;
|
||||
|
||||
if (bitsRead == bEnd[wrBlock]+1) {
|
||||
if (bitsRead == rbEnd[wrBlock]+1) {
|
||||
if (outFile != NULL) {
|
||||
bsPutUChar ( bsWr, 0x17 ); bsPutUChar ( bsWr, 0x72 );
|
||||
bsPutUChar ( bsWr, 0x45 ); bsPutUChar ( bsWr, 0x38 );
|
||||
@ -360,18 +395,18 @@ Int32 main ( Int32 argc, Char** argv )
|
||||
bsPutUInt32 ( bsWr, blockCRC );
|
||||
bsClose ( bsWr );
|
||||
}
|
||||
if (wrBlock >= currBlock) break;
|
||||
if (wrBlock >= rbCtr) break;
|
||||
wrBlock++;
|
||||
} else
|
||||
if (bitsRead == bStart[wrBlock]) {
|
||||
if (bitsRead == rbStart[wrBlock]) {
|
||||
outFileName[0] = 0;
|
||||
sprintf ( outFileName, "rec%4d", wrBlock );
|
||||
sprintf ( outFileName, "rec%4d", wrBlock+1 );
|
||||
for (p = outFileName; *p != 0; p++) if (*p == ' ') *p = '0';
|
||||
strcat ( outFileName, inFileName );
|
||||
if ( !endsInBz2(outFileName)) strcat ( outFileName, ".bz2" );
|
||||
|
||||
fprintf ( stderr, " writing block %d to `%s' ...\n",
|
||||
wrBlock, outFileName );
|
||||
wrBlock+1, outFileName );
|
||||
|
||||
outFile = fopen ( outFileName, "wb" );
|
||||
if (outFile == NULL) {
|
||||
|
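The first loop in main() finds block boundaries by shifting every input bit through the buffHi/buffLo pair and testing the most recent 48 bits against a header magic. The hunks shown here do not include that comparison, so the sketch below fills it in under a stated assumption: the block-header magic is taken to be the 48-bit value 0x314159265359. The sketch scans an in-memory buffer instead of a BitStream, looks only for block headers (not the end-of-stream marker), and records the bit offset at which each magic begins, whereas bzip2recover stores in bStart the position reached once the magic has been read. find_block_headers and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed block-header magic 0x314159265359, split into the top
   16 bits and the low 32 bits to match a hi/lo shift register. */
#define MAGIC_HI 0x00003141u
#define MAGIC_LO 0x59265359u

/* Scan data bit by bit, most significant bit of each byte first.
   Stores the bit offset of the start of each magic in starts[]
   (at most max_hits entries) and returns the total number found. */
size_t find_block_headers(const unsigned char *data, size_t nbytes,
                          uint32_t *starts, size_t max_hits)
{
    uint32_t hi = 0, lo = 0;     /* 64-bit window, as buffHi/buffLo */
    uint32_t bits_read = 0;
    size_t found = 0;
    size_t i;
    int bit;

    for (i = 0; i < nbytes; i++) {
        for (bit = 7; bit >= 0; bit--) {
            uint32_t b = (uint32_t)(data[i] >> bit) & 1u;

            /* Same shift as in the diff: carry lo's top bit into hi. */
            hi = (hi << 1) | (lo >> 31);
            lo = (lo << 1) | b;
            bits_read++;

            if (bits_read >= 48u &&
                (hi & 0x0000ffffu) == MAGIC_HI && lo == MAGIC_LO) {
                if (found < max_hits) starts[found] = bits_read - 48u;
                found++;
            }
        }
    }
    return found;
}
```

Because the window is compared after every single bit, a header is found at any bit offset, not just on byte boundaries. That is why the recovery tool can work on blocks that do not start at a byte boundary.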
299
bzlib.h
Normal file
@ -0,0 +1,299 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Public header file for the library. ---*/
|
||||
/*--- bzlib.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#ifndef _BZLIB_H
|
||||
#define _BZLIB_H
|
||||
|
||||
#define BZ_RUN 0
|
||||
#define BZ_FLUSH 1
|
||||
#define BZ_FINISH 2
|
||||
|
||||
#define BZ_OK 0
|
||||
#define BZ_RUN_OK 1
|
||||
#define BZ_FLUSH_OK 2
|
||||
#define BZ_FINISH_OK 3
|
||||
#define BZ_STREAM_END 4
|
||||
#define BZ_SEQUENCE_ERROR (-1)
|
||||
#define BZ_PARAM_ERROR (-2)
|
||||
#define BZ_MEM_ERROR (-3)
|
||||
#define BZ_DATA_ERROR (-4)
|
||||
#define BZ_DATA_ERROR_MAGIC (-5)
|
||||
#define BZ_IO_ERROR (-6)
|
||||
#define BZ_UNEXPECTED_EOF (-7)
|
||||
#define BZ_OUTBUFF_FULL (-8)
|
||||
|
||||
typedef
|
||||
struct {
|
||||
char *next_in;
|
||||
unsigned int avail_in;
|
||||
unsigned int total_in;
|
||||
|
||||
char *next_out;
|
||||
unsigned int avail_out;
|
||||
unsigned int total_out;
|
||||
|
||||
void *state;
|
||||
|
||||
void *(*bzalloc)(void *,int,int);
|
||||
void (*bzfree)(void *,void *);
|
||||
void *opaque;
|
||||
}
|
||||
bz_stream;
|
||||
|
||||
|
||||
#ifndef BZ_IMPORT
|
||||
#define BZ_EXPORT
|
||||
#endif
|
||||
|
||||
#ifdef _WIN32
|
||||
# include <stdio.h>
|
||||
# include <windows.h>
|
||||
# ifdef small
|
||||
/* windows.h defines small as char */
|
||||
# undef small
|
||||
# endif
|
||||
# ifdef BZ_EXPORT
|
||||
# define BZ_API(func) WINAPI func
|
||||
# define BZ_EXTERN extern
|
||||
# else
|
||||
/* import windows dll dynamically */
|
||||
# define BZ_API(func) (WINAPI * func)
|
||||
# define BZ_EXTERN
|
||||
# endif
|
||||
#else
|
||||
# define BZ_API(func) func
|
||||
# define BZ_EXTERN extern
|
||||
#endif
|
||||
|
||||
|
||||
/*-- Core (low-level) library functions --*/
|
||||
|
||||
BZ_EXTERN int BZ_API(bzCompressInit) (
|
||||
bz_stream* strm,
|
||||
int blockSize100k,
|
||||
int verbosity,
|
||||
int workFactor
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzCompress) (
|
||||
bz_stream* strm,
|
||||
int action
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzCompressEnd) (
|
||||
bz_stream* strm
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzDecompressInit) (
|
||||
bz_stream *strm,
|
||||
int verbosity,
|
||||
int small
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzDecompress) (
|
||||
bz_stream* strm
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzDecompressEnd) (
|
||||
bz_stream *strm
|
||||
);
|
||||
|
||||
|
||||
|
||||
/*-- High(er) level library functions --*/
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
#define BZ_MAX_UNUSED 5000
|
||||
|
||||
typedef void BZFILE;
|
||||
|
||||
BZ_EXTERN BZFILE* BZ_API(bzReadOpen) (
|
||||
int* bzerror,
|
||||
FILE* f,
|
||||
int verbosity,
|
||||
int small,
|
||||
void* unused,
|
||||
int nUnused
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzReadClose) (
|
||||
int* bzerror,
|
||||
BZFILE* b
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzReadGetUnused) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
void** unused,
|
||||
int* nUnused
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzRead) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN BZFILE* BZ_API(bzWriteOpen) (
|
||||
int* bzerror,
|
||||
FILE* f,
|
||||
int blockSize100k,
|
||||
int verbosity,
|
||||
int workFactor
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzWrite) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzWriteClose) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
int abandon,
|
||||
unsigned int* nbytes_in,
|
||||
unsigned int* nbytes_out
|
||||
);
|
||||
#endif
|
||||
|
||||
|
||||
/*-- Utility functions --*/
|
||||
|
||||
BZ_EXTERN int BZ_API(bzBuffToBuffCompress) (
|
||||
char* dest,
|
||||
unsigned int* destLen,
|
||||
char* source,
|
||||
unsigned int sourceLen,
|
||||
int blockSize100k,
|
||||
int verbosity,
|
||||
int workFactor
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzBuffToBuffDecompress) (
|
||||
char* dest,
|
||||
unsigned int* destLen,
|
||||
char* source,
|
||||
unsigned int sourceLen,
|
||||
int small,
|
||||
int verbosity
|
||||
);
|
||||
|
||||
|
||||
/*--
   Code contributed by Yoshioka Tsuneo
   (QWF00133@niftyserve.or.jp/tsuneo-y@is.aist-nara.ac.jp),
   to support better zlib compatibility.
   This code is not _officially_ part of libbzip2 (yet);
   I haven't tested it, documented it, or considered its
   thread safety.
   If this code breaks, please contact both Yoshioka and me.
--*/
|
||||
|
||||
BZ_EXTERN const char * BZ_API(bzlibVersion) (
|
||||
void
|
||||
);
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
BZ_EXTERN BZFILE * BZ_API(bzopen) (
|
||||
const char *path,
|
||||
const char *mode
|
||||
);
|
||||
|
||||
BZ_EXTERN BZFILE * BZ_API(bzdopen) (
|
||||
int fd,
|
||||
const char *mode
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzread) (
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzwrite) (
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzflush) (
|
||||
BZFILE* b
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzclose) (
|
||||
BZFILE* b
|
||||
);
|
||||
|
||||
BZ_EXTERN const char * BZ_API(bzerror) (
|
||||
BZFILE *b,
|
||||
int *errnum
|
||||
);
|
||||
#endif
|
||||
|
||||
|
||||
#endif
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end bzlib.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
523
bzlib_private.h
Normal file
@ -0,0 +1,523 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Private header file for the library. ---*/
|
||||
/*--- bzlib_private.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#ifndef _BZLIB_PRIVATE_H
|
||||
#define _BZLIB_PRIVATE_H
|
||||
|
||||
#include <stdlib.h>
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
#include <stdio.h>
|
||||
#include <ctype.h>
|
||||
#include <string.h>
|
||||
#endif
|
||||
|
||||
#include "bzlib.h"
|
||||
|
||||
|
||||
|
||||
/*-- General stuff. --*/
|
||||
|
||||
#define BZ_VERSION "0.9.0c"
|
||||
|
||||
typedef char Char;
|
||||
typedef unsigned char Bool;
|
||||
typedef unsigned char UChar;
|
||||
typedef int Int32;
|
||||
typedef unsigned int UInt32;
|
||||
typedef short Int16;
|
||||
typedef unsigned short UInt16;
|
||||
|
||||
#define True ((Bool)1)
|
||||
#define False ((Bool)0)
|
||||
|
||||
#ifndef __GNUC__
|
||||
#define __inline__ /* */
|
||||
#endif
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
extern void bz__AssertH__fail ( int errcode );
|
||||
#define AssertH(cond,errcode) \
|
||||
{ if (!(cond)) bz__AssertH__fail ( errcode ); }
|
||||
#if BZ_DEBUG
|
||||
#define AssertD(cond,msg) \
|
||||
{ if (!(cond)) { \
|
||||
fprintf ( stderr, \
|
||||
"\n\nlibbzip2(debug build): internal error\n\t%s\n", msg );\
|
||||
exit(1); \
|
||||
}}
|
||||
#else
|
||||
#define AssertD(cond,msg) /* */
|
||||
#endif
|
||||
#define VPrintf0(zf) \
|
||||
fprintf(stderr,zf)
|
||||
#define VPrintf1(zf,za1) \
|
||||
fprintf(stderr,zf,za1)
|
||||
#define VPrintf2(zf,za1,za2) \
|
||||
fprintf(stderr,zf,za1,za2)
|
||||
#define VPrintf3(zf,za1,za2,za3) \
|
||||
fprintf(stderr,zf,za1,za2,za3)
|
||||
#define VPrintf4(zf,za1,za2,za3,za4) \
|
||||
fprintf(stderr,zf,za1,za2,za3,za4)
|
||||
#define VPrintf5(zf,za1,za2,za3,za4,za5) \
|
||||
fprintf(stderr,zf,za1,za2,za3,za4,za5)
|
||||
#else
|
||||
extern void bz_internal_error ( int errcode );
|
||||
#define AssertH(cond,errcode) \
|
||||
{ if (!(cond)) bz_internal_error ( errcode ); }
|
||||
#define AssertD(cond,msg) /* */
|
||||
#define VPrintf0(zf) /* */
|
||||
#define VPrintf1(zf,za1) /* */
|
||||
#define VPrintf2(zf,za1,za2) /* */
|
||||
#define VPrintf3(zf,za1,za2,za3) /* */
|
||||
#define VPrintf4(zf,za1,za2,za3,za4) /* */
|
||||
#define VPrintf5(zf,za1,za2,za3,za4,za5) /* */
|
||||
#endif
|
||||
|
||||
|
||||
#define BZALLOC(nnn) (strm->bzalloc)(strm->opaque,(nnn),1)
|
||||
#define BZFREE(ppp) (strm->bzfree)(strm->opaque,(ppp))
|
||||
|
||||
|
||||
/*-- Constants for the back end. --*/
|
||||
|
||||
#define BZ_MAX_ALPHA_SIZE 258
|
||||
#define BZ_MAX_CODE_LEN 23
|
||||
|
||||
#define BZ_RUNA 0
|
||||
#define BZ_RUNB 1
|
||||
|
||||
#define BZ_N_GROUPS 6
|
||||
#define BZ_G_SIZE 50
|
||||
#define BZ_N_ITERS 4
|
||||
|
||||
#define BZ_MAX_SELECTORS (2 + (900000 / BZ_G_SIZE))
|
||||
|
||||
|
||||
|
||||
/*-- Stuff for randomising repetitive blocks. --*/
|
||||
|
||||
extern Int32 rNums[512];
|
||||
|
||||
#define BZ_RAND_DECLS                          \
   Int32 rNToGo;                               \
   Int32 rTPos                                 \

#define BZ_RAND_INIT_MASK                      \
   s->rNToGo = 0;                              \
   s->rTPos = 0                                \

#define BZ_RAND_MASK ((s->rNToGo == 1) ? 1 : 0)

#define BZ_RAND_UPD_MASK                       \
   if (s->rNToGo == 0) {                       \
      s->rNToGo = rNums[s->rTPos];             \
      s->rTPos++;                              \
      if (s->rTPos == 512) s->rTPos = 0;       \
   }                                           \
   s->rNToGo--;
|
||||
|
||||
|
||||
|
||||
/*-- Stuff for doing CRCs. --*/
|
||||
|
||||
extern UInt32 crc32Table[256];
|
||||
|
||||
#define BZ_INITIALISE_CRC(crcVar)              \
{                                              \
   crcVar = 0xffffffffL;                       \
}

#define BZ_FINALISE_CRC(crcVar)                \
{                                              \
   crcVar = ~(crcVar);                         \
}

#define BZ_UPDATE_CRC(crcVar,cha)              \
{                                              \
   crcVar = (crcVar << 8) ^                    \
            crc32Table[(crcVar >> 24) ^        \
                       ((UChar)cha)];          \
}
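
The three CRC macros above (initialise, update per byte, finalise) can be sketched as ordinary functions. This is a minimal illustration, not the library's code: the names `sketch_crc_table`, `sketch_block_crc` and `sketch_combine_crc` are hypothetical, and the table is generated at run time under the assumption that `crc32Table` corresponds to the generator polynomial 0x04c11db7 processed most significant bit first (the table contents are not shown in this header). The last function mirrors the rotate-and-XOR step that `compressBlock` applies to `combinedCRC`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the library's crc32Table, built on first use.
   Assumption: generator polynomial 0x04c11db7, processed MSB-first. */
static uint32_t sketch_crc_table[256];
static int sketch_crc_table_ready = 0;

static void sketch_crc_build_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 0x80000000u) ? ((c << 1) ^ 0x04c11db7u) : (c << 1);
        sketch_crc_table[i] = c;
    }
    sketch_crc_table_ready = 1;
}

/* Same three steps as BZ_INITIALISE_CRC, BZ_UPDATE_CRC and BZ_FINALISE_CRC:
   start from all ones, fold in each byte via the table, then invert. */
uint32_t sketch_block_crc(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    if (!sketch_crc_table_ready) sketch_crc_build_table();
    uint32_t crc = 0xffffffffu;
    for (size_t i = 0; i < len; i++)
        crc = (crc << 8) ^ sketch_crc_table[(crc >> 24) ^ data[i]];
    return ~crc;
}

/* Fold one finalised block CRC into the running stream CRC:
   rotate the running value left by one bit, then XOR in the block CRC. */
uint32_t sketch_combine_crc(uint32_t combined, uint32_t block_crc)
{
    combined = (combined << 1) | (combined >> 31);
    return combined ^ block_crc;
}
```

Building the table lazily keeps the sketch self-contained; the library instead links against a precomputed `crc32Table`.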
|
||||
|
||||
|
||||
|
||||
/*-- States and modes for compression. --*/
|
||||
|
||||
#define BZ_M_IDLE 1
|
||||
#define BZ_M_RUNNING 2
|
||||
#define BZ_M_FLUSHING 3
|
||||
#define BZ_M_FINISHING 4
|
||||
|
||||
#define BZ_S_OUTPUT 1
|
||||
#define BZ_S_INPUT 2
|
||||
|
||||
#define BZ_NUM_OVERSHOOT_BYTES 20
|
||||
|
||||
|
||||
|
||||
/*-- Structure holding all the compression-side stuff. --*/
|
||||
|
||||
typedef
|
||||
struct {
|
||||
/* pointer back to the struct bz_stream */
|
||||
bz_stream* strm;
|
||||
|
||||
/* mode this stream is in, and whether inputting */
|
||||
/* or outputting data */
|
||||
Int32 mode;
|
||||
Int32 state;
|
||||
|
||||
/* remembers avail_in when flush/finish requested */
|
||||
UInt32 avail_in_expect;
|
||||
|
||||
/* for doing the block sorting */
|
||||
UChar* block;
|
||||
UInt16* quadrant;
|
||||
UInt32* zptr;
|
||||
UInt16* szptr;
|
||||
Int32* ftab;
|
||||
Int32 workDone;
|
||||
Int32 workLimit;
|
||||
Int32 workFactor;
|
||||
Bool firstAttempt;
|
||||
Bool blockRandomised;
|
||||
Int32 origPtr;
|
||||
|
||||
/* run-length-encoding of the input */
|
||||
UInt32 state_in_ch;
|
||||
Int32 state_in_len;
|
||||
BZ_RAND_DECLS;
|
||||
|
||||
/* input and output limits and current posns */
|
||||
Int32 nblock;
|
||||
Int32 nblockMAX;
|
||||
Int32 numZ;
|
||||
Int32 state_out_pos;
|
||||
|
||||
/* map of bytes used in block */
|
||||
Int32 nInUse;
|
||||
Bool inUse[256];
|
||||
UChar unseqToSeq[256];
|
||||
|
||||
/* the buffer for bit stream creation */
|
||||
UInt32 bsBuff;
|
||||
Int32 bsLive;
|
||||
|
||||
/* block and combined CRCs */
|
||||
UInt32 blockCRC;
|
||||
UInt32 combinedCRC;
|
||||
|
||||
/* misc administratium */
|
||||
Int32 verbosity;
|
||||
Int32 blockNo;
|
||||
Int32 nBlocksRandomised;
|
||||
Int32 blockSize100k;
|
||||
|
||||
/* stuff for coding the MTF values */
|
||||
Int32 nMTF;
|
||||
Int32 mtfFreq [BZ_MAX_ALPHA_SIZE];
|
||||
UChar selector [BZ_MAX_SELECTORS];
|
||||
UChar selectorMtf[BZ_MAX_SELECTORS];
|
||||
|
||||
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 code [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
|
||||
}
|
||||
EState;
|
||||
|
||||
|
||||
|
||||
/*-- externs for compression. --*/
|
||||
|
||||
extern void
|
||||
blockSort ( EState* );
|
||||
|
||||
extern void
|
||||
compressBlock ( EState*, Bool );
|
||||
|
||||
extern void
|
||||
bsInitWrite ( EState* );
|
||||
|
||||
extern void
|
||||
hbAssignCodes ( Int32*, UChar*, Int32, Int32, Int32 );
|
||||
|
||||
extern void
|
||||
hbMakeCodeLengths ( UChar*, Int32*, Int32, Int32 );
|
||||
|
||||
|
||||
|
||||
/*-- states for decompression. --*/
|
||||
|
||||
#define BZ_X_IDLE 1
|
||||
#define BZ_X_OUTPUT 2
|
||||
|
||||
#define BZ_X_MAGIC_1 10
|
||||
#define BZ_X_MAGIC_2 11
|
||||
#define BZ_X_MAGIC_3 12
|
||||
#define BZ_X_MAGIC_4 13
|
||||
#define BZ_X_BLKHDR_1 14
|
||||
#define BZ_X_BLKHDR_2 15
|
||||
#define BZ_X_BLKHDR_3 16
|
||||
#define BZ_X_BLKHDR_4 17
|
||||
#define BZ_X_BLKHDR_5 18
|
||||
#define BZ_X_BLKHDR_6 19
|
||||
#define BZ_X_BCRC_1 20
|
||||
#define BZ_X_BCRC_2 21
|
||||
#define BZ_X_BCRC_3 22
|
||||
#define BZ_X_BCRC_4 23
|
||||
#define BZ_X_RANDBIT 24
|
||||
#define BZ_X_ORIGPTR_1 25
|
||||
#define BZ_X_ORIGPTR_2 26
|
||||
#define BZ_X_ORIGPTR_3 27
|
||||
#define BZ_X_MAPPING_1 28
|
||||
#define BZ_X_MAPPING_2 29
|
||||
#define BZ_X_SELECTOR_1 30
|
||||
#define BZ_X_SELECTOR_2 31
|
||||
#define BZ_X_SELECTOR_3 32
|
||||
#define BZ_X_CODING_1 33
|
||||
#define BZ_X_CODING_2 34
|
||||
#define BZ_X_CODING_3 35
|
||||
#define BZ_X_MTF_1 36
|
||||
#define BZ_X_MTF_2 37
|
||||
#define BZ_X_MTF_3 38
|
||||
#define BZ_X_MTF_4 39
|
||||
#define BZ_X_MTF_5 40
|
||||
#define BZ_X_MTF_6 41
|
||||
#define BZ_X_ENDHDR_2 42
|
||||
#define BZ_X_ENDHDR_3 43
|
||||
#define BZ_X_ENDHDR_4 44
|
||||
#define BZ_X_ENDHDR_5 45
|
||||
#define BZ_X_ENDHDR_6 46
|
||||
#define BZ_X_CCRC_1 47
|
||||
#define BZ_X_CCRC_2 48
|
||||
#define BZ_X_CCRC_3 49
|
||||
#define BZ_X_CCRC_4 50
|
||||
|
||||
|
||||
|
||||
/*-- Constants for the fast MTF decoder. --*/
|
||||
|
||||
#define MTFA_SIZE 4096
|
||||
#define MTFL_SIZE 16
|
||||
|
||||
|
||||
|
||||
/*-- Structure holding all the decompression-side stuff. --*/
|
||||
|
||||
typedef
|
||||
struct {
|
||||
/* pointer back to the struct bz_stream */
|
||||
bz_stream* strm;
|
||||
|
||||
/* state indicator for this stream */
|
||||
Int32 state;
|
||||
|
||||
/* for doing the final run-length decoding */
|
||||
UChar state_out_ch;
|
||||
Int32 state_out_len;
|
||||
Bool blockRandomised;
|
||||
BZ_RAND_DECLS;
|
||||
|
||||
/* the buffer for bit stream reading */
|
||||
UInt32 bsBuff;
|
||||
Int32 bsLive;
|
||||
|
||||
/* misc administratium */
|
||||
Int32 blockSize100k;
|
||||
Bool smallDecompress;
|
||||
Int32 currBlockNo;
|
||||
Int32 verbosity;
|
||||
|
||||
/* for undoing the Burrows-Wheeler transform */
|
||||
Int32 origPtr;
|
||||
UInt32 tPos;
|
||||
Int32 k0;
|
||||
Int32 unzftab[256];
|
||||
Int32 nblock_used;
|
||||
Int32 cftab[257];
|
||||
Int32 cftabCopy[257];
|
||||
|
||||
/* for undoing the Burrows-Wheeler transform (FAST) */
|
||||
UInt32 *tt;
|
||||
|
||||
/* for undoing the Burrows-Wheeler transform (SMALL) */
|
||||
UInt16 *ll16;
|
||||
UChar *ll4;
|
||||
|
||||
/* stored and calculated CRCs */
|
||||
UInt32 storedBlockCRC;
|
||||
UInt32 storedCombinedCRC;
|
||||
UInt32 calculatedBlockCRC;
|
||||
UInt32 calculatedCombinedCRC;
|
||||
|
||||
/* map of bytes used in block */
|
||||
Int32 nInUse;
|
||||
Bool inUse[256];
|
||||
Bool inUse16[16];
|
||||
UChar seqToUnseq[256];
|
||||
|
||||
/* for decoding the MTF values */
|
||||
UChar mtfa [MTFA_SIZE];
|
||||
Int32 mtfbase[256 / MTFL_SIZE];
|
||||
UChar selector [BZ_MAX_SELECTORS];
|
||||
UChar selectorMtf[BZ_MAX_SELECTORS];
|
||||
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
|
||||
Int32 limit [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 base [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 perm [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 minLens[BZ_N_GROUPS];
|
||||
|
||||
/* save area for scalars in the main decompress code */
|
||||
Int32 save_i;
|
||||
Int32 save_j;
|
||||
Int32 save_t;
|
||||
Int32 save_alphaSize;
|
||||
Int32 save_nGroups;
|
||||
Int32 save_nSelectors;
|
||||
Int32 save_EOB;
|
||||
Int32 save_groupNo;
|
||||
Int32 save_groupPos;
|
||||
Int32 save_nextSym;
|
||||
Int32 save_nblockMAX;
|
||||
Int32 save_nblock;
|
||||
Int32 save_es;
|
||||
Int32 save_N;
|
||||
Int32 save_curr;
|
||||
Int32 save_zt;
|
||||
Int32 save_zn;
|
||||
Int32 save_zvec;
|
||||
Int32 save_zj;
|
||||
Int32 save_gSel;
|
||||
Int32 save_gMinlen;
|
||||
Int32* save_gLimit;
|
||||
Int32* save_gBase;
|
||||
Int32* save_gPerm;
|
||||
|
||||
}
|
||||
DState;
|
||||
|
||||
|
||||
|
||||
/*-- Macros for decompression. --*/
|
||||
|
||||
#define BZ_GET_FAST(cccc) \
|
||||
s->tPos = s->tt[s->tPos]; \
|
||||
cccc = (UChar)(s->tPos & 0xff); \
|
||||
s->tPos >>= 8;
|
||||
|
||||
#define BZ_GET_FAST_C(cccc) \
|
||||
c_tPos = c_tt[c_tPos]; \
|
||||
cccc = (UChar)(c_tPos & 0xff); \
|
||||
c_tPos >>= 8;
|
||||
|
||||
#define SET_LL4(i,n)                                                \
   { if (((i) & 0x1) == 0)                                          \
        s->ll4[(i) >> 1] = (s->ll4[(i) >> 1] & 0xf0) | (n); else    \
        s->ll4[(i) >> 1] = (s->ll4[(i) >> 1] & 0x0f) | ((n) << 4);  \
   }

#define GET_LL4(i)                                                  \
    (((UInt32)(s->ll4[(i) >> 1])) >> (((i) << 2) & 0x4) & 0xF)

#define SET_LL(i,n)                                                 \
   { s->ll16[i] = (UInt16)(n & 0x0000ffff);                         \
     SET_LL4(i, n >> 16);                                           \
   }

#define GET_LL(i)                                                   \
   (((UInt32)s->ll16[i]) | (GET_LL4(i) << 16))
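
The `SET_LL`/`GET_LL` macros above split each value into a 16-bit part (`ll16`) and a 4-bit part packed two per byte (`ll4`), giving 20 bits per entry. The sketch below restates that packing as two functions so it can be tried in isolation. The function names and explicit array parameters are hypothetical; the macros themselves operate on the fields of `s`. Twenty bits is assumed to be enough because the largest block size that appears in this header is 900000, which is below 2^20.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low 20 bits of value at index i: 16 bits go to ll16[i], the top
   4 bits share a byte in ll4 (even index: low nibble, odd index: high). */
void sketch_set_ll(uint16_t *ll16, unsigned char *ll4,
                   uint32_t i, uint32_t value)
{
    assert(value < (1u << 20));
    uint32_t hi = value >> 16;
    ll16[i] = (uint16_t)(value & 0xffffu);
    if ((i & 1u) == 0)
        ll4[i >> 1] = (unsigned char)((ll4[i >> 1] & 0xf0u) | hi);
    else
        ll4[i >> 1] = (unsigned char)((ll4[i >> 1] & 0x0fu) | (hi << 4));
}

/* Reassemble the 20-bit value: (i << 2) & 4 is 0 for even i and 4 for
   odd i, which selects the matching nibble of the shared byte. */
uint32_t sketch_get_ll(const uint16_t *ll16, const unsigned char *ll4,
                       uint32_t i)
{
    uint32_t nibble = ((uint32_t)ll4[i >> 1] >> ((i << 2) & 0x4u)) & 0xfu;
    return (uint32_t)ll16[i] | (nibble << 16);
}
```

Compared with one `UInt32` per entry (the `tt` array of the fast path), this layout uses 2.5 bytes per entry instead of 4.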
|
||||
|
||||
#define BZ_GET_SMALL(cccc) \
|
||||
cccc = indexIntoF ( s->tPos, s->cftab ); \
|
||||
s->tPos = GET_LL(s->tPos);
|
||||
|
||||
|
||||
/*-- externs for decompression. --*/
|
||||
|
||||
extern Int32
|
||||
indexIntoF ( Int32, Int32* );
|
||||
|
||||
extern Int32
|
||||
decompress ( DState* );
|
||||
|
||||
extern void
|
||||
hbCreateDecodeTables ( Int32*, Int32*, Int32*, UChar*,
|
||||
Int32, Int32, Int32 );
|
||||
|
||||
|
||||
#endif
|
||||
|
||||
|
||||
/*-- BZ_NO_STDIO seems to make NULL disappear on some platforms. --*/
|
||||
|
||||
#ifdef BZ_NO_STDIO
|
||||
#ifndef NULL
|
||||
#define NULL 0
|
||||
#endif
|
||||
#endif
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end bzlib_private.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
588
compress.c
Normal file
@ -0,0 +1,588 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Compression machinery (not incl block sorting) ---*/
|
||||
/*--- compress.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
/*--
|
||||
CHANGES
|
||||
~~~~~~~
|
||||
0.9.0 -- original version.
|
||||
|
||||
0.9.0a/b -- no changes in this file.
|
||||
|
||||
0.9.0c
|
||||
* changed setting of nGroups in sendMTFValues() so as to
|
||||
do a bit better on small files
|
||||
--*/
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
/*--- Bit stream I/O ---*/
|
||||
/*---------------------------------------------------*/
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
void bsInitWrite ( EState* s )
{
   s->bsLive = 0;
   s->bsBuff = 0;
}


/*---------------------------------------------------*/
static
void bsFinishWrite ( EState* s )
{
   while (s->bsLive > 0) {
      ((UChar*)(s->quadrant))[s->numZ] = (UChar)(s->bsBuff >> 24);
      s->numZ++;
      s->bsBuff <<= 8;
      s->bsLive -= 8;
   }
}


/*---------------------------------------------------*/
#define bsNEEDW(nz)                           \
{                                             \
   while (s->bsLive >= 8) {                   \
      ((UChar*)(s->quadrant))[s->numZ]        \
         = (UChar)(s->bsBuff >> 24);          \
      s->numZ++;                              \
      s->bsBuff <<= 8;                        \
      s->bsLive -= 8;                         \
   }                                          \
}


/*---------------------------------------------------*/
static
void bsW ( EState* s, Int32 n, UInt32 v )
{
   bsNEEDW ( n );
   s->bsBuff |= (v << (32 - s->bsLive - n));
   s->bsLive += n;
}


/*---------------------------------------------------*/
static
void bsPutUInt32 ( EState* s, UInt32 u )
{
   bsW ( s, 8, (u >> 24) & 0xffL );
   bsW ( s, 8, (u >> 16) & 0xffL );
   bsW ( s, 8, (u >> 8) & 0xffL );
   bsW ( s, 8, u & 0xffL );
}


/*---------------------------------------------------*/
static
void bsPutUChar ( EState* s, UChar c )
{
   bsW( s, 8, (UInt32)c );
}
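
The bit-stream routines above keep pending bits left-aligned in a 32-bit buffer and flush whole bytes from the top before appending more. A standalone sketch of the same scheme follows. The `SketchBitWriter` type and function names are hypothetical, and the output goes to a caller-supplied buffer rather than to the `quadrant` array that the library reuses for output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer state; the library keeps the equivalent fields
   (bsBuff, bsLive, numZ) in EState. */
typedef struct {
    unsigned char *out;   /* destination buffer, owned by the caller */
    size_t         n_out; /* bytes emitted so far */
    uint32_t       buff;  /* pending bits, left-aligned */
    int            live;  /* number of pending bits in buff */
} SketchBitWriter;

/* Append the low n bits of v, most significant bit first.
   Assumption of this sketch: 1 <= n <= 24, so that after flushing
   (at most 7 bits left) the new bits always fit in the buffer. */
void sketch_put_bits(SketchBitWriter *w, int n, uint32_t v)
{
    assert(n >= 1 && n <= 24);
    assert(v < (1u << n));
    while (w->live >= 8) {            /* flush whole bytes first */
        w->out[w->n_out++] = (unsigned char)(w->buff >> 24);
        w->buff <<= 8;
        w->live -= 8;
    }
    w->buff |= v << (32 - w->live - n);
    w->live += n;
}

/* Flush whatever remains; the final byte is padded with zero bits. */
void sketch_finish(SketchBitWriter *w)
{
    while (w->live > 0) {
        w->out[w->n_out++] = (unsigned char)(w->buff >> 24);
        w->buff <<= 8;
        w->live -= 8;
    }
}
```

For example, writing the 3-bit value 5, the 5-bit value 1 and the 4-bit value 15, then finishing, yields the two bytes 0xA1 and 0xF0.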
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
/*--- The back end proper ---*/
|
||||
/*---------------------------------------------------*/
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
static
|
||||
void makeMaps_e ( EState* s )
|
||||
{
|
||||
Int32 i;
|
||||
s->nInUse = 0;
|
||||
for (i = 0; i < 256; i++)
|
||||
if (s->inUse[i]) {
|
||||
s->unseqToSeq[i] = s->nInUse;
|
||||
s->nInUse++;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
static
void generateMTFValues ( EState* s )
{
   UChar yy[256];
   Int32 i, j;
   UChar tmp;
   UChar tmp2;
   Int32 zPend;
   Int32 wr;
   Int32 EOB;

   makeMaps_e ( s );
   EOB = s->nInUse+1;

   for (i = 0; i <= EOB; i++) s->mtfFreq[i] = 0;

   wr = 0;
   zPend = 0;
   for (i = 0; i < s->nInUse; i++) yy[i] = (UChar) i;

   for (i = 0; i < s->nblock; i++) {
      UChar ll_i;

      AssertD ( wr <= i, "generateMTFValues(1)" );
      j = s->zptr[i]-1; if (j < 0) j += s->nblock;
      ll_i = s->unseqToSeq[s->block[j]];
      AssertD ( ll_i < s->nInUse, "generateMTFValues(2a)" );

      j = 0;
      tmp = yy[j];
      while ( ll_i != tmp ) {
         j++;
         tmp2 = tmp;
         tmp = yy[j];
         yy[j] = tmp2;
      };
      yy[0] = tmp;

      if (j == 0) {
         zPend++;
      } else {
         if (zPend > 0) {
            zPend--;
            while (True) {
               switch (zPend % 2) {
                  case 0: s->szptr[wr] = BZ_RUNA; wr++; s->mtfFreq[BZ_RUNA]++; break;
                  case 1: s->szptr[wr] = BZ_RUNB; wr++; s->mtfFreq[BZ_RUNB]++; break;
               };
               if (zPend < 2) break;
               zPend = (zPend - 2) / 2;
            };
            zPend = 0;
         }
         s->szptr[wr] = j+1; wr++; s->mtfFreq[j+1]++;
      }
   }

   if (zPend > 0) {
      zPend--;
      while (True) {
         switch (zPend % 2) {
            case 0: s->szptr[wr] = BZ_RUNA; wr++; s->mtfFreq[BZ_RUNA]++; break;
            case 1: s->szptr[wr] = BZ_RUNB; wr++; s->mtfFreq[BZ_RUNB]++; break;
         };
         if (zPend < 2) break;
         zPend = (zPend - 2) / 2;
      };
   }

   s->szptr[wr] = EOB; wr++; s->mtfFreq[EOB]++;

   s->nMTF = wr;
}
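
In generateMTFValues, a run of zero MTF values is not emitted symbol by symbol; its length is written as a short sequence of RUNA and RUNB symbols. The sketch below isolates that step. The function names are hypothetical, and the decoder is an inference from the encoder loop rather than code from this file: it treats RUNA as digit 1 and RUNB as digit 2 in a base-2 number whose least significant digit comes first.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_RUNA = 0, SKETCH_RUNB = 1 };

/* Encode a run of run_len (>= 1) zeroes as RUNA/RUNB symbols, following
   the same loop as generateMTFValues.  Returns the symbol count. */
size_t sketch_encode_zero_run(unsigned run_len, unsigned char *out, size_t cap)
{
    assert(run_len >= 1);
    size_t n = 0;
    unsigned pend = run_len - 1;
    for (;;) {
        assert(n < cap);
        out[n++] = (pend % 2 == 0) ? SKETCH_RUNA : SKETCH_RUNB;
        if (pend < 2) break;
        pend = (pend - 2) / 2;
    }
    return n;
}

/* Inverse (inferred): RUNA counts as 1 and RUNB as 2, with the k-th
   symbol weighted by 2^k. */
unsigned sketch_decode_zero_run(const unsigned char *syms, size_t n)
{
    unsigned total = 0, weight = 1;
    for (size_t i = 0; i < n; i++) {
        total += (syms[i] == SKETCH_RUNA ? 1u : 2u) * weight;
        weight *= 2;
    }
    return total;
}
```

With this scheme a run of 5 becomes RUNA, RUNB (1 + 2*2), so run lengths cost a logarithmic number of symbols.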
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
#define BZ_LESSER_ICOST 0
|
||||
#define BZ_GREATER_ICOST 15
|
||||
|
||||
static
|
||||
void sendMTFValues ( EState* s )
|
||||
{
|
||||
Int32 v, t, i, j, gs, ge, totc, bt, bc, iter;
|
||||
Int32 nSelectors, alphaSize, minLen, maxLen, selCtr;
|
||||
Int32 nGroups, nBytes;
|
||||
|
||||
   /*--
   UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
   lives in EState; the decoder keeps its own copy in DState.

   Int32 code[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
   Int32 rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
   are used only in this proc, but are also kept in EState
   rather than as locals, to keep the stack frame size small.
   --*/
|
||||
|
||||
|
||||
UInt16 cost[BZ_N_GROUPS];
|
||||
Int32 fave[BZ_N_GROUPS];
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf3( " %d in block, %d after MTF & 1-2 coding, "
|
||||
"%d+2 syms in use\n",
|
||||
s->nblock, s->nMTF, s->nInUse );
|
||||
|
||||
alphaSize = s->nInUse+2;
|
||||
for (t = 0; t < BZ_N_GROUPS; t++)
|
||||
for (v = 0; v < alphaSize; v++)
|
||||
s->len[t][v] = BZ_GREATER_ICOST;
|
||||
|
||||
/*--- Decide how many coding tables to use ---*/
|
||||
AssertH ( s->nMTF > 0, 3001 );
|
||||
if (s->nMTF < 200) nGroups = 2; else
|
||||
if (s->nMTF < 600) nGroups = 3; else
|
||||
if (s->nMTF < 1200) nGroups = 4; else
|
||||
if (s->nMTF < 2400) nGroups = 5; else
|
||||
nGroups = 6;
|
||||
|
||||
/*--- Generate an initial set of coding tables ---*/
|
||||
{
|
||||
Int32 nPart, remF, tFreq, aFreq;
|
||||
|
||||
nPart = nGroups;
|
||||
remF = s->nMTF;
|
||||
gs = 0;
|
||||
while (nPart > 0) {
|
||||
tFreq = remF / nPart;
|
||||
ge = gs-1;
|
||||
aFreq = 0;
|
||||
while (aFreq < tFreq && ge < alphaSize-1) {
|
||||
ge++;
|
||||
aFreq += s->mtfFreq[ge];
|
||||
}
|
||||
|
||||
if (ge > gs
|
||||
&& nPart != nGroups && nPart != 1
|
||||
&& ((nGroups-nPart) % 2 == 1)) {
|
||||
aFreq -= s->mtfFreq[ge];
|
||||
ge--;
|
||||
}
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf5( " initial group %d, [%d .. %d], "
|
||||
"has %d syms (%4.1f%%)\n",
|
||||
nPart, gs, ge, aFreq,
|
||||
(100.0 * (float)aFreq) / (float)(s->nMTF) );
|
||||
|
||||
for (v = 0; v < alphaSize; v++)
|
||||
if (v >= gs && v <= ge)
|
||||
s->len[nPart-1][v] = BZ_LESSER_ICOST; else
|
||||
s->len[nPart-1][v] = BZ_GREATER_ICOST;
|
||||
|
||||
nPart--;
|
||||
gs = ge+1;
|
||||
remF -= aFreq;
|
||||
}
|
||||
}
|
||||
|
||||
/*---
|
||||
Iterate up to BZ_N_ITERS times to improve the tables.
|
||||
---*/
|
||||
for (iter = 0; iter < BZ_N_ITERS; iter++) {
|
||||
|
||||
for (t = 0; t < nGroups; t++) fave[t] = 0;
|
||||
|
||||
for (t = 0; t < nGroups; t++)
|
||||
for (v = 0; v < alphaSize; v++)
|
||||
s->rfreq[t][v] = 0;
|
||||
|
||||
nSelectors = 0;
|
||||
totc = 0;
|
||||
gs = 0;
|
||||
while (True) {
|
||||
|
||||
/*--- Set group start & end marks. --*/
|
||||
if (gs >= s->nMTF) break;
|
||||
ge = gs + BZ_G_SIZE - 1;
|
||||
if (ge >= s->nMTF) ge = s->nMTF-1;
|
||||
|
||||
/*--
|
||||
Calculate the cost of this group as coded
|
||||
by each of the coding tables.
|
||||
--*/
|
||||
for (t = 0; t < nGroups; t++) cost[t] = 0;
|
||||
|
||||
if (nGroups == 6) {
|
||||
register UInt16 cost0, cost1, cost2, cost3, cost4, cost5;
|
||||
cost0 = cost1 = cost2 = cost3 = cost4 = cost5 = 0;
|
||||
for (i = gs; i <= ge; i++) {
|
||||
UInt16 icv = s->szptr[i];
|
||||
cost0 += s->len[0][icv];
|
||||
cost1 += s->len[1][icv];
|
||||
cost2 += s->len[2][icv];
|
||||
cost3 += s->len[3][icv];
|
||||
cost4 += s->len[4][icv];
|
||||
cost5 += s->len[5][icv];
|
||||
}
|
||||
cost[0] = cost0; cost[1] = cost1; cost[2] = cost2;
|
||||
cost[3] = cost3; cost[4] = cost4; cost[5] = cost5;
|
||||
} else {
|
||||
for (i = gs; i <= ge; i++) {
|
||||
UInt16 icv = s->szptr[i];
|
||||
for (t = 0; t < nGroups; t++) cost[t] += s->len[t][icv];
|
||||
}
|
||||
}
|
||||
|
||||
/*--
|
||||
Find the coding table which is best for this group,
|
||||
and record its identity in the selector table.
|
||||
--*/
|
||||
bc = 999999999; bt = -1;
|
||||
for (t = 0; t < nGroups; t++)
|
||||
if (cost[t] < bc) { bc = cost[t]; bt = t; };
|
||||
totc += bc;
|
||||
fave[bt]++;
|
||||
s->selector[nSelectors] = bt;
|
||||
nSelectors++;
|
||||
|
||||
/*--
|
||||
Increment the symbol frequencies for the selected table.
|
||||
--*/
|
||||
for (i = gs; i <= ge; i++)
|
||||
s->rfreq[bt][ s->szptr[i] ]++;
|
||||
|
||||
gs = ge+1;
|
||||
}
|
||||
if (s->verbosity >= 3) {
|
||||
VPrintf2 ( " pass %d: size is %d, grp uses are ",
|
||||
iter+1, totc/8 );
|
||||
for (t = 0; t < nGroups; t++)
|
||||
VPrintf1 ( "%d ", fave[t] );
|
||||
VPrintf0 ( "\n" );
|
||||
}
|
||||
|
||||
/*--
|
||||
Recompute the tables based on the accumulated frequencies.
|
||||
--*/
|
||||
for (t = 0; t < nGroups; t++)
|
||||
hbMakeCodeLengths ( &(s->len[t][0]), &(s->rfreq[t][0]),
|
||||
alphaSize, 20 );
|
||||
}
|
||||
|
||||
|
||||
AssertH( nGroups < 8, 3002 );
|
||||
AssertH( nSelectors < 32768 &&
|
||||
nSelectors <= (2 + (900000 / BZ_G_SIZE)),
|
||||
3003 );
|
||||
|
||||
|
||||
/*--- Compute MTF values for the selectors. ---*/
|
||||
{
|
||||
UChar pos[BZ_N_GROUPS], ll_i, tmp2, tmp;
|
||||
for (i = 0; i < nGroups; i++) pos[i] = i;
|
||||
for (i = 0; i < nSelectors; i++) {
|
||||
ll_i = s->selector[i];
|
||||
j = 0;
|
||||
tmp = pos[j];
|
||||
while ( ll_i != tmp ) {
|
||||
j++;
|
||||
tmp2 = tmp;
|
||||
tmp = pos[j];
|
||||
pos[j] = tmp2;
|
||||
};
|
||||
pos[0] = tmp;
|
||||
s->selectorMtf[i] = j;
|
||||
}
|
||||
};
|
||||
|
||||
/*--- Assign actual codes for the tables. --*/
|
||||
for (t = 0; t < nGroups; t++) {
|
||||
minLen = 32;
|
||||
maxLen = 0;
|
||||
for (i = 0; i < alphaSize; i++) {
|
||||
if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
|
||||
if (s->len[t][i] < minLen) minLen = s->len[t][i];
|
||||
}
|
||||
AssertH ( !(maxLen > 20), 3004 );
|
||||
AssertH ( !(minLen < 1), 3005 );
|
||||
hbAssignCodes ( &(s->code[t][0]), &(s->len[t][0]),
|
||||
minLen, maxLen, alphaSize );
|
||||
}
|
||||
|
||||
/*--- Transmit the mapping table. ---*/
|
||||
{
|
||||
Bool inUse16[16];
|
||||
for (i = 0; i < 16; i++) {
|
||||
inUse16[i] = False;
|
||||
for (j = 0; j < 16; j++)
|
||||
if (s->inUse[i * 16 + j]) inUse16[i] = True;
|
||||
}
|
||||
|
||||
nBytes = s->numZ;
|
||||
for (i = 0; i < 16; i++)
|
||||
if (inUse16[i]) bsW(s,1,1); else bsW(s,1,0);
|
||||
|
||||
for (i = 0; i < 16; i++)
|
||||
if (inUse16[i])
|
||||
for (j = 0; j < 16; j++) {
|
||||
if (s->inUse[i * 16 + j]) bsW(s,1,1); else bsW(s,1,0);
|
||||
}
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1( " bytes: mapping %d, ", s->numZ-nBytes );
|
||||
}
|
||||
|
||||
/*--- Now the selectors. ---*/
|
||||
nBytes = s->numZ;
|
||||
bsW ( s, 3, nGroups );
|
||||
bsW ( s, 15, nSelectors );
|
||||
for (i = 0; i < nSelectors; i++) {
|
||||
for (j = 0; j < s->selectorMtf[i]; j++) bsW(s,1,1);
|
||||
bsW(s,1,0);
|
||||
}
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1( "selectors %d, ", s->numZ-nBytes );
|
||||
|
||||
/*--- Now the coding tables. ---*/
|
||||
nBytes = s->numZ;
|
||||
|
||||
for (t = 0; t < nGroups; t++) {
|
||||
Int32 curr = s->len[t][0];
|
||||
bsW ( s, 5, curr );
|
||||
for (i = 0; i < alphaSize; i++) {
|
||||
while (curr < s->len[t][i]) { bsW(s,2,2); curr++; /* 10 */ };
|
||||
while (curr > s->len[t][i]) { bsW(s,2,3); curr--; /* 11 */ };
|
||||
bsW ( s, 1, 0 );
|
||||
}
|
||||
}
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1 ( "code lengths %d, ", s->numZ-nBytes );
|
||||
|
||||
/*--- And finally, the block data proper ---*/
|
||||
nBytes = s->numZ;
|
||||
selCtr = 0;
|
||||
gs = 0;
|
||||
while (True) {
|
||||
if (gs >= s->nMTF) break;
|
||||
ge = gs + BZ_G_SIZE - 1;
|
||||
if (ge >= s->nMTF) ge = s->nMTF-1;
|
||||
for (i = gs; i <= ge; i++) {
|
||||
AssertH ( s->selector[selCtr] < nGroups, 3006 );
|
||||
bsW ( s,
|
||||
s->len [s->selector[selCtr]] [s->szptr[i]],
|
||||
s->code [s->selector[selCtr]] [s->szptr[i]] );
|
||||
}
|
||||
|
||||
gs = ge+1;
|
||||
selCtr++;
|
||||
}
|
||||
AssertH( selCtr == nSelectors, 3007 );
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1( "codes %d\n", s->numZ-nBytes );
|
||||
}
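
Two steps of sendMTFValues act on the selector list: a move-to-front pass that turns each table index into its current rank, and the transmission of each rank in unary (that many 1 bits followed by a 0 bit). The sketch below combines them to show how the selector cost in bits arises. The function name and the returned bit count are illustrative additions, not part of the library; the shift loop is a plain rewrite of the swap-as-you-search loop used above.

```c
#include <assert.h>
#include <stddef.h>

/* Move-to-front over table indices 0..n_groups-1.  Writes one rank per
   selector to mtf_out and returns the number of bits needed to send the
   ranks in unary (rank ones followed by a single zero). */
size_t sketch_selector_mtf(const unsigned char *selector, size_t n_selectors,
                           int n_groups, unsigned char *mtf_out)
{
    assert(n_groups >= 1 && n_groups <= 6);
    unsigned char pos[6];
    size_t bits = 0;
    for (int i = 0; i < n_groups; i++) pos[i] = (unsigned char)i;
    for (size_t i = 0; i < n_selectors; i++) {
        unsigned char want = selector[i];
        assert(want < n_groups);
        int j = 0;
        while (pos[j] != want) j++;              /* current rank of table */
        for (int k = j; k > 0; k--) pos[k] = pos[k - 1];
        pos[0] = want;                           /* move it to the front */
        mtf_out[i] = (unsigned char)j;
        bits += (size_t)j + 1;
    }
    return bits;
}
```

For the selectors 0, 0, 2, 2, 1 with three tables, the ranks are 0, 0, 2, 0, 2 and the unary encoding takes 9 bits; repeated selectors cost one bit each.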
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
void compressBlock ( EState* s, Bool is_last_block )
{
   if (s->nblock > 0) {

      BZ_FINALISE_CRC ( s->blockCRC );
      s->combinedCRC = (s->combinedCRC << 1) | (s->combinedCRC >> 31);
      s->combinedCRC ^= s->blockCRC;
      if (s->blockNo > 1) s->numZ = 0;

      if (s->verbosity >= 2)
         VPrintf4( " block %d: crc = 0x%8x, "
                   "combined CRC = 0x%8x, size = %d\n",
                   s->blockNo, s->blockCRC, s->combinedCRC, s->nblock );

      blockSort ( s );
   }

   /*-- If this is the first block, create the stream header. --*/
   if (s->blockNo == 1) {
      bsInitWrite ( s );
      bsPutUChar ( s, 'B' );
      bsPutUChar ( s, 'Z' );
      bsPutUChar ( s, 'h' );
      bsPutUChar ( s, '0' + s->blockSize100k );
   }

   if (s->nblock > 0) {

      bsPutUChar ( s, 0x31 ); bsPutUChar ( s, 0x41 );
      bsPutUChar ( s, 0x59 ); bsPutUChar ( s, 0x26 );
      bsPutUChar ( s, 0x53 ); bsPutUChar ( s, 0x59 );

      /*-- Now the block's CRC, so it is in a known place. --*/
      bsPutUInt32 ( s, s->blockCRC );

      /*-- Now a single bit indicating randomisation. --*/
      if (s->blockRandomised) {
         bsW(s,1,1); s->nBlocksRandomised++;
      } else
         bsW(s,1,0);

      bsW ( s, 24, s->origPtr );
      generateMTFValues ( s );
      sendMTFValues ( s );
   }


   /*-- If this is the last block, add the stream trailer. --*/
   if (is_last_block) {

      if (s->verbosity >= 2 && s->nBlocksRandomised > 0)
         VPrintf2 ( " %d block%s needed randomisation\n",
                    s->nBlocksRandomised,
                    s->nBlocksRandomised == 1 ? "" : "s" );

      bsPutUChar ( s, 0x17 ); bsPutUChar ( s, 0x72 );
      bsPutUChar ( s, 0x45 ); bsPutUChar ( s, 0x38 );
      bsPutUChar ( s, 0x50 ); bsPutUChar ( s, 0x90 );
      bsPutUInt32 ( s, s->combinedCRC );
      if (s->verbosity >= 2)
         VPrintf1( " final combined CRC = 0x%x\n ", s->combinedCRC );
      bsFinishWrite ( s );
   }
}
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end compress.c ---*/
|
||||
/*-------------------------------------------------------------*/
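compressBlock folds each block's CRC into a running stream CRC by rotating the running value left one bit and XORing in the block CRC. The sketch below isolates that step. `combine_crc` and `stream_crc` are hypothetical names, and starting the running value at 0 is an assumption, since the initialisation is not part of this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate the running CRC left by one bit, then XOR in the block CRC. */
static uint32_t combine_crc(uint32_t combined, uint32_t block_crc)
{
    combined = (combined << 1) | (combined >> 31);
    return combined ^ block_crc;
}

/* Fold a sequence of block CRCs, assuming the running value starts at 0. */
static uint32_t stream_crc(const uint32_t *block_crcs, size_t nblocks)
{
    uint32_t combined = 0;
    assert(block_crcs != NULL || nblocks == 0);
    for (size_t i = 0; i < nblocks; i++)
        combined = combine_crc(combined, block_crcs[i]);
    return combined;
}
```

The rotation makes the result depend on block order, so swapping two blocks with different CRCs generally changes the combined value, which a plain XOR of the block CRCs would not detect.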
|
144
crctable.c
Normal file
@ -0,0 +1,144 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Table for doing CRCs ---*/
|
||||
/*--- crctable.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
/*--
|
||||
I think this is an implementation of the AUTODIN-II,
|
||||
Ethernet & FDDI 32-bit CRC standard. Vaguely derived
|
||||
from code by Rob Warnock, in Section 51 of the
|
||||
comp.compression FAQ.
|
||||
--*/
|
||||
|
||||
UInt32 crc32Table[256] = {
|
||||
|
||||
/*-- Ugly, innit? --*/
|
||||
|
||||
0x00000000L, 0x04c11db7L, 0x09823b6eL, 0x0d4326d9L,
|
||||
0x130476dcL, 0x17c56b6bL, 0x1a864db2L, 0x1e475005L,
|
||||
0x2608edb8L, 0x22c9f00fL, 0x2f8ad6d6L, 0x2b4bcb61L,
|
||||
0x350c9b64L, 0x31cd86d3L, 0x3c8ea00aL, 0x384fbdbdL,
|
||||
0x4c11db70L, 0x48d0c6c7L, 0x4593e01eL, 0x4152fda9L,
|
||||
0x5f15adacL, 0x5bd4b01bL, 0x569796c2L, 0x52568b75L,
|
||||
0x6a1936c8L, 0x6ed82b7fL, 0x639b0da6L, 0x675a1011L,
|
||||
0x791d4014L, 0x7ddc5da3L, 0x709f7b7aL, 0x745e66cdL,
|
||||
0x9823b6e0L, 0x9ce2ab57L, 0x91a18d8eL, 0x95609039L,
|
||||
0x8b27c03cL, 0x8fe6dd8bL, 0x82a5fb52L, 0x8664e6e5L,
|
||||
0xbe2b5b58L, 0xbaea46efL, 0xb7a96036L, 0xb3687d81L,
|
||||
0xad2f2d84L, 0xa9ee3033L, 0xa4ad16eaL, 0xa06c0b5dL,
|
||||
0xd4326d90L, 0xd0f37027L, 0xddb056feL, 0xd9714b49L,
|
||||
0xc7361b4cL, 0xc3f706fbL, 0xceb42022L, 0xca753d95L,
|
||||
0xf23a8028L, 0xf6fb9d9fL, 0xfbb8bb46L, 0xff79a6f1L,
|
||||
0xe13ef6f4L, 0xe5ffeb43L, 0xe8bccd9aL, 0xec7dd02dL,
|
||||
0x34867077L, 0x30476dc0L, 0x3d044b19L, 0x39c556aeL,
|
||||
0x278206abL, 0x23431b1cL, 0x2e003dc5L, 0x2ac12072L,
|
||||
0x128e9dcfL, 0x164f8078L, 0x1b0ca6a1L, 0x1fcdbb16L,
|
||||
0x018aeb13L, 0x054bf6a4L, 0x0808d07dL, 0x0cc9cdcaL,
|
||||
0x7897ab07L, 0x7c56b6b0L, 0x71159069L, 0x75d48ddeL,
|
||||
0x6b93dddbL, 0x6f52c06cL, 0x6211e6b5L, 0x66d0fb02L,
|
||||
0x5e9f46bfL, 0x5a5e5b08L, 0x571d7dd1L, 0x53dc6066L,
|
||||
0x4d9b3063L, 0x495a2dd4L, 0x44190b0dL, 0x40d816baL,
|
||||
0xaca5c697L, 0xa864db20L, 0xa527fdf9L, 0xa1e6e04eL,
|
||||
0xbfa1b04bL, 0xbb60adfcL, 0xb6238b25L, 0xb2e29692L,
|
||||
0x8aad2b2fL, 0x8e6c3698L, 0x832f1041L, 0x87ee0df6L,
|
||||
0x99a95df3L, 0x9d684044L, 0x902b669dL, 0x94ea7b2aL,
|
||||
0xe0b41de7L, 0xe4750050L, 0xe9362689L, 0xedf73b3eL,
|
||||
0xf3b06b3bL, 0xf771768cL, 0xfa325055L, 0xfef34de2L,
|
||||
0xc6bcf05fL, 0xc27dede8L, 0xcf3ecb31L, 0xcbffd686L,
|
||||
0xd5b88683L, 0xd1799b34L, 0xdc3abdedL, 0xd8fba05aL,
|
||||
0x690ce0eeL, 0x6dcdfd59L, 0x608edb80L, 0x644fc637L,
|
||||
0x7a089632L, 0x7ec98b85L, 0x738aad5cL, 0x774bb0ebL,
|
||||
0x4f040d56L, 0x4bc510e1L, 0x46863638L, 0x42472b8fL,
|
||||
0x5c007b8aL, 0x58c1663dL, 0x558240e4L, 0x51435d53L,
|
||||
0x251d3b9eL, 0x21dc2629L, 0x2c9f00f0L, 0x285e1d47L,
|
||||
0x36194d42L, 0x32d850f5L, 0x3f9b762cL, 0x3b5a6b9bL,
|
||||
0x0315d626L, 0x07d4cb91L, 0x0a97ed48L, 0x0e56f0ffL,
|
||||
0x1011a0faL, 0x14d0bd4dL, 0x19939b94L, 0x1d528623L,
|
||||
0xf12f560eL, 0xf5ee4bb9L, 0xf8ad6d60L, 0xfc6c70d7L,
|
||||
0xe22b20d2L, 0xe6ea3d65L, 0xeba91bbcL, 0xef68060bL,
|
||||
0xd727bbb6L, 0xd3e6a601L, 0xdea580d8L, 0xda649d6fL,
|
||||
0xc423cd6aL, 0xc0e2d0ddL, 0xcda1f604L, 0xc960ebb3L,
|
||||
0xbd3e8d7eL, 0xb9ff90c9L, 0xb4bcb610L, 0xb07daba7L,
|
||||
0xae3afba2L, 0xaafbe615L, 0xa7b8c0ccL, 0xa379dd7bL,
|
||||
0x9b3660c6L, 0x9ff77d71L, 0x92b45ba8L, 0x9675461fL,
|
||||
0x8832161aL, 0x8cf30badL, 0x81b02d74L, 0x857130c3L,
|
||||
0x5d8a9099L, 0x594b8d2eL, 0x5408abf7L, 0x50c9b640L,
|
||||
0x4e8ee645L, 0x4a4ffbf2L, 0x470cdd2bL, 0x43cdc09cL,
|
||||
0x7b827d21L, 0x7f436096L, 0x7200464fL, 0x76c15bf8L,
|
||||
0x68860bfdL, 0x6c47164aL, 0x61043093L, 0x65c52d24L,
|
||||
0x119b4be9L, 0x155a565eL, 0x18197087L, 0x1cd86d30L,
|
||||
0x029f3d35L, 0x065e2082L, 0x0b1d065bL, 0x0fdc1becL,
|
||||
0x3793a651L, 0x3352bbe6L, 0x3e119d3fL, 0x3ad08088L,
|
||||
0x2497d08dL, 0x2056cd3aL, 0x2d15ebe3L, 0x29d4f654L,
|
||||
0xc5a92679L, 0xc1683bceL, 0xcc2b1d17L, 0xc8ea00a0L,
|
||||
0xd6ad50a5L, 0xd26c4d12L, 0xdf2f6bcbL, 0xdbee767cL,
|
||||
0xe3a1cbc1L, 0xe760d676L, 0xea23f0afL, 0xeee2ed18L,
|
||||
0xf0a5bd1dL, 0xf464a0aaL, 0xf9278673L, 0xfde69bc4L,
|
||||
0x89b8fd09L, 0x8d79e0beL, 0x803ac667L, 0x84fbdbd0L,
|
||||
0x9abc8bd5L, 0x9e7d9662L, 0x933eb0bbL, 0x97ffad0cL,
|
||||
0xafb010b1L, 0xab710d06L, 0xa6322bdfL, 0xa2f33668L,
|
||||
0xbcb4666dL, 0xb8757bdaL, 0xb5365d03L, 0xb1f740b4L
|
||||
};
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end crctable.c ---*/
|
||||
/*-------------------------------------------------------------*/
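The table above does not have to be taken on trust: its entries match what the generator polynomial 0x04c11db7 produces when each byte value is processed most significant bit first. A minimal sketch that regenerates it follows. `make_crc_table` and `crc32_msb_first` are hypothetical names, and the update rule, the 0xffffffff start value and the final inversion are assumptions about the CRC macros in bzlib_private.h, which are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC_POLY 0x04c11db7u

/* Build the 256-entry table: byte value in the top 8 bits, then 8 shift/XOR steps. */
static void make_crc_table(uint32_t table[256])
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        for (int k = 0; k < 8; k++)
            c = (c & 0x80000000u) ? (c << 1) ^ CRC_POLY : (c << 1);
        table[i] = c;
    }
}

/* Table-driven, MSB-first CRC (assumed: start at all ones, invert at the end). */
static uint32_t crc32_msb_first(const uint32_t table[256],
                                const unsigned char *p, size_t n)
{
    uint32_t crc = 0xffffffffu;
    assert(p != NULL || n == 0);
    while (n--)
        crc = (crc << 8) ^ table[(crc >> 24) ^ *p++];
    return ~crc;
}
```

Comparing a few generated entries (for example indices 1, 2 and 255) against the literal table is a cheap way to catch a transcription error in either.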
|
636
decompress.c
Normal file
@ -0,0 +1,636 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Decompression machinery ---*/
|
||||
/*--- decompress.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
static
|
||||
void makeMaps_d ( DState* s )
|
||||
{
|
||||
Int32 i;
|
||||
s->nInUse = 0;
|
||||
for (i = 0; i < 256; i++)
|
||||
if (s->inUse[i]) {
|
||||
s->seqToUnseq[s->nInUse] = i;
|
||||
s->nInUse++;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
#define RETURN(rrr) \
|
||||
{ retVal = rrr; goto save_state_and_return; };
|
||||
|
||||
#define GET_BITS(lll,vvv,nnn) \
|
||||
case lll: s->state = lll; \
|
||||
while (True) { \
|
||||
if (s->bsLive >= nnn) { \
|
||||
UInt32 v; \
|
||||
v = (s->bsBuff >> \
|
||||
(s->bsLive-nnn)) & ((1 << nnn)-1); \
|
||||
s->bsLive -= nnn; \
|
||||
vvv = v; \
|
||||
break; \
|
||||
} \
|
||||
if (s->strm->avail_in == 0) RETURN(BZ_OK); \
|
||||
s->bsBuff \
|
||||
= (s->bsBuff << 8) | \
|
||||
((UInt32) \
|
||||
(*((UChar*)(s->strm->next_in)))); \
|
||||
s->bsLive += 8; \
|
||||
s->strm->next_in++; \
|
||||
s->strm->avail_in--; \
|
||||
s->strm->total_in++; \
|
||||
}
|
||||
|
||||
#define GET_UCHAR(lll,uuu) \
|
||||
GET_BITS(lll,uuu,8)
|
||||
|
||||
#define GET_BIT(lll,uuu) \
|
||||
GET_BITS(lll,uuu,1)
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
#define GET_MTF_VAL(label1,label2,lval) \
|
||||
{ \
|
||||
if (groupPos == 0) { \
|
||||
groupNo++; \
|
||||
groupPos = BZ_G_SIZE; \
|
||||
gSel = s->selector[groupNo]; \
|
||||
gMinlen = s->minLens[gSel]; \
|
||||
gLimit = &(s->limit[gSel][0]); \
|
||||
gPerm = &(s->perm[gSel][0]); \
|
||||
gBase = &(s->base[gSel][0]); \
|
||||
} \
|
||||
groupPos--; \
|
||||
zn = gMinlen; \
|
||||
GET_BITS(label1, zvec, zn); \
|
||||
while (zvec > gLimit[zn]) { \
|
||||
zn++; \
|
||||
GET_BIT(label2, zj); \
|
||||
zvec = (zvec << 1) | zj; \
|
||||
}; \
|
||||
lval = gPerm[zvec - gBase[zn]]; \
|
||||
}
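GET_BITS doubles as a resumption point: each use expands to a `case` label, so when input runs out the function can return and a later call jumps straight back to the read that stalled. A much reduced sketch of that pattern follows. `Reader`, `read_header` and the `ST_*`, `NEED_MORE` and `FINISHED` names are hypothetical, and unlike decompress() the sketch keeps its variables in the struct instead of a separate save area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ST_A = 1, ST_B, ST_DONE };          /* resumption points */
enum { NEED_MORE = 0, FINISHED = 1 };      /* return codes */

typedef struct {
    int state;                             /* where to resume */
    uint32_t buf; int live;                /* bit buffer and count of valid bits */
    const unsigned char *next_in; size_t avail_in;
    unsigned a, b;                         /* the two fields being parsed */
} Reader;

/* Read n bits into dst, or return NEED_MORE and resume here on the next call. */
#define GET_BITS(label, dst, n)                                    \
    case label: r->state = label;                                  \
        while (r->live < (n)) {                                    \
            if (r->avail_in == 0) return NEED_MORE;                \
            r->buf = (r->buf << 8) | *r->next_in++;                \
            r->live += 8; r->avail_in--;                           \
        }                                                          \
        (dst) = (r->buf >> (r->live - (n))) & ((1u << (n)) - 1u);  \
        r->live -= (n);

/* Parse a 3-bit field followed by a 15-bit field, across any number of calls. */
static int read_header(Reader *r)
{
    assert(r != NULL);
    switch (r->state) {
        GET_BITS(ST_A, r->a, 3)
        GET_BITS(ST_B, r->b, 15)
        r->state = ST_DONE;
        return FINISHED;
    default:
        return FINISHED;
    }
}
```

A caller sets `state` to `ST_A`, points `next_in` and `avail_in` at whatever input it has, and calls `read_header` again with fresh input each time it returns `NEED_MORE`.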
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
Int32 decompress ( DState* s )
|
||||
{
|
||||
UChar uc;
|
||||
Int32 retVal;
|
||||
Int32 minLen, maxLen;
|
||||
bz_stream* strm = s->strm;
|
||||
|
||||
/* stuff that needs to be saved/restored */
|
||||
Int32 i ;
|
||||
Int32 j;
|
||||
Int32 t;
|
||||
Int32 alphaSize;
|
||||
Int32 nGroups;
|
||||
Int32 nSelectors;
|
||||
Int32 EOB;
|
||||
Int32 groupNo;
|
||||
Int32 groupPos;
|
||||
Int32 nextSym;
|
||||
Int32 nblockMAX;
|
||||
Int32 nblock;
|
||||
Int32 es;
|
||||
Int32 N;
|
||||
Int32 curr;
|
||||
Int32 zt;
|
||||
Int32 zn;
|
||||
Int32 zvec;
|
||||
Int32 zj;
|
||||
Int32 gSel;
|
||||
Int32 gMinlen;
|
||||
Int32* gLimit;
|
||||
Int32* gBase;
|
||||
Int32* gPerm;
|
||||
|
||||
if (s->state == BZ_X_MAGIC_1) {
|
||||
/*initialise the save area*/
|
||||
s->save_i = 0;
|
||||
s->save_j = 0;
|
||||
s->save_t = 0;
|
||||
s->save_alphaSize = 0;
|
||||
s->save_nGroups = 0;
|
||||
s->save_nSelectors = 0;
|
||||
s->save_EOB = 0;
|
||||
s->save_groupNo = 0;
|
||||
s->save_groupPos = 0;
|
||||
s->save_nextSym = 0;
|
||||
s->save_nblockMAX = 0;
|
||||
s->save_nblock = 0;
|
||||
s->save_es = 0;
|
||||
s->save_N = 0;
|
||||
s->save_curr = 0;
|
||||
s->save_zt = 0;
|
||||
s->save_zn = 0;
|
||||
s->save_zvec = 0;
|
||||
s->save_zj = 0;
|
||||
s->save_gSel = 0;
|
||||
s->save_gMinlen = 0;
|
||||
s->save_gLimit = NULL;
|
||||
s->save_gBase = NULL;
|
||||
s->save_gPerm = NULL;
|
||||
}
|
||||
|
||||
/*restore from the save area*/
|
||||
i = s->save_i;
|
||||
j = s->save_j;
|
||||
t = s->save_t;
|
||||
alphaSize = s->save_alphaSize;
|
||||
nGroups = s->save_nGroups;
|
||||
nSelectors = s->save_nSelectors;
|
||||
EOB = s->save_EOB;
|
||||
groupNo = s->save_groupNo;
|
||||
groupPos = s->save_groupPos;
|
||||
nextSym = s->save_nextSym;
|
||||
nblockMAX = s->save_nblockMAX;
|
||||
nblock = s->save_nblock;
|
||||
es = s->save_es;
|
||||
N = s->save_N;
|
||||
curr = s->save_curr;
|
||||
zt = s->save_zt;
|
||||
zn = s->save_zn;
|
||||
zvec = s->save_zvec;
|
||||
zj = s->save_zj;
|
||||
gSel = s->save_gSel;
|
||||
gMinlen = s->save_gMinlen;
|
||||
gLimit = s->save_gLimit;
|
||||
gBase = s->save_gBase;
|
||||
gPerm = s->save_gPerm;
|
||||
|
||||
retVal = BZ_OK;
|
||||
|
||||
switch (s->state) {
|
||||
|
||||
GET_UCHAR(BZ_X_MAGIC_1, uc);
|
||||
if (uc != 'B') RETURN(BZ_DATA_ERROR_MAGIC);
|
||||
|
||||
GET_UCHAR(BZ_X_MAGIC_2, uc);
|
||||
if (uc != 'Z') RETURN(BZ_DATA_ERROR_MAGIC);
|
||||
|
||||
GET_UCHAR(BZ_X_MAGIC_3, uc)
|
||||
if (uc != 'h') RETURN(BZ_DATA_ERROR_MAGIC);
|
||||
|
||||
GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8)
|
||||
if (s->blockSize100k < '1' ||
|
||||
s->blockSize100k > '9') RETURN(BZ_DATA_ERROR_MAGIC);
|
||||
s->blockSize100k -= '0';
|
||||
|
||||
if (s->smallDecompress) {
|
||||
s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
|
||||
s->ll4 = BZALLOC(
|
||||
((1 + s->blockSize100k * 100000) >> 1) * sizeof(UChar)
|
||||
);
|
||||
if (s->ll16 == NULL || s->ll4 == NULL) RETURN(BZ_MEM_ERROR);
|
||||
} else {
|
||||
s->tt = BZALLOC( s->blockSize100k * 100000 * sizeof(Int32) );
|
||||
if (s->tt == NULL) RETURN(BZ_MEM_ERROR);
|
||||
}
|
||||
|
||||
GET_UCHAR(BZ_X_BLKHDR_1, uc);
|
||||
|
||||
if (uc == 0x17) goto endhdr_2;
|
||||
if (uc != 0x31) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_BLKHDR_2, uc);
|
||||
if (uc != 0x41) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_BLKHDR_3, uc);
|
||||
if (uc != 0x59) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_BLKHDR_4, uc);
|
||||
if (uc != 0x26) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_BLKHDR_5, uc);
|
||||
if (uc != 0x53) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_BLKHDR_6, uc);
|
||||
if (uc != 0x59) RETURN(BZ_DATA_ERROR);
|
||||
|
||||
s->currBlockNo++;
|
||||
if (s->verbosity >= 2)
|
||||
VPrintf1 ( "\n [%d: huff+mtf ", s->currBlockNo );
|
||||
|
||||
s->storedBlockCRC = 0;
|
||||
GET_UCHAR(BZ_X_BCRC_1, uc);
|
||||
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
|
||||
GET_UCHAR(BZ_X_BCRC_2, uc);
|
||||
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
|
||||
GET_UCHAR(BZ_X_BCRC_3, uc);
|
||||
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
|
||||
GET_UCHAR(BZ_X_BCRC_4, uc);
|
||||
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
|
||||
|
||||
GET_BITS(BZ_X_RANDBIT, s->blockRandomised, 1);
|
||||
|
||||
s->origPtr = 0;
|
||||
GET_UCHAR(BZ_X_ORIGPTR_1, uc);
|
||||
s->origPtr = (s->origPtr << 8) | ((Int32)uc);
|
||||
GET_UCHAR(BZ_X_ORIGPTR_2, uc);
|
||||
s->origPtr = (s->origPtr << 8) | ((Int32)uc);
|
||||
GET_UCHAR(BZ_X_ORIGPTR_3, uc);
|
||||
s->origPtr = (s->origPtr << 8) | ((Int32)uc);
|
||||
|
||||
/*--- Receive the mapping table ---*/
|
||||
for (i = 0; i < 16; i++) {
|
||||
GET_BIT(BZ_X_MAPPING_1, uc);
|
||||
if (uc == 1)
|
||||
s->inUse16[i] = True; else
|
||||
s->inUse16[i] = False;
|
||||
}
|
||||
|
||||
for (i = 0; i < 256; i++) s->inUse[i] = False;
|
||||
|
||||
for (i = 0; i < 16; i++)
|
||||
if (s->inUse16[i])
|
||||
for (j = 0; j < 16; j++) {
|
||||
GET_BIT(BZ_X_MAPPING_2, uc);
|
||||
if (uc == 1) s->inUse[i * 16 + j] = True;
|
||||
}
|
||||
makeMaps_d ( s );
|
||||
alphaSize = s->nInUse+2;
|
||||
|
||||
/*--- Now the selectors ---*/
|
||||
GET_BITS(BZ_X_SELECTOR_1, nGroups, 3);
|
||||
GET_BITS(BZ_X_SELECTOR_2, nSelectors, 15);
|
||||
for (i = 0; i < nSelectors; i++) {
|
||||
j = 0;
|
||||
while (True) {
|
||||
GET_BIT(BZ_X_SELECTOR_3, uc);
|
||||
if (uc == 0) break;
|
||||
j++;
|
||||
if (j > 5) RETURN(BZ_DATA_ERROR);
|
||||
}
|
||||
s->selectorMtf[i] = j;
|
||||
}
|
||||
|
||||
/*--- Undo the MTF values for the selectors. ---*/
|
||||
{
|
||||
UChar pos[BZ_N_GROUPS], tmp, v;
|
||||
for (v = 0; v < nGroups; v++) pos[v] = v;
|
||||
|
||||
for (i = 0; i < nSelectors; i++) {
|
||||
v = s->selectorMtf[i];
|
||||
tmp = pos[v];
|
||||
while (v > 0) { pos[v] = pos[v-1]; v--; }
|
||||
pos[0] = tmp;
|
||||
s->selector[i] = tmp;
|
||||
}
|
||||
}
|
||||
|
||||
/*--- Now the coding tables ---*/
|
||||
for (t = 0; t < nGroups; t++) {
|
||||
GET_BITS(BZ_X_CODING_1, curr, 5);
|
||||
for (i = 0; i < alphaSize; i++) {
|
||||
while (True) {
|
||||
if (curr < 1 || curr > 20) RETURN(BZ_DATA_ERROR);
|
||||
GET_BIT(BZ_X_CODING_2, uc);
|
||||
if (uc == 0) break;
|
||||
GET_BIT(BZ_X_CODING_3, uc);
|
||||
if (uc == 0) curr++; else curr--;
|
||||
}
|
||||
s->len[t][i] = curr;
|
||||
}
|
||||
}
|
||||
|
||||
/*--- Create the Huffman decoding tables ---*/
|
||||
for (t = 0; t < nGroups; t++) {
|
||||
minLen = 32;
|
||||
maxLen = 0;
|
||||
for (i = 0; i < alphaSize; i++) {
|
||||
if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
|
||||
if (s->len[t][i] < minLen) minLen = s->len[t][i];
|
||||
}
|
||||
hbCreateDecodeTables (
|
||||
&(s->limit[t][0]),
|
||||
&(s->base[t][0]),
|
||||
&(s->perm[t][0]),
|
||||
&(s->len[t][0]),
|
||||
minLen, maxLen, alphaSize
|
||||
);
|
||||
s->minLens[t] = minLen;
|
||||
}
|
||||
|
||||
/*--- Now the MTF values ---*/
|
||||
|
||||
EOB = s->nInUse+1;
|
||||
nblockMAX = 100000 * s->blockSize100k;
|
||||
groupNo = -1;
|
||||
groupPos = 0;
|
||||
|
||||
for (i = 0; i <= 255; i++) s->unzftab[i] = 0;
|
||||
|
||||
/*-- MTF init --*/
|
||||
{
|
||||
Int32 ii, jj, kk;
|
||||
kk = MTFA_SIZE-1;
|
||||
for (ii = 256 / MTFL_SIZE - 1; ii >= 0; ii--) {
|
||||
for (jj = MTFL_SIZE-1; jj >= 0; jj--) {
|
||||
s->mtfa[kk] = (UChar)(ii * MTFL_SIZE + jj);
|
||||
kk--;
|
||||
}
|
||||
s->mtfbase[ii] = kk + 1;
|
||||
}
|
||||
}
|
||||
/*-- end MTF init --*/
|
||||
|
||||
nblock = 0;
|
||||
|
||||
GET_MTF_VAL(BZ_X_MTF_1, BZ_X_MTF_2, nextSym);
|
||||
|
||||
while (True) {
|
||||
|
||||
if (nextSym == EOB) break;
|
||||
|
||||
if (nextSym == BZ_RUNA || nextSym == BZ_RUNB) {
|
||||
|
||||
es = -1;
|
||||
N = 1;
|
||||
do {
|
||||
if (nextSym == BZ_RUNA) es = es + (0+1) * N; else
|
||||
if (nextSym == BZ_RUNB) es = es + (1+1) * N;
|
||||
N = N * 2;
|
||||
GET_MTF_VAL(BZ_X_MTF_3, BZ_X_MTF_4, nextSym);
|
||||
}
|
||||
while (nextSym == BZ_RUNA || nextSym == BZ_RUNB);
|
||||
|
||||
es++;
|
||||
uc = s->seqToUnseq[ s->mtfa[s->mtfbase[0]] ];
|
||||
s->unzftab[uc] += es;
|
||||
|
||||
if (s->smallDecompress)
|
||||
while (es > 0) {
|
||||
s->ll16[nblock] = (UInt16)uc;
|
||||
nblock++;
|
||||
es--;
|
||||
}
|
||||
else
|
||||
while (es > 0) {
|
||||
s->tt[nblock] = (UInt32)uc;
|
||||
nblock++;
|
||||
es--;
|
||||
};
|
||||
|
||||
if (nblock > nblockMAX) RETURN(BZ_DATA_ERROR);
|
||||
continue;
|
||||
|
||||
} else {
|
||||
|
||||
if (nblock > nblockMAX) RETURN(BZ_DATA_ERROR);
|
||||
|
||||
/*-- uc = MTF ( nextSym-1 ) --*/
|
||||
{
|
||||
Int32 ii, jj, kk, pp, lno, off;
|
||||
UInt32 nn;
|
||||
nn = (UInt32)(nextSym - 1);
|
||||
|
||||
if (nn < MTFL_SIZE) {
|
||||
/* avoid general-case expense */
|
||||
pp = s->mtfbase[0];
|
||||
uc = s->mtfa[pp+nn];
|
||||
while (nn > 3) {
|
||||
Int32 z = pp+nn;
|
||||
s->mtfa[(z) ] = s->mtfa[(z)-1];
|
||||
s->mtfa[(z)-1] = s->mtfa[(z)-2];
|
||||
s->mtfa[(z)-2] = s->mtfa[(z)-3];
|
||||
s->mtfa[(z)-3] = s->mtfa[(z)-4];
|
||||
nn -= 4;
|
||||
}
|
||||
while (nn > 0) {
|
||||
s->mtfa[(pp+nn)] = s->mtfa[(pp+nn)-1]; nn--;
|
||||
};
|
||||
s->mtfa[pp] = uc;
|
||||
} else {
|
||||
/* general case */
|
||||
lno = nn / MTFL_SIZE;
|
||||
off = nn % MTFL_SIZE;
|
||||
pp = s->mtfbase[lno] + off;
|
||||
uc = s->mtfa[pp];
|
||||
while (pp > s->mtfbase[lno]) {
|
||||
s->mtfa[pp] = s->mtfa[pp-1]; pp--;
|
||||
};
|
||||
s->mtfbase[lno]++;
|
||||
while (lno > 0) {
|
||||
s->mtfbase[lno]--;
|
||||
s->mtfa[s->mtfbase[lno]]
|
||||
= s->mtfa[s->mtfbase[lno-1] + MTFL_SIZE - 1];
|
||||
lno--;
|
||||
}
|
||||
s->mtfbase[0]--;
|
||||
s->mtfa[s->mtfbase[0]] = uc;
|
||||
if (s->mtfbase[0] == 0) {
|
||||
kk = MTFA_SIZE-1;
|
||||
for (ii = 256 / MTFL_SIZE-1; ii >= 0; ii--) {
|
||||
for (jj = MTFL_SIZE-1; jj >= 0; jj--) {
|
||||
s->mtfa[kk] = s->mtfa[s->mtfbase[ii] + jj];
|
||||
kk--;
|
||||
}
|
||||
s->mtfbase[ii] = kk + 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
/*-- end uc = MTF ( nextSym-1 ) --*/
|
||||
|
||||
s->unzftab[s->seqToUnseq[uc]]++;
|
||||
if (s->smallDecompress)
|
||||
s->ll16[nblock] = (UInt16)(s->seqToUnseq[uc]); else
|
||||
s->tt[nblock] = (UInt32)(s->seqToUnseq[uc]);
|
||||
nblock++;
|
||||
|
||||
GET_MTF_VAL(BZ_X_MTF_5, BZ_X_MTF_6, nextSym);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
s->state_out_len = 0;
|
||||
s->state_out_ch = 0;
|
||||
BZ_INITIALISE_CRC ( s->calculatedBlockCRC );
|
||||
s->state = BZ_X_OUTPUT;
|
||||
if (s->verbosity >= 2) VPrintf0 ( "rt+rld" );
|
||||
|
||||
/*-- Set up cftab to facilitate generation of T^(-1) --*/
|
||||
s->cftab[0] = 0;
|
||||
for (i = 1; i <= 256; i++) s->cftab[i] = s->unzftab[i-1];
|
||||
for (i = 1; i <= 256; i++) s->cftab[i] += s->cftab[i-1];
|
||||
|
||||
if (s->smallDecompress) {
|
||||
|
||||
/*-- Make a copy of cftab, used in generation of T --*/
|
||||
for (i = 0; i <= 256; i++) s->cftabCopy[i] = s->cftab[i];
|
||||
|
||||
/*-- compute the T vector --*/
|
||||
for (i = 0; i < nblock; i++) {
|
||||
uc = (UChar)(s->ll16[i]);
|
||||
SET_LL(i, s->cftabCopy[uc]);
|
||||
s->cftabCopy[uc]++;
|
||||
}
|
||||
|
||||
/*-- Compute T^(-1) by pointer reversal on T --*/
|
||||
i = s->origPtr;
|
||||
j = GET_LL(i);
|
||||
do {
|
||||
Int32 tmp = GET_LL(j);
|
||||
SET_LL(j, i);
|
||||
i = j;
|
||||
j = tmp;
|
||||
}
|
||||
while (i != s->origPtr);
|
||||
|
||||
s->tPos = s->origPtr;
|
||||
s->nblock_used = 0;
|
||||
if (s->blockRandomised) {
|
||||
BZ_RAND_INIT_MASK;
|
||||
BZ_GET_SMALL(s->k0); s->nblock_used++;
|
||||
BZ_RAND_UPD_MASK; s->k0 ^= BZ_RAND_MASK;
|
||||
} else {
|
||||
BZ_GET_SMALL(s->k0); s->nblock_used++;
|
||||
}
|
||||
|
||||
} else {
|
||||
|
||||
/*-- compute the T^(-1) vector --*/
|
||||
for (i = 0; i < nblock; i++) {
|
||||
uc = (UChar)(s->tt[i] & 0xff);
|
||||
s->tt[s->cftab[uc]] |= (i << 8);
|
||||
s->cftab[uc]++;
|
||||
}
|
||||
|
||||
s->tPos = s->tt[s->origPtr] >> 8;
|
||||
s->nblock_used = 0;
|
||||
if (s->blockRandomised) {
|
||||
BZ_RAND_INIT_MASK;
|
||||
BZ_GET_FAST(s->k0); s->nblock_used++;
|
||||
BZ_RAND_UPD_MASK; s->k0 ^= BZ_RAND_MASK;
|
||||
} else {
|
||||
BZ_GET_FAST(s->k0); s->nblock_used++;
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
RETURN(BZ_OK);
|
||||
|
||||
|
||||
|
||||
endhdr_2:
|
||||
|
||||
GET_UCHAR(BZ_X_ENDHDR_2, uc);
|
||||
if (uc != 0x72) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_ENDHDR_3, uc);
|
||||
if (uc != 0x45) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_ENDHDR_4, uc);
|
||||
if (uc != 0x38) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_ENDHDR_5, uc);
|
||||
if (uc != 0x50) RETURN(BZ_DATA_ERROR);
|
||||
GET_UCHAR(BZ_X_ENDHDR_6, uc);
|
||||
if (uc != 0x90) RETURN(BZ_DATA_ERROR);
|
||||
|
||||
s->storedCombinedCRC = 0;
|
||||
GET_UCHAR(BZ_X_CCRC_1, uc);
|
||||
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
|
||||
GET_UCHAR(BZ_X_CCRC_2, uc);
|
||||
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
|
||||
GET_UCHAR(BZ_X_CCRC_3, uc);
|
||||
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
|
||||
GET_UCHAR(BZ_X_CCRC_4, uc);
|
||||
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
|
||||
|
||||
s->state = BZ_X_IDLE;
|
||||
RETURN(BZ_STREAM_END);
|
||||
|
||||
default: AssertH ( False, 4001 );
|
||||
}
|
||||
|
||||
AssertH ( False, 4002 );
|
||||
|
||||
save_state_and_return:
|
||||
|
||||
s->save_i = i;
|
||||
s->save_j = j;
|
||||
s->save_t = t;
|
||||
s->save_alphaSize = alphaSize;
|
||||
s->save_nGroups = nGroups;
|
||||
s->save_nSelectors = nSelectors;
|
||||
s->save_EOB = EOB;
|
||||
s->save_groupNo = groupNo;
|
||||
s->save_groupPos = groupPos;
|
||||
s->save_nextSym = nextSym;
|
||||
s->save_nblockMAX = nblockMAX;
|
||||
s->save_nblock = nblock;
|
||||
s->save_es = es;
|
||||
s->save_N = N;
|
||||
s->save_curr = curr;
|
||||
s->save_zt = zt;
|
||||
s->save_zn = zn;
|
||||
s->save_zvec = zvec;
|
||||
s->save_zj = zj;
|
||||
s->save_gSel = gSel;
|
||||
s->save_gMinlen = gMinlen;
|
||||
s->save_gLimit = gLimit;
|
||||
s->save_gBase = gBase;
|
||||
s->save_gPerm = gPerm;
|
||||
|
||||
return retVal;
|
||||
}
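The RUNA/RUNB branch in decompress() reads a run length written in bijective base 2: the first symbol has weight 1, the next weight 2, then 4, and so on, with RUNA counting once and RUNB twice at each position. The sketch below reproduces the same arithmetic on an array of symbols. `run_length` is a hypothetical helper, and the values 0 and 1 for the two symbols are assumptions about BZ_RUNA and BZ_RUNB, which are defined outside this file.

```c
#include <assert.h>
#include <stddef.h>

enum { RUNA = 0, RUNB = 1 };   /* assumed to match BZ_RUNA / BZ_RUNB */

/* Decode a sequence of RUNA/RUNB symbols into the run length it stands for. */
static int run_length(const int *syms, size_t n)
{
    int es = -1;   /* same starting value as in decompress() */
    int N = 1;     /* weight of the current position */
    assert(syms != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        assert(syms[i] == RUNA || syms[i] == RUNB);
        es += (syms[i] == RUNA ? 1 : 2) * N;
        N *= 2;
    }
    return es + 1;   /* the es++ after the do/while loop */
}
```

For instance, a lone RUNA stands for a run of 1, a lone RUNB for 2, and the pair RUNA, RUNB for 1 + 2*2 = 5.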
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end decompress.c ---*/
|
||||
/*-------------------------------------------------------------*/
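The non-small path at the end of decompress() undoes the block sort with one counting pass: `cftab` gives each byte value its first row in sorted order, and every entry of `tt` ends up holding a byte in its low 8 bits and a link in the upper 24. A self-contained sketch of that idea follows. `unbwt` is a hypothetical function, and the output loop reflects one reading of how the BZ_GET_FAST macro (not shown in this file) walks the links.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invert a block-sorted block.
   last:     last column of the sorted rotation matrix (n bytes)
   orig_ptr: row at which the original string sits in sorted order
   tt:       scratch space, n words
   out:      receives the n original bytes                          */
static void unbwt(const unsigned char *last, size_t n, uint32_t orig_ptr,
                  uint32_t *tt, unsigned char *out)
{
    uint32_t cftab[257] = {0};
    size_t i;

    assert(n > 0 && n < ((size_t)1 << 24) && orig_ptr < n);

    /* Count bytes, then turn counts into starting rows (cumulative sums). */
    for (i = 0; i < n; i++) { tt[i] = last[i]; cftab[last[i] + 1]++; }
    for (i = 1; i <= 256; i++) cftab[i] += cftab[i - 1];

    /* Build the links in the upper 24 bits of tt, as in the tt[] loop above. */
    for (i = 0; i < n; i++) {
        unsigned char uc = (unsigned char)(tt[i] & 0xff);
        tt[cftab[uc]] |= (uint32_t)i << 8;
        cftab[uc]++;
    }

    /* Follow the links from orig_ptr, emitting one byte per step. */
    uint32_t tpos = tt[orig_ptr] >> 8;
    for (i = 0; i < n; i++) {
        tpos = tt[tpos];
        out[i] = (unsigned char)(tpos & 0xff);
        tpos >>= 8;
    }
}
```

As a worked case, the sorted rotations of "banana" have last column "nnbaaa" with the original string in row 3, and feeding those two values in should give "banana" back.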
|
163
dlltest.c
Normal file
@ -0,0 +1,163 @@
|
||||
/*
|
||||
minibz2
|
||||
libbz2.dll test program.
|
||||
by Yoshioka Tsuneo(QWF00133@nifty.ne.jp/tsuneo-y@is.aist-nara.ac.jp)
|
||||
This file is Public Domain.
|
||||
Any email to me is welcome.
|
||||
|
||||
usage: minibz2 [-d] [-{1,2,..9}] [[srcfilename] destfilename]
|
||||
*/
|
||||
|
||||
#define BZ_IMPORT
|
||||
#include "bzlib.h"
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#ifdef _WIN32
|
||||
#include <io.h>
|
||||
#endif
|
||||
|
||||
|
||||
#ifdef _WIN32
|
||||
|
||||
#include <windows.h>
|
||||
static int BZ2DLLLoaded = 0;
|
||||
static HINSTANCE BZ2DLLhLib;
|
||||
int BZ2DLLLoadLibrary(void)
|
||||
{
|
||||
HINSTANCE hLib;
|
||||
|
||||
if(BZ2DLLLoaded==1){return 0;}
|
||||
hLib=LoadLibrary("libbz2.dll");
|
||||
if(hLib == NULL){
|
||||
puts("Can't load libbz2.dll");
|
||||
return -1;
|
||||
}
|
||||
BZ2DLLLoaded=1;
|
||||
BZ2DLLhLib=hLib;
|
||||
bzlibVersion=GetProcAddress(hLib,"bzlibVersion");
|
||||
bzopen=GetProcAddress(hLib,"bzopen");
|
||||
bzdopen=GetProcAddress(hLib,"bzdopen");
|
||||
bzread=GetProcAddress(hLib,"bzread");
|
||||
bzwrite=GetProcAddress(hLib,"bzwrite");
|
||||
bzflush=GetProcAddress(hLib,"bzflush");
|
||||
bzclose=GetProcAddress(hLib,"bzclose");
|
||||
bzerror=GetProcAddress(hLib,"bzerror");
|
||||
return 0;
|
||||
|
||||
}
|
||||
int BZ2DLLFreeLibrary(void)
|
||||
{
|
||||
if(BZ2DLLLoaded==0){return 0;}
|
||||
FreeLibrary(BZ2DLLhLib);
|
||||
BZ2DLLLoaded=0;
return 0;
|
||||
}
|
||||
#endif /* _WIN32 */
|
||||
|
||||
void usage(void)
|
||||
{
|
||||
puts("usage: minibz2 [-d] [-{1,2,..9}] [[srcfilename] destfilename]");
|
||||
}
|
||||
|
||||
int main(int argc,char *argv[])
|
||||
{
|
||||
int decompress = 0;
|
||||
int level = 9;
|
||||
char *fn_r,*fn_w;
|
||||
|
||||
#ifdef _WIN32
|
||||
if(BZ2DLLLoadLibrary()<0){
|
||||
puts("can't load dll");
|
||||
exit(1);
|
||||
}
|
||||
#endif
|
||||
while(++argv,--argc){
|
||||
if(**argv =='-' || **argv=='/'){
|
||||
char *p;
|
||||
|
||||
for(p=*argv+1;*p;p++){
|
||||
if(*p=='d'){
|
||||
decompress = 1;
|
||||
}else if('1'<=*p && *p<='9'){
|
||||
level = *p - '0';
|
||||
}else{
|
||||
usage();
|
||||
exit(1);
|
||||
}
|
||||
}
|
||||
}else{
|
||||
break;
|
||||
}
|
||||
}
|
||||
if(argc>=1){
|
||||
fn_r = *argv;
|
||||
argc--;argv++;
|
||||
}else{
|
||||
fn_r = NULL;
|
||||
}
|
||||
if(argc>=1){
|
||||
fn_w = *argv;
|
||||
argc--;argv++;
|
||||
}else{
|
||||
fn_w = NULL;
|
||||
}
|
||||
{
|
||||
int len;
|
||||
char buff[0x1000];
|
||||
char mode[10];
|
||||
|
||||
if(decompress){
|
||||
BZFILE *BZ2fp_r;
|
||||
FILE *fp_w;
|
||||
|
||||
if(fn_w){
|
||||
if((fp_w = fopen(fn_w,"wb"))==NULL){
|
||||
printf("can't open [%s]\n",fn_w);
|
||||
perror("reason:");
|
||||
exit(1);
|
||||
}
|
||||
}else{
|
||||
fp_w = stdout;
|
||||
}
|
||||
/* choose the input source by file name; BZ2fp_r is not yet initialised here */
if((fn_r == NULL && (BZ2fp_r = bzdopen(fileno(stdin),"rb"))==NULL)
|| (fn_r != NULL && (BZ2fp_r = bzopen(fn_r,"rb"))==NULL)){
|
||||
printf("can't bz2openstream\n");
|
||||
exit(1);
|
||||
}
|
||||
while((len=bzread(BZ2fp_r,buff,0x1000))>0){
|
||||
fwrite(buff,1,len,fp_w);
|
||||
}
|
||||
bzclose(BZ2fp_r);
|
||||
if(fp_w != stdout) fclose(fp_w);
|
||||
}else{
|
||||
BZFILE *BZ2fp_w;
|
||||
FILE *fp_r;
|
||||
|
||||
if(fn_r){
|
||||
if((fp_r = fopen(fn_r,"rb"))==NULL){
|
||||
printf("can't open [%s]\n",fn_r);
|
||||
perror("reason:");
|
||||
exit(1);
|
||||
}
|
||||
}else{
|
||||
fp_r = stdin;
|
||||
}
|
||||
mode[0]='w';
|
||||
mode[1] = '0' + level;
|
||||
mode[2] = '\0';
|
||||
|
||||
if((fn_w == NULL && (BZ2fp_w = bzdopen(fileno(stdout),mode))==NULL)
|
||||
|| (fn_w !=NULL && (BZ2fp_w = bzopen(fn_w,mode))==NULL)){
|
||||
printf("can't bz2openstream\n");
|
||||
exit(1);
|
||||
}
|
||||
while((len=fread(buff,1,0x1000,fp_r))>0){
|
||||
bzwrite(BZ2fp_w,buff,len);
|
||||
}
|
||||
bzclose(BZ2fp_w);
|
||||
if(fp_r!=stdin)fclose(fp_r);
|
||||
}
|
||||
}
|
||||
#ifdef _WIN32
|
||||
BZ2DLLFreeLibrary();
|
||||
#endif
|
||||
}
|
93
dlltest.dsp
Normal file
@ -0,0 +1,93 @@
|
||||
# Microsoft Developer Studio Project File - Name="dlltest" - Package Owner=<4>
|
||||
# Microsoft Developer Studio Generated Build File, Format Version 5.00
|
||||
# ** DO NOT EDIT **
|
||||
|
||||
# TARGTYPE "Win32 (x86) Console Application" 0x0103
|
||||
|
||||
CFG=dlltest - Win32 Debug
|
||||
!MESSAGE This is not a valid makefile. To build this project, use NMAKE.
|
||||
!MESSAGE Use the [Export Makefile] command and run
|
||||
!MESSAGE
|
||||
!MESSAGE NMAKE /f "dlltest.mak".
|
||||
!MESSAGE
|
||||
!MESSAGE You can specify a configuration when running NMAKE
|
||||
!MESSAGE by defining a macro on the command line. For example:
|
||||
!MESSAGE
|
||||
!MESSAGE NMAKE /f "dlltest.mak" CFG="dlltest - Win32 Debug"
|
||||
!MESSAGE
|
||||
!MESSAGE Possible choices for configuration are:
|
||||
!MESSAGE
|
||||
!MESSAGE "dlltest - Win32 Release" (based on "Win32 (x86) Console Application")
|
||||
!MESSAGE "dlltest - Win32 Debug" (based on "Win32 (x86) Console Application")
|
||||
!MESSAGE
|
||||
|
||||
# Begin Project
|
||||
# PROP Scc_ProjName ""
|
||||
# PROP Scc_LocalPath ""
|
||||
CPP=cl.exe
|
||||
RSC=rc.exe
|
||||
|
||||
!IF "$(CFG)" == "dlltest - Win32 Release"
|
||||
|
||||
# PROP BASE Use_MFC 0
|
||||
# PROP BASE Use_Debug_Libraries 0
|
||||
# PROP BASE Output_Dir "Release"
|
||||
# PROP BASE Intermediate_Dir "Release"
|
||||
# PROP BASE Target_Dir ""
|
||||
# PROP Use_MFC 0
|
||||
# PROP Use_Debug_Libraries 0
|
||||
# PROP Output_Dir "Release"
|
||||
# PROP Intermediate_Dir "Release"
|
||||
# PROP Ignore_Export_Lib 0
|
||||
# PROP Target_Dir ""
|
||||
# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
|
||||
# ADD CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
|
||||
# ADD BASE RSC /l 0x411 /d "NDEBUG"
|
||||
# ADD RSC /l 0x411 /d "NDEBUG"
|
||||
BSC32=bscmake.exe
|
||||
# ADD BASE BSC32 /nologo
|
||||
# ADD BSC32 /nologo
|
||||
LINK32=link.exe
|
||||
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386
|
||||
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /machine:I386 /out:"minibz2.exe"
|
||||
|
||||
!ELSEIF "$(CFG)" == "dlltest - Win32 Debug"
|
||||
|
||||
# PROP BASE Use_MFC 0
|
||||
# PROP BASE Use_Debug_Libraries 1
|
||||
# PROP BASE Output_Dir "dlltest_"
|
||||
# PROP BASE Intermediate_Dir "dlltest_"
|
||||
# PROP BASE Target_Dir ""
|
||||
# PROP Use_MFC 0
|
||||
# PROP Use_Debug_Libraries 1
|
||||
# PROP Output_Dir "dlltest_"
|
||||
# PROP Intermediate_Dir "dlltest_"
|
||||
# PROP Ignore_Export_Lib 0
|
||||
# PROP Target_Dir ""
|
||||
# ADD BASE CPP /nologo /W3 /Gm /GX /Zi /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
|
||||
# ADD CPP /nologo /W3 /Gm /GX /Zi /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c
|
||||
# ADD BASE RSC /l 0x411 /d "_DEBUG"
|
||||
# ADD RSC /l 0x411 /d "_DEBUG"
|
||||
BSC32=bscmake.exe
|
||||
# ADD BASE BSC32 /nologo
|
||||
# ADD BSC32 /nologo
|
||||
LINK32=link.exe
|
||||
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept
|
||||
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:console /debug /machine:I386 /out:"minibz2.exe" /pdbtype:sept
|
||||
|
||||
!ENDIF
|
||||
|
||||
# Begin Target
|
||||
|
||||
# Name "dlltest - Win32 Release"
|
||||
# Name "dlltest - Win32 Debug"
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\bzlib.h
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\dlltest.c
|
||||
# End Source File
|
||||
# End Target
|
||||
# End Project
|
37
howbig.c
Normal file
@ -0,0 +1,37 @@
|
||||
|
||||
#include <stdio.h>
|
||||
#include <assert.h>
|
||||
#include "bzlib.h"
|
||||
|
||||
unsigned char ibuff[1000000];
|
||||
unsigned char obuff[1000000];
|
||||
|
||||
void doone ( int n )
|
||||
{
|
||||
int i, j, k, q, nobuff;
|
||||
q = 0;
|
||||
|
||||
for (k = 0; k < 1; k++) {
|
||||
for (i = 0; i < n; i++)
|
||||
ibuff[i] = ((unsigned long)(random())) & 0xff;
|
||||
nobuff = 1000000;
|
||||
j = bzBuffToBuffCompress ( obuff, &nobuff, ibuff, n, 9,0,0 );
|
||||
assert (j == BZ_OK);
|
||||
if (nobuff > q) q = nobuff;
|
||||
}
|
||||
printf ( "%d %d(%d)\n", n, q, (int)((float)n * 1.01 - (float)q) );
|
||||
}
|
||||
|
||||
int main ( int argc, char** argv )
|
||||
{
|
||||
int i;
|
||||
i = 0;
|
||||
while (1) {
|
||||
if (i >= 900000) break;
|
||||
doone(i);
|
||||
if ( (int)(1.10 * i) > i )
|
||||
i = (int)(1.10 * i); else i++;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
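The loop in `main` above steps the input size geometrically, falling back to a step of one while a 10% increase would truncate to no change. A small sketch of that schedule, with hypothetical helper names `next_size` and `count_sizes` that are not part of howbig.c:

```c
/* Hypothetical helper mirroring the step in main(): grow the size by
   about 10%, or by one when truncation would leave it unchanged. */
int next_size(int i)
{
    int grown = (int)(1.10 * i);
    return (grown > i) ? grown : i + 1;
}

/* Count how many sizes below `limit` the schedule visits, starting at 0. */
int count_sizes(int limit)
{
    int i = 0, count = 0;
    while (i < limit) {
        count++;
        i = next_size(i);
    }
    return count;
}
```

Under this rule every size from 0 to 10 is tried individually, after which the steps widen, so small inputs are probed densely and large ones sparsely.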
|
228
huffman.c
Normal file
@ -0,0 +1,228 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Huffman coding low-level stuff ---*/
|
||||
/*--- huffman.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
#define WEIGHTOF(zz0) ((zz0) & 0xffffff00)
|
||||
#define DEPTHOF(zz1) ((zz1) & 0x000000ff)
|
||||
#define MYMAX(zz2,zz3) ((zz2) > (zz3) ? (zz2) : (zz3))
|
||||
|
||||
#define ADDWEIGHTS(zw1,zw2) \
|
||||
(WEIGHTOF(zw1)+WEIGHTOF(zw2)) | \
|
||||
(1 + MYMAX(DEPTHOF(zw1),DEPTHOF(zw2)))
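The macros above pack two fields into one 32-bit value: the accumulated weight in the upper 24 bits and the subtree depth in the low 8 bits. The following is a sketch of the same arithmetic written as functions; the function names are illustrative only and do not appear in huffman.c.

```c
#include <stdint.h>

/* Upper 24 bits: accumulated weight; low 8 bits: subtree depth. */
uint32_t weight_of(uint32_t w) { return w & 0xffffff00u; }
uint32_t depth_of(uint32_t w)  { return w & 0x000000ffu; }

/* Leaf value for a symbol frequency; a zero frequency is bumped to 1,
   as in the initialisation loop of hbMakeCodeLengths. */
uint32_t leaf_weight(uint32_t freq)
{
    return (freq == 0 ? 1u : freq) << 8;
}

/* Function form of ADDWEIGHTS: sum the weights, depth = 1 + max depth. */
uint32_t add_weights(uint32_t a, uint32_t b)
{
    uint32_t da = depth_of(a), db = depth_of(b);
    return (weight_of(a) + weight_of(b)) | (1u + (da > db ? da : db));
}
```

Because the heap compares the packed values directly, two nodes with equal weight appear to be ordered by depth, so the shallower subtree is merged first.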
|
||||
|
||||
#define UPHEAP(z) \
|
||||
{ \
|
||||
Int32 zz, tmp; \
|
||||
zz = z; tmp = heap[zz]; \
|
||||
while (weight[tmp] < weight[heap[zz >> 1]]) { \
|
||||
heap[zz] = heap[zz >> 1]; \
|
||||
zz >>= 1; \
|
||||
} \
|
||||
heap[zz] = tmp; \
|
||||
}
|
||||
|
||||
#define DOWNHEAP(z) \
|
||||
{ \
|
||||
Int32 zz, yy, tmp; \
|
||||
zz = z; tmp = heap[zz]; \
|
||||
while (True) { \
|
||||
yy = zz << 1; \
|
||||
if (yy > nHeap) break; \
|
||||
if (yy < nHeap && \
|
||||
weight[heap[yy+1]] < weight[heap[yy]]) \
|
||||
yy++; \
|
||||
if (weight[tmp] < weight[heap[yy]]) break; \
|
||||
heap[zz] = heap[yy]; \
|
||||
zz = yy; \
|
||||
} \
|
||||
heap[zz] = tmp; \
|
||||
}
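UPHEAP and DOWNHEAP implement a 1-based binary min-heap of node indices keyed by `weight[]`, with slot 0 acting as a sentinel. A function-based sketch of the same idea follows; the `MinHeap` type, the function names and `HEAP_CAP` are assumptions made for illustration, and the caller is expected to supply `weight[0] == 0` with all other weights positive.

```c
#define HEAP_CAP 16   /* illustrative capacity, not a bzip2 constant */

/* 1-based min-heap of node indices ordered by weight[]. Slot 0 is a
   sentinel: heap[0] = 0 and weight[0] = 0, so the sift-up loop stops at
   the root without an explicit bounds check. */
typedef struct {
    int heap[HEAP_CAP + 1];
    int n;
    const int *weight;
} MinHeap;

void heap_init(MinHeap *h, const int *weight)
{
    h->heap[0] = 0;
    h->n = 0;
    h->weight = weight;
}

/* Counterpart of "nHeap++; heap[nHeap] = node; UPHEAP(nHeap)". */
void heap_push(MinHeap *h, int node)
{
    int zz = ++h->n;
    while (h->weight[node] < h->weight[h->heap[zz >> 1]]) {
        h->heap[zz] = h->heap[zz >> 1];
        zz >>= 1;
    }
    h->heap[zz] = node;
}

/* Counterpart of "n = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1)". */
int heap_pop(MinHeap *h)
{
    int top = h->heap[1];
    int tmp = h->heap[h->n--];
    int zz = 1, yy;
    for (;;) {
        yy = zz << 1;
        if (yy > h->n) break;
        if (yy < h->n && h->weight[h->heap[yy + 1]] < h->weight[h->heap[yy]])
            yy++;
        if (h->weight[tmp] < h->weight[h->heap[yy]]) break;
        h->heap[zz] = h->heap[yy];
        zz = yy;
    }
    h->heap[zz] = tmp;
    return top;
}
```

Pushing every leaf and then repeatedly popping two nodes and pushing their parent is the classic Huffman construction that `hbMakeCodeLengths` performs.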
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
void hbMakeCodeLengths ( UChar *len,
|
||||
Int32 *freq,
|
||||
Int32 alphaSize,
|
||||
Int32 maxLen )
|
||||
{
|
||||
/*--
|
||||
Nodes and heap entries run from 1. Entry 0
|
||||
for both the heap and nodes is a sentinel.
|
||||
--*/
|
||||
Int32 nNodes, nHeap, n1, n2, i, j, k;
|
||||
Bool tooLong;
|
||||
|
||||
Int32 heap [ BZ_MAX_ALPHA_SIZE + 2 ];
|
||||
Int32 weight [ BZ_MAX_ALPHA_SIZE * 2 ];
|
||||
Int32 parent [ BZ_MAX_ALPHA_SIZE * 2 ];
|
||||
|
||||
for (i = 0; i < alphaSize; i++)
|
||||
weight[i+1] = (freq[i] == 0 ? 1 : freq[i]) << 8;
|
||||
|
||||
while (True) {
|
||||
|
||||
nNodes = alphaSize;
|
||||
nHeap = 0;
|
||||
|
||||
heap[0] = 0;
|
||||
weight[0] = 0;
|
||||
parent[0] = -2;
|
||||
|
||||
for (i = 1; i <= alphaSize; i++) {
|
||||
parent[i] = -1;
|
||||
nHeap++;
|
||||
heap[nHeap] = i;
|
||||
UPHEAP(nHeap);
|
||||
}
|
||||
|
||||
AssertH( nHeap < (BZ_MAX_ALPHA_SIZE+2), 2001 );
|
||||
|
||||
while (nHeap > 1) {
|
||||
n1 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
|
||||
n2 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
|
||||
nNodes++;
|
||||
parent[n1] = parent[n2] = nNodes;
|
||||
weight[nNodes] = ADDWEIGHTS(weight[n1], weight[n2]);
|
||||
parent[nNodes] = -1;
|
||||
nHeap++;
|
||||
heap[nHeap] = nNodes;
|
||||
UPHEAP(nHeap);
|
||||
}
|
||||
|
||||
AssertH( nNodes < (BZ_MAX_ALPHA_SIZE * 2), 2002 );
|
||||
|
||||
tooLong = False;
|
||||
for (i = 1; i <= alphaSize; i++) {
|
||||
j = 0;
|
||||
k = i;
|
||||
while (parent[k] >= 0) { k = parent[k]; j++; }
|
||||
len[i-1] = j;
|
||||
if (j > maxLen) tooLong = True;
|
||||
}
|
||||
|
||||
if (! tooLong) break;
|
||||
|
||||
for (i = 1; i <= alphaSize; i++) {
|
||||
j = weight[i] >> 8;
|
||||
j = 1 + (j / 2);
|
||||
weight[i] = j << 8;
|
||||
}
|
||||
}
|
||||
}
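Two small steps in `hbMakeCodeLengths` can be looked at in isolation: a code length is the number of parent links between a leaf and the root, and when any length exceeds `maxLen` the frequencies are roughly halved before the tree is rebuilt. The sketch below uses hypothetical helper names and plain counts rather than the `<< 8` packed weights.

```c
/* Number of parent links from `node` up to the root. The root is marked
   by a negative parent entry, as in hbMakeCodeLengths. */
int depth_from_parents(const int *parent, int node)
{
    int depth = 0;
    while (parent[node] >= 0) {
        node = parent[node];
        depth++;
    }
    return depth;
}

/* Halve every symbol frequency while keeping each one at least 1.
   freq[] holds plain counts here, not packed weights. */
void halve_frequencies(int *freq, int alphaSize)
{
    int i;
    for (i = 0; i < alphaSize; i++)
        freq[i] = 1 + freq[i] / 2;
}
```

Repeated halving flattens the distribution, which tends to produce a shallower tree on the next pass of the outer loop.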
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
void hbAssignCodes ( Int32 *code,
|
||||
UChar *length,
|
||||
Int32 minLen,
|
||||
Int32 maxLen,
|
||||
Int32 alphaSize )
|
||||
{
|
||||
Int32 n, vec, i;
|
||||
|
||||
vec = 0;
|
||||
for (n = minLen; n <= maxLen; n++) {
|
||||
for (i = 0; i < alphaSize; i++)
|
||||
if (length[i] == n) { code[i] = vec; vec++; };
|
||||
vec <<= 1;
|
||||
}
|
||||
}
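`hbAssignCodes` hands out canonical codes: symbols are numbered in order of increasing length, and the running value is shifted left each time the length grows. A standalone sketch with plain C types follows; `assign_codes` is an illustrative name, and the four-symbol alphabet in the note after it is made up for the example.

```c
/* Canonical code assignment: within each length, symbols receive
   consecutive values in index order; moving to the next length
   doubles the running value. */
void assign_codes(int *code, const unsigned char *length,
                  int minLen, int maxLen, int alphaSize)
{
    int n, i, vec = 0;
    for (n = minLen; n <= maxLen; n++) {
        for (i = 0; i < alphaSize; i++)
            if (length[i] == n) code[i] = vec++;
        vec <<= 1;
    }
}
```

For lengths `{2, 1, 3, 3}` this yields the codes `10`, `0`, `110` and `111` for symbols 0 to 3, which form a prefix-free set.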
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
void hbCreateDecodeTables ( Int32 *limit,
|
||||
Int32 *base,
|
||||
Int32 *perm,
|
||||
UChar *length,
|
||||
Int32 minLen,
|
||||
Int32 maxLen,
|
||||
Int32 alphaSize )
|
||||
{
|
||||
Int32 pp, i, j, vec;
|
||||
|
||||
pp = 0;
|
||||
for (i = minLen; i <= maxLen; i++)
|
||||
for (j = 0; j < alphaSize; j++)
|
||||
if (length[j] == i) { perm[pp] = j; pp++; };
|
||||
|
||||
for (i = 0; i < BZ_MAX_CODE_LEN; i++) base[i] = 0;
|
||||
for (i = 0; i < alphaSize; i++) base[length[i]+1]++;
|
||||
|
||||
for (i = 1; i < BZ_MAX_CODE_LEN; i++) base[i] += base[i-1];
|
||||
|
||||
for (i = 0; i < BZ_MAX_CODE_LEN; i++) limit[i] = 0;
|
||||
vec = 0;
|
||||
|
||||
for (i = minLen; i <= maxLen; i++) {
|
||||
vec += (base[i+1] - base[i]);
|
||||
limit[i] = vec-1;
|
||||
vec <<= 1;
|
||||
}
|
||||
for (i = minLen + 1; i <= maxLen; i++)
|
||||
base[i] = ((limit[i-1] + 1) << 1) - base[i];
|
||||
}
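The `limit`, `base` and `perm` tables support decoding one bit at a time: keep reading bits until the accumulated value is no larger than `limit` for the current length, then index `perm` with the value minus `base`. The sketch below rebuilds the tables with plain C types and adds a decode loop. The loop is not part of huffman.c and is an assumption about how the tables are meant to be consumed; `MAX_CODE_LEN` is a stand-in for `BZ_MAX_CODE_LEN`, and the input is a string of '0'/'1' characters for readability.

```c
#include <stddef.h>

#define MAX_CODE_LEN 23  /* assumed stand-in for BZ_MAX_CODE_LEN */

void create_decode_tables(int *limit, int *base, int *perm,
                          const unsigned char *length,
                          int minLen, int maxLen, int alphaSize)
{
    int pp = 0, i, j, vec = 0;

    /* perm lists symbols in order of increasing code length. */
    for (i = minLen; i <= maxLen; i++)
        for (j = 0; j < alphaSize; j++)
            if (length[j] == i) perm[pp++] = j;

    /* base[i] first counts the symbols with code length below i. */
    for (i = 0; i < MAX_CODE_LEN; i++) base[i] = limit[i] = 0;
    for (i = 0; i < alphaSize; i++) base[length[i] + 1]++;
    for (i = 1; i < MAX_CODE_LEN; i++) base[i] += base[i - 1];

    /* limit[i] is the largest code value of length i. */
    for (i = minLen; i <= maxLen; i++) {
        vec += base[i + 1] - base[i];
        limit[i] = vec - 1;
        vec <<= 1;
    }
    /* base[i] becomes the offset mapping a code value to a perm index. */
    for (i = minLen + 1; i <= maxLen; i++)
        base[i] = ((limit[i - 1] + 1) << 1) - base[i];
}

/* Decode one symbol from a string of '0'/'1' characters starting at *pos.
   Returns the symbol, or -1 if the bits run out or no code matches. */
int decode_symbol(const char *bits, size_t *pos,
                  const int *limit, const int *base, const int *perm,
                  int minLen, int maxLen)
{
    int zn = 0, zvec = 0;
    while (zn < maxLen) {
        char c = bits[*pos];
        if (c != '0' && c != '1') return -1;
        zvec = (zvec << 1) | (c - '0');
        (*pos)++;
        zn++;
        if (zn >= minLen && zvec <= limit[zn])
            return perm[zvec - base[zn]];
    }
    return -1;
}
```

With lengths `{2, 1, 3, 3}`, the bit string `0101101110` splits into `0`, `10`, `110`, `111`, `0`, which decodes to the symbols 1, 0, 2, 3, 1.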
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end huffman.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
25
libbz2.def
Normal file
@ -0,0 +1,25 @@
|
||||
LIBRARY LIBBZ2
|
||||
DESCRIPTION "libbzip2: library for data compression"
|
||||
EXPORTS
|
||||
bzCompressInit
|
||||
bzCompress
|
||||
bzCompressEnd
|
||||
bzDecompressInit
|
||||
bzDecompress
|
||||
bzDecompressEnd
|
||||
bzReadOpen
|
||||
bzReadClose
|
||||
bzReadGetUnused
|
||||
bzRead
|
||||
bzWriteOpen
|
||||
bzWrite
|
||||
bzWriteClose
|
||||
bzBuffToBuffCompress
|
||||
bzBuffToBuffDecompress
|
||||
bzlibVersion
|
||||
bzopen
|
||||
bzdopen
|
||||
bzread
|
||||
bzwrite
|
||||
bzflush
|
||||
bzclose
|
130
libbz2.dsp
Normal file
@ -0,0 +1,130 @@
|
||||
# Microsoft Developer Studio Project File - Name="libbz2" - Package Owner=<4>
|
||||
# Microsoft Developer Studio Generated Build File, Format Version 5.00
|
||||
# ** DO NOT EDIT **
|
||||
|
||||
# TARGTYPE "Win32 (x86) Dynamic-Link Library" 0x0102
|
||||
|
||||
CFG=libbz2 - Win32 Debug
|
||||
!MESSAGE This is not a valid makefile. To build this project, use NMAKE.
|
||||
!MESSAGE Use the [Export Makefile] command and run
|
||||
!MESSAGE
|
||||
!MESSAGE NMAKE /f "libbz2.mak".
|
||||
!MESSAGE
|
||||
!MESSAGE You can specify a configuration when running NMAKE
|
||||
!MESSAGE by defining a macro on the command line. For example:
|
||||
!MESSAGE
|
||||
!MESSAGE NMAKE /f "libbz2.mak" CFG="libbz2 - Win32 Debug"
|
||||
!MESSAGE
|
||||
!MESSAGE Possible choices for configuration are:
|
||||
!MESSAGE
|
||||
!MESSAGE "libbz2 - Win32 Release" (based on "Win32 (x86) Dynamic-Link Library")
|
||||
!MESSAGE "libbz2 - Win32 Debug" (based on "Win32 (x86) Dynamic-Link Library")
|
||||
!MESSAGE
|
||||
|
||||
# Begin Project
|
||||
# PROP Scc_ProjName ""
|
||||
# PROP Scc_LocalPath ""
|
||||
CPP=cl.exe
|
||||
MTL=midl.exe
|
||||
RSC=rc.exe
|
||||
|
||||
!IF "$(CFG)" == "libbz2 - Win32 Release"
|
||||
|
||||
# PROP BASE Use_MFC 0
|
||||
# PROP BASE Use_Debug_Libraries 0
|
||||
# PROP BASE Output_Dir "Release"
|
||||
# PROP BASE Intermediate_Dir "Release"
|
||||
# PROP BASE Target_Dir ""
|
||||
# PROP Use_MFC 0
|
||||
# PROP Use_Debug_Libraries 0
|
||||
# PROP Output_Dir "Release"
|
||||
# PROP Intermediate_Dir "Release"
|
||||
# PROP Ignore_Export_Lib 0
|
||||
# PROP Target_Dir ""
|
||||
# ADD BASE CPP /nologo /MT /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c
|
||||
# ADD CPP /nologo /MT /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c
|
||||
# ADD BASE MTL /nologo /D "NDEBUG" /mktyplib203 /o NUL /win32
|
||||
# ADD MTL /nologo /D "NDEBUG" /mktyplib203 /o NUL /win32
|
||||
# ADD BASE RSC /l 0x411 /d "NDEBUG"
|
||||
# ADD RSC /l 0x411 /d "NDEBUG"
|
||||
BSC32=bscmake.exe
|
||||
# ADD BASE BSC32 /nologo
|
||||
# ADD BSC32 /nologo
|
||||
LINK32=link.exe
|
||||
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /machine:I386
|
||||
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /machine:I386 /out:"libbz2.dll"
|
||||
|
||||
!ELSEIF "$(CFG)" == "libbz2 - Win32 Debug"
|
||||
|
||||
# PROP BASE Use_MFC 0
|
||||
# PROP BASE Use_Debug_Libraries 1
|
||||
# PROP BASE Output_Dir "Debug"
|
||||
# PROP BASE Intermediate_Dir "Debug"
|
||||
# PROP BASE Target_Dir ""
|
||||
# PROP Use_MFC 0
|
||||
# PROP Use_Debug_Libraries 1
|
||||
# PROP Output_Dir "Debug"
|
||||
# PROP Intermediate_Dir "Debug"
|
||||
# PROP Ignore_Export_Lib 0
|
||||
# PROP Target_Dir ""
|
||||
# ADD BASE CPP /nologo /MTd /W3 /Gm /GX /Zi /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /c
|
||||
# ADD CPP /nologo /MTd /W3 /Gm /GX /Zi /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /c
|
||||
# ADD BASE MTL /nologo /D "_DEBUG" /mktyplib203 /o NUL /win32
|
||||
# ADD MTL /nologo /D "_DEBUG" /mktyplib203 /o NUL /win32
|
||||
# ADD BASE RSC /l 0x411 /d "_DEBUG"
|
||||
# ADD RSC /l 0x411 /d "_DEBUG"
|
||||
BSC32=bscmake.exe
|
||||
# ADD BASE BSC32 /nologo
|
||||
# ADD BSC32 /nologo
|
||||
LINK32=link.exe
|
||||
# ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /debug /machine:I386 /pdbtype:sept
|
||||
# ADD LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /debug /machine:I386 /out:"libbz2.dll" /pdbtype:sept
|
||||
|
||||
!ENDIF
|
||||
|
||||
# Begin Target
|
||||
|
||||
# Name "libbz2 - Win32 Release"
|
||||
# Name "libbz2 - Win32 Debug"
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\blocksort.c
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\bzlib.c
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\bzlib.h
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\bzlib_private.h
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\compress.c
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\crctable.c
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\decompress.c
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\huffman.c
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\libbz2.def
|
||||
# End Source File
|
||||
# Begin Source File
|
||||
|
||||
SOURCE=.\randtable.c
|
||||
# End Source File
|
||||
# End Target
|
||||
# End Project
|
2100
manual.texi
Normal file
File diff suppressed because it is too large
124
randtable.c
Normal file
@ -0,0 +1,124 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Table for randomising repetitive blocks ---*/
|
||||
/*--- randtable.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0c of 18 October 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
|
||||
/*---------------------------------------------*/
|
||||
Int32 rNums[512] = {
|
||||
619, 720, 127, 481, 931, 816, 813, 233, 566, 247,
|
||||
985, 724, 205, 454, 863, 491, 741, 242, 949, 214,
|
||||
733, 859, 335, 708, 621, 574, 73, 654, 730, 472,
|
||||
419, 436, 278, 496, 867, 210, 399, 680, 480, 51,
|
||||
878, 465, 811, 169, 869, 675, 611, 697, 867, 561,
|
||||
862, 687, 507, 283, 482, 129, 807, 591, 733, 623,
|
||||
150, 238, 59, 379, 684, 877, 625, 169, 643, 105,
|
||||
170, 607, 520, 932, 727, 476, 693, 425, 174, 647,
|
||||
73, 122, 335, 530, 442, 853, 695, 249, 445, 515,
|
||||
909, 545, 703, 919, 874, 474, 882, 500, 594, 612,
|
||||
641, 801, 220, 162, 819, 984, 589, 513, 495, 799,
|
||||
161, 604, 958, 533, 221, 400, 386, 867, 600, 782,
|
||||
382, 596, 414, 171, 516, 375, 682, 485, 911, 276,
|
||||
98, 553, 163, 354, 666, 933, 424, 341, 533, 870,
|
||||
227, 730, 475, 186, 263, 647, 537, 686, 600, 224,
|
||||
469, 68, 770, 919, 190, 373, 294, 822, 808, 206,
|
||||
184, 943, 795, 384, 383, 461, 404, 758, 839, 887,
|
||||
715, 67, 618, 276, 204, 918, 873, 777, 604, 560,
|
||||
951, 160, 578, 722, 79, 804, 96, 409, 713, 940,
|
||||
652, 934, 970, 447, 318, 353, 859, 672, 112, 785,
|
||||
645, 863, 803, 350, 139, 93, 354, 99, 820, 908,
|
||||
609, 772, 154, 274, 580, 184, 79, 626, 630, 742,
|
||||
653, 282, 762, 623, 680, 81, 927, 626, 789, 125,
|
||||
411, 521, 938, 300, 821, 78, 343, 175, 128, 250,
|
||||
170, 774, 972, 275, 999, 639, 495, 78, 352, 126,
|
||||
857, 956, 358, 619, 580, 124, 737, 594, 701, 612,
|
||||
669, 112, 134, 694, 363, 992, 809, 743, 168, 974,
|
||||
944, 375, 748, 52, 600, 747, 642, 182, 862, 81,
|
||||
344, 805, 988, 739, 511, 655, 814, 334, 249, 515,
|
||||
897, 955, 664, 981, 649, 113, 974, 459, 893, 228,
|
||||
433, 837, 553, 268, 926, 240, 102, 654, 459, 51,
|
||||
686, 754, 806, 760, 493, 403, 415, 394, 687, 700,
|
||||
946, 670, 656, 610, 738, 392, 760, 799, 887, 653,
|
||||
978, 321, 576, 617, 626, 502, 894, 679, 243, 440,
|
||||
680, 879, 194, 572, 640, 724, 926, 56, 204, 700,
|
||||
707, 151, 457, 449, 797, 195, 791, 558, 945, 679,
|
||||
297, 59, 87, 824, 713, 663, 412, 693, 342, 606,
|
||||
134, 108, 571, 364, 631, 212, 174, 643, 304, 329,
|
||||
343, 97, 430, 751, 497, 314, 983, 374, 822, 928,
|
||||
140, 206, 73, 263, 980, 736, 876, 478, 430, 305,
|
||||
170, 514, 364, 692, 829, 82, 855, 953, 676, 246,
|
||||
369, 970, 294, 750, 807, 827, 150, 790, 288, 923,
|
||||
804, 378, 215, 828, 592, 281, 565, 555, 710, 82,
|
||||
896, 831, 547, 261, 524, 462, 293, 465, 502, 56,
|
||||
661, 821, 976, 991, 658, 869, 905, 758, 745, 193,
|
||||
768, 550, 608, 933, 378, 286, 215, 979, 792, 961,
|
||||
61, 688, 793, 644, 986, 403, 106, 366, 905, 644,
|
||||
372, 567, 466, 434, 645, 210, 389, 550, 919, 135,
|
||||
780, 773, 635, 389, 707, 100, 626, 958, 165, 504,
|
||||
920, 176, 193, 713, 857, 265, 203, 50, 668, 108,
|
||||
645, 990, 626, 197, 510, 357, 358, 850, 858, 364,
|
||||
936, 638
|
||||
};
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end randtable.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
9
test.bat
@ -1,9 +0,0 @@
|
||||
@rem
|
||||
@rem MSDOS test driver for bzip2
|
||||
@rem
|
||||
type words1
|
||||
.\bzip2 -1 < sample1.ref > sample1.rbz
|
||||
.\bzip2 -2 < sample2.ref > sample2.rbz
|
||||
.\bzip2 -dvv < sample1.bz2 > sample1.tst
|
||||
.\bzip2 -dvv < sample2.bz2 > sample2.tst
|
||||
type words3sh
|
9
test.cmd
@ -1,9 +0,0 @@
|
||||
@rem
|
||||
@rem OS/2 test driver for bzip2
|
||||
@rem
|
||||
type words1
|
||||
.\bzip2 -1 < sample1.ref > sample1.rbz
|
||||
.\bzip2 -2 < sample2.ref > sample2.rbz
|
||||
.\bzip2 -dvv < sample1.bz2 > sample1.tst
|
||||
.\bzip2 -dvv < sample2.bz2 > sample2.tst
|
||||
type words3sh
|
7
words0
@ -1,7 +0,0 @@
|
||||
***-------------------------------------------------***
|
||||
***--------- IMPORTANT: READ WHAT FOLLOWS! ---------***
|
||||
***--------- viz: pay attention :-) ---------***
|
||||
***-------------------------------------------------***
|
||||
|
||||
Compiling bzip2 ...
|
||||
|
1
words1
@ -1,5 +1,4 @@
|
||||
|
||||
|
||||
Doing 4 tests (2 compress, 2 uncompress) ...
|
||||
If there's a problem, things might stop at this point.
|
||||
|
||||
|
1
words2
@ -1,5 +1,4 @@
|
||||
|
||||
|
||||
Checking test results. If any of the four "cmp"s which follow
|
||||
report any differences, something is wrong. If you can't easily
|
||||
figure out what, please let me know (jseward@acm.org).
|
||||
|
21
words3
@ -1,23 +1,20 @@
|
||||
|
||||
|
||||
If you got this far and the "cmp"s didn't find anything amiss, looks
|
||||
like you're in business. You should install bzip2 and bunzip2:
|
||||
like you're in business. You should install bzip2, bunzip2 and bzcat:
|
||||
|
||||
copy bzip2 to a public place, maybe /usr/bin.
|
||||
In that public place, make bunzip2 a symbolic link
|
||||
to the bzip2 you just copied there.
|
||||
Copy bzip2 and bzip2recover to a public place, maybe /usr/bin.
|
||||
In that public place, make bunzip2 and bzcat be
|
||||
symbolic links to the bzip2 you just copied there.
|
||||
Put the manual page, bzip2.1, somewhere appropriate;
|
||||
perhaps in /usr/man/man1.
|
||||
|
||||
Complete instructions for use are in the preformatted
|
||||
manual page, in the file bzip2.1.preformatted.
|
||||
Instructions for use are in the preformatted manual page, in the file
|
||||
bzip2.txt. For more detailed documentation, read the full manual.
|
||||
It is available in Postscript form (manual.ps) and HTML form
|
||||
(manual_toc.html).
|
||||
|
||||
You can also do "bzip2 --help" to see some helpful information.
|
||||
|
||||
"bzip2 -L" displays the software license.
|
||||
|
||||
Please read the README file carefully.
|
||||
Finally, note that bzip2 comes with ABSOLUTELY NO WARRANTY.
|
||||
|
||||
Happy compressing!
|
||||
Happy compressing. -- JRS, 30 August 1998.
|
||||
|
||||
|
12
words3sh
@ -1,12 +0,0 @@
|
||||
If you got this far and the "bzip2 -dvv"s give identical
|
||||
stored vs computed CRCs, you're probably in business.
|
||||
Complete instructions for use are in the preformatted manual page,
|
||||
in the file bzip2.txt.
|
||||
|
||||
You can also do "bzip2 --help" to see some helpful information.
|
||||
"bzip2 -L" displays the software license.
|
||||
|
||||
Please read the README file carefully.
|
||||
Finally, note that bzip2 comes with ABSOLUTELY NO WARRANTY.
|
||||
|
||||
Happy compressing!
|