Merge remote-tracking branch 'refs/remotes/Cyan4973/dev' into dev

This commit is contained in:
inikep 2016-02-05 09:14:24 +01:00
commit 8deaf1d47e
51 changed files with 11747 additions and 1938 deletions

View File

@ -36,6 +36,7 @@
PRGDIR = programs
ZSTDDIR = lib
DICTDIR = dictBuilder
# Define nul output
ifneq (,$(filter Windows%,$(OS)))
@ -51,6 +52,7 @@ default: zstdprogram
all:
$(MAKE) -C $(ZSTDDIR) $@
$(MAKE) -C $(PRGDIR) $@
$(MAKE) -C $(DICTDIR) $@
zstdprogram:
$(MAKE) -C $(PRGDIR)
@ -58,6 +60,7 @@ zstdprogram:
clean:
@$(MAKE) -C $(ZSTDDIR) $@ > $(VOID)
@$(MAKE) -C $(PRGDIR) $@ > $(VOID)
@$(MAKE) -C $(DICTDIR) $@ > $(VOID)
@echo Cleaning completed
@ -78,6 +81,7 @@ travis-install:
test:
$(MAKE) -C $(PRGDIR) $@
$(MAKE) -C $(DICTDIR) $@
cmaketest:
cd contrib/cmake ; cmake . ; $(MAKE)

4
NEWS
View File

@ -1,3 +1,7 @@
v0.5.0
New : dictionary builder utility
Changed : streaming & dictionary API
v0.4.7
Improved : small compression speed improvement in HC mode
Changed : `zstd_decompress.c` has ZSTD_LEGACY_SUPPORT to 0 by default

339
dictBuilder/COPYING Normal file
View File

@ -0,0 +1,339 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.

69
dictBuilder/Makefile Normal file
View File

@ -0,0 +1,69 @@
# ##########################################################################
# Dict Builder - Makefile
# Copyright (C) Yann Collet 2015
#
# GPL v2 License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
#
# You can contact the author at :
# - ZSTD source repository : http://code.google.com/p/zstd/
# - Public forum : https://groups.google.com/forum/#!forum/lz4c
# ##########################################################################
CPPFLAGS= -I../lib
CFLAGS ?= -O3
CFLAGS += -std=c99 -Wall -Wextra -Wshadow -Wcast-qual -Wcast-align -Wundef -Wstrict-prototypes -Wstrict-aliasing=1
FLAGS = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $(MOREFLAGS)
ZSTDDIR = ../lib
# Define *.exe as extension for Windows systems
ifneq (,$(filter Windows%,$(OS)))
EXT =.exe
VOID = nul
else
EXT =
VOID = /dev/null
endif
.PHONY: default all test
default: dictBuilder
all: dictBuilder
dictBuilder: dictBuilder.c dibcli.c divsufsort.c sssort.c trsort.c $(ZSTDDIR)/huff0.c $(ZSTDDIR)/fse.c $(ZSTDDIR)/zstd_decompress.c
$(CC) $(FLAGS) $^ -o $@$(EXT)
clean:
@rm -f core *.o tmp* result* *.gcda \
dictBuilder$(EXT)
@echo Cleaning completed
test: dictBuilder
./dictBuilder *
@rm dictionary
clangtest: CC = clang
clangtest: CFLAGS += -Werror
clangtest: clean dictBuilder
gpptest: CC = g++
gpptest: CFLAGS=-O3 -Wall -Wextra -Wshadow -Wcast-align -Wcast-qual -Wundef -Werror
gpptest: clean dictBuilder

83
dictBuilder/config.h Normal file
View File

@ -0,0 +1,83 @@
/*
* config.h for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _CONFIG_H
#define _CONFIG_H 1
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
/** Define to the version of this package. **/
#define PROJECT_VERSION_FULL "2.0.1"
/** Define to 1 if you have the header files. **/
#define HAVE_INTTYPES_H 1
#define HAVE_STDDEF_H 1
#define HAVE_STDINT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRING_H 1
#define HAVE_STRINGS_H 1
#define HAVE_MEMORY_H 1
#define HAVE_SYS_TYPES_H 1
/** for WinIO **/
/* #undef HAVE_IO_H */
/* #undef HAVE_FCNTL_H */
/* #undef HAVE__SETMODE */
/* #undef HAVE_SETMODE */
/* #undef HAVE__FILENO */
/* #undef HAVE_FOPEN_S */
/* #undef HAVE__O_BINARY */
/*
#ifndef HAVE__SETMODE
# if HAVE_SETMODE
# define _setmode setmode
# define HAVE__SETMODE 1
# endif
# if HAVE__SETMODE && !HAVE__O_BINARY
# define _O_BINARY 0
# define HAVE__O_BINARY 1
# endif
#endif
*/
/** for inline **/
#ifndef INLINE
# define INLINE inline
#endif
/** for VC++ warning **/
#ifdef _MSC_VER
#pragma warning(disable: 4127)
#endif
#ifdef __cplusplus
} /* extern "C" */
#endif /* __cplusplus */
#endif /* _CONFIG_H */

263
dictBuilder/dibcli.c Normal file
View File

@ -0,0 +1,263 @@
/*
dibcli - Command Line Interface (cli) for Dictionary Builder
Copyright (C) Yann Collet 2016
GPL v2 License
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
You can contact the author at :
- zstd source repository : https://github.com/Cyan4973/zstd
*/
/* **************************************
* Compiler Specifics
****************************************/
/* Disable some Visual warning messages */
#ifdef _MSC_VER
# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
#endif
/*-************************************
* Includes
**************************************/
#include <stdlib.h> /* exit, calloc, free */
#include <string.h> /* strcmp, strlen */
#include <stdio.h> /* fprintf, getchar */
#include "dictBuilder.h"
/*-************************************
* Constants
**************************************/
#define PROGRAM_DESCRIPTION "Dictionary builder"
#ifndef PROGRAM_VERSION
# define QUOTE(str) #str
# define EXP_Q(str) QUOTE(str)
# define PROGRAM_VERSION "v" EXP_Q(DiB_VERSION_MAJOR) "." EXP_Q(DiB_VERSION_MINOR) "." EXP_Q(DiB_VERSION_RELEASE)
#endif
#define AUTHOR "Yann Collet"
#define WELCOME_MESSAGE "*** %s %s %i-bits, by %s ***\n", PROGRAM_DESCRIPTION, PROGRAM_VERSION, (int)(sizeof(void*)*8), AUTHOR
#define KB *(1 <<10)
#define MB *(1 <<20)
#define GB *(1U<<30)
static const unsigned compressionLevelDefault = 5;
static const unsigned selectionLevelDefault = 9; /* determined experimentally */
static const unsigned maxDictSizeDefault = 110 KB;
static const char* dictFileNameDefault = "dictionary";
/*-************************************
* Display Macros
**************************************/
#define DISPLAY(...) fprintf(g_displayOut, __VA_ARGS__)
#define DISPLAYLEVEL(l, ...) if (g_displayLevel>=l) { DISPLAY(__VA_ARGS__); }
static FILE* g_displayOut;
static unsigned g_displayLevel = 2; // 0 : no display // 1: errors // 2 : + result + interaction + warnings ; // 3 : + progression; // 4 : + information
/*-************************************
* Exceptions
**************************************/
#define DEBUG 0
#define DEBUGOUTPUT(...) if (DEBUG) DISPLAY(__VA_ARGS__);
#define EXM_THROW(error, ...) \
{ \
DEBUGOUTPUT("Error defined at %s, line %i : \n", __FILE__, __LINE__); \
DISPLAYLEVEL(1, "Error %i : ", error); \
DISPLAYLEVEL(1, __VA_ARGS__); \
DISPLAYLEVEL(1, "\n"); \
exit(error); \
}
/*-************************************
* Command Line
**************************************/
static int usage(const char* programName)
{
DISPLAY( "Usage :\n");
DISPLAY( " %s [arg] [filenames]\n", programName);
DISPLAY( "\n");
DISPLAY( "Arguments :\n");
DISPLAY( " -o : name of dictionary file (default: %s) \n", dictFileNameDefault);
DISPLAY( "--maxdict : limit dictionary to specified size (default : %u) \n", maxDictSizeDefault);
DISPLAY( " -h/-H : display help/long help and exit\n");
return 0;
}
static int usage_advanced(const char* programName)
{
DISPLAY(WELCOME_MESSAGE);
usage(programName);
DISPLAY( "\n");
DISPLAY( "Advanced arguments :\n");
DISPLAY( " -V : display Version number and exit\n");
DISPLAY( "--fast : fast sampling mode\n");
DISPLAY( " -L# : target compression level (default: %u)\n", compressionLevelDefault);
DISPLAY( " -S# : dictionary selectivity level (default: %u)\n", selectionLevelDefault);
DISPLAY( " -v : verbose mode\n");
DISPLAY( " -q : suppress notifications; specify twice to suppress errors too\n");
return 0;
}
static int badusage(const char* programName)
{
DISPLAYLEVEL(1, "Incorrect parameters\n");
if (g_displayLevel >= 1) usage(programName);
return 1;
}
static void waitEnter(void)
{
int unused;
DISPLAY("Press enter to continue...\n");
unused = getchar();
(void)unused;
}
int main(int argCount, const char** argv)
{
int i,
main_pause=0,
operationResult=0,
nextArgumentIsMaxDict=0,
nextArgumentIsDictFileName=0;
unsigned cLevel = compressionLevelDefault;
unsigned maxDictSize = maxDictSizeDefault;
unsigned selectionLevel = selectionLevelDefault;
const char** filenameTable = (const char**)malloc(argCount * sizeof(const char*)); /* argCount >= 1 */
unsigned filenameIdx = 0;
const char* programName = argv[0];
const char* dictFileName = dictFileNameDefault;
/* init */
g_displayOut = stderr; /* unfortunately, cannot be set at declaration */
if (filenameTable==NULL) EXM_THROW(1, "not enough memory\n");
/* Pick out program name from path. Don't rely on stdlib because of conflicting behavior */
for (i = (int)strlen(programName); i > 0; i--) { if ((programName[i] == '/') || (programName[i] == '\\')) { i++; break; } }
programName += i;
/* command switches */
for(i=1; i<argCount; i++) {
const char* argument = argv[i];
if(!argument) continue; /* Protection if argument empty */
if (nextArgumentIsDictFileName) {
nextArgumentIsDictFileName=0;
dictFileName = argument;
continue;
}
if (nextArgumentIsMaxDict) {
nextArgumentIsMaxDict = 0;
maxDictSize = 0;
while ((*argument>='0') && (*argument<='9'))
maxDictSize = maxDictSize * 10 + (*argument - '0'), argument++;
if (*argument=='k' || *argument=='K')
maxDictSize <<= 10;
continue;
}
/* long commands (--long-word) */
if (!strcmp(argument, "--version")) { g_displayOut=stdout; DISPLAY(WELCOME_MESSAGE); return 0; }
if (!strcmp(argument, "--help")) { g_displayOut=stdout; return usage_advanced(programName); }
if (!strcmp(argument, "--verbose")) { g_displayLevel++; if (g_displayLevel<3) g_displayLevel=3; continue; }
if (!strcmp(argument, "--quiet")) { g_displayLevel--; continue; }
if (!strcmp(argument, "--maxdict")) { nextArgumentIsMaxDict=1; continue; }
if (!strcmp(argument, "--fast")) { selectionLevel=1; cLevel=1; continue; }
/* Decode commands (note : aggregated commands are allowed) */
if (argument[0]=='-') {
argument++;
while (argument[0]!=0) {
switch(argument[0])
{
/* Display help */
case 'V': g_displayOut=stdout; DISPLAY(WELCOME_MESSAGE); return 0; /* Version Only */
case 'H':
case 'h': g_displayOut=stdout; return usage_advanced(programName);
/* Selection level */
case 'S': argument++;
selectionLevel = 0;
while ((*argument >= '0') && (*argument <= '9'))
selectionLevel *= 10, selectionLevel += *argument++ - '0';
break;
/* Selection level */
case 'L': argument++;
cLevel = 0;
while ((*argument >= '0') && (*argument <= '9'))
cLevel *= 10, cLevel += *argument++ - '0';
break;
/* Verbose mode */
case 'v': g_displayLevel++; if (g_displayLevel<3) g_displayLevel=3; argument++; break;
/* Quiet mode */
case 'q': g_displayLevel--; argument++; break;
/* dictionary name */
case 'o': nextArgumentIsDictFileName=1; argument++; break;
/* Pause at the end (hidden option) */
case 'p': main_pause=1; argument++; break;
/* unknown command */
default : return badusage(programName);
} }
continue;
}
/* add filename to list */
filenameTable[filenameIdx++] = argument;
}
/* Welcome message (if verbose) */
DISPLAYLEVEL(3, WELCOME_MESSAGE);
/* check nb files */
if (filenameIdx==0) return badusage(programName);
if (filenameIdx < 100)
{
DISPLAYLEVEL(2, "Warning : set contains only %u files ... \n", filenameIdx);
DISPLAYLEVEL(3, "!! For better results, consider providing > 1.000 samples !!\n");
DISPLAYLEVEL(3, "!! Each sample should preferably be stored as a separate file !!\n");
}
/* building ... */
{
DiB_params_t param;
param.selectivityLevel = selectionLevel;
param.compressionLevel = cLevel;
DiB_setNotificationLevel(g_displayLevel);
operationResult = DiB_trainFromFiles(dictFileName, maxDictSize,
filenameTable, filenameIdx,
param);
}
if (main_pause) waitEnter();
free((void*)filenameTable);
return operationResult;
}

1020
dictBuilder/dictBuilder.c Normal file

File diff suppressed because it is too large Load Diff

94
dictBuilder/dictBuilder.h Normal file
View File

@ -0,0 +1,94 @@
/*
dictBuilder.h
Copyright (C) Yann Collet 2016
GPL v2 License
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
You can contact the author at :
- zstd source repository : https://github.com/Cyan4973/zstd
- ztsd public forum : https://groups.google.com/forum/#!forum/lz4c
*/
/* This library is designed for a single-threaded console application.
* It exit() and printf() into stderr when it encounters an error condition. */
#ifndef DICTBUILDER_H_001
#define DICTBUILDER_H_001
/*-*************************************
* Version
***************************************/
#define DiB_VERSION_MAJOR 0 /* for breaking interface changes */
#define DiB_VERSION_MINOR 0 /* for new (non-breaking) interface capabilities */
#define DiB_VERSION_RELEASE 1 /* for tweaks, bug-fixes, or development */
#define DiB_VERSION_NUMBER (DiB_VERSION_MAJOR *100*100 + DiB_VERSION_MINOR *100 + DiB_VERSION_RELEASE)
unsigned DiB_versionNumber (void);
/*-*************************************
* Public type
***************************************/
typedef struct {
unsigned selectivityLevel; /* 0 means default; larger => bigger selection => larger dictionary */
unsigned compressionLevel; /* 0 means default; target a specific zstd compression level */
} DiB_params_t;
/*-*************************************
* Public functions
***************************************/
/*! DiB_trainFromBuffer
Train a dictionary from a memory buffer @samplesBuffer
where @nbSamples samples have been stored concatenated.
Each sample size is provided into an orderly table @sampleSizes.
Resulting dictionary will be saved into @dictBuffer.
@parameters is optional and can be provided with 0 values to mean "default".
@result : size of dictionary stored into @dictBuffer (<= @dictBufferSize)
or an error code, which can be tested by DiB_isError().
note : DiB_trainFromBuffer() will send notifications into stderr if instructed to, using DiB_setNotificationLevel()
*/
size_t DiB_trainFromBuffer(void* dictBuffer, size_t dictBufferSize,
const void* samplesBuffer, const size_t* sampleSizes, unsigned nbSamples,
DiB_params_t parameters);
/*! DiB_trainFromFiles
Train a dictionary from a set of files provided by @fileNamesTable
Resulting dictionary is written into file @dictFileName.
@parameters is optional and can be provided with 0 values.
@result : 0 == ok. Any other : error.
*/
int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize,
const char** fileNamesTable, unsigned nbFiles,
DiB_params_t parameters);
/*-*************************************
* Helper functions
***************************************/
unsigned DiB_isError(size_t errorCode);
const char* DiB_getErrorName(size_t errorCode);
/*! DiB_setNotificationLevel
Set amount of notification to be displayed on the console.
default initial value : 0 = no console notification.
Note : not thread-safe (use a global constant)
*/
void DiB_setNotificationLevel(unsigned l);
#endif

404
dictBuilder/divsufsort.c Normal file
View File

@ -0,0 +1,404 @@
/*
* divsufsort.c for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
/*- Compiler specifics -*/
#ifdef __clang__
#pragma clang diagnostic ignored "-Wshorten-64-to-32"
#endif
/*- Dependencies -*/
#include "divsufsort_private.h"
#ifdef _OPENMP
# include <omp.h>
#endif
/*- Private Functions -*/
/* Sorts suffixes of type B*. */
static
saidx_t
sort_typeBstar(const sauchar_t *T, saidx_t *SA,
saidx_t *bucket_A, saidx_t *bucket_B,
saidx_t n) {
saidx_t *PAb, *ISAb, *buf;
#ifdef _OPENMP
saidx_t *curbuf;
saidx_t l;
#endif
saidx_t i, j, k, t, m, bufsize;
saint_t c0, c1;
#ifdef _OPENMP
saint_t d0, d1;
int tmp;
#endif
/* Initialize bucket arrays. */
for(i = 0; i < BUCKET_A_SIZE; ++i) { bucket_A[i] = 0; }
for(i = 0; i < BUCKET_B_SIZE; ++i) { bucket_B[i] = 0; }
/* Count the number of occurrences of the first one or two characters of each
type A, B and B* suffix. Moreover, store the beginning position of all
type B* suffixes into the array SA. */
for(i = n - 1, m = n, c0 = T[n - 1]; 0 <= i;) {
/* type A suffix. */
do { ++BUCKET_A(c1 = c0); } while((0 <= --i) && ((c0 = T[i]) >= c1));
if(0 <= i) {
/* type B* suffix. */
++BUCKET_BSTAR(c0, c1);
SA[--m] = i;
/* type B suffix. */
for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) <= c1); --i, c1 = c0) {
++BUCKET_B(c0, c1);
}
}
}
m = n - m;
/*
note:
A type B* suffix is lexicographically smaller than a type B suffix that
begins with the same first two characters.
*/
/* Calculate the index of start/end point of each bucket. */
for(c0 = 0, i = 0, j = 0; c0 < ALPHABET_SIZE; ++c0) {
t = i + BUCKET_A(c0);
BUCKET_A(c0) = i + j; /* start point */
i = t + BUCKET_B(c0, c0);
for(c1 = c0 + 1; c1 < ALPHABET_SIZE; ++c1) {
j += BUCKET_BSTAR(c0, c1);
BUCKET_BSTAR(c0, c1) = j; /* end point */
i += BUCKET_B(c0, c1);
}
}
if(0 < m) {
/* Sort the type B* suffixes by their first two characters. */
PAb = SA + n - m; ISAb = SA + m;
for(i = m - 2; 0 <= i; --i) {
t = PAb[i], c0 = T[t], c1 = T[t + 1];
SA[--BUCKET_BSTAR(c0, c1)] = i;
}
t = PAb[m - 1], c0 = T[t], c1 = T[t + 1];
SA[--BUCKET_BSTAR(c0, c1)] = m - 1;
/* Sort the type B* substrings using sssort. */
#ifdef _OPENMP
tmp = omp_get_max_threads();
buf = SA + m, bufsize = (n - (2 * m)) / tmp;
c0 = ALPHABET_SIZE - 2, c1 = ALPHABET_SIZE - 1, j = m;
#pragma omp parallel default(shared) private(curbuf, k, l, d0, d1, tmp)
{
tmp = omp_get_thread_num();
curbuf = buf + tmp * bufsize;
k = 0;
for(;;) {
#pragma omp critical(sssort_lock)
{
if(0 < (l = j)) {
d0 = c0, d1 = c1;
do {
k = BUCKET_BSTAR(d0, d1);
if(--d1 <= d0) {
d1 = ALPHABET_SIZE - 1;
if(--d0 < 0) { break; }
}
} while(((l - k) <= 1) && (0 < (l = k)));
c0 = d0, c1 = d1, j = k;
}
}
if(l == 0) { break; }
sssort(T, PAb, SA + k, SA + l,
curbuf, bufsize, 2, n, *(SA + k) == (m - 1));
}
}
#else
buf = SA + m, bufsize = n - (2 * m);
for(c0 = ALPHABET_SIZE - 2, j = m; 0 < j; --c0) {
for(c1 = ALPHABET_SIZE - 1; c0 < c1; j = i, --c1) {
i = BUCKET_BSTAR(c0, c1);
if(1 < (j - i)) {
sssort(T, PAb, SA + i, SA + j,
buf, bufsize, 2, n, *(SA + i) == (m - 1));
}
}
}
#endif
/* Compute ranks of type B* substrings. */
for(i = m - 1; 0 <= i; --i) {
if(0 <= SA[i]) {
j = i;
do { ISAb[SA[i]] = i; } while((0 <= --i) && (0 <= SA[i]));
SA[i + 1] = i - j;
if(i <= 0) { break; }
}
j = i;
do { ISAb[SA[i] = ~SA[i]] = j; } while(SA[--i] < 0);
ISAb[SA[i]] = j;
}
/* Construct the inverse suffix array of type B* suffixes using trsort. */
trsort(ISAb, SA, m, 1);
/* Set the sorted order of tyoe B* suffixes. */
for(i = n - 1, j = m, c0 = T[n - 1]; 0 <= i;) {
for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) >= c1); --i, c1 = c0) { }
if(0 <= i) {
t = i;
for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) <= c1); --i, c1 = c0) { }
SA[ISAb[--j]] = ((t == 0) || (1 < (t - i))) ? t : ~t;
}
}
/* Calculate the index of start/end point of each bucket. */
BUCKET_B(ALPHABET_SIZE - 1, ALPHABET_SIZE - 1) = n; /* end point */
for(c0 = ALPHABET_SIZE - 2, k = m - 1; 0 <= c0; --c0) {
i = BUCKET_A(c0 + 1) - 1;
for(c1 = ALPHABET_SIZE - 1; c0 < c1; --c1) {
t = i - BUCKET_B(c0, c1);
BUCKET_B(c0, c1) = i; /* end point */
/* Move all type B* suffixes to the correct position. */
for(i = t, j = BUCKET_BSTAR(c0, c1);
j <= k;
--i, --k) { SA[i] = SA[k]; }
}
BUCKET_BSTAR(c0, c0 + 1) = i - BUCKET_B(c0, c0) + 1; /* start point */
BUCKET_B(c0, c0) = i; /* end point */
}
}
return m;
}
/* Constructs the suffix array by using the sorted order of type B* suffixes. */
static
void
construct_SA(const sauchar_t *T, saidx_t *SA,
saidx_t *bucket_A, saidx_t *bucket_B,
saidx_t n, saidx_t m) {
saidx_t *i, *j, *k;
saidx_t s;
saint_t c0, c1, c2;
if(0 < m) {
/* Construct the sorted order of type B suffixes by using
the sorted order of type B* suffixes. */
for(c1 = ALPHABET_SIZE - 2; 0 <= c1; --c1) {
/* Scan the suffix array from right to left. */
for(i = SA + BUCKET_BSTAR(c1, c1 + 1),
j = SA + BUCKET_A(c1 + 1) - 1, k = NULL, c2 = -1;
i <= j;
--j) {
if(0 < (s = *j)) {
assert(T[s] == c1);
assert(((s + 1) < n) && (T[s] <= T[s + 1]));
assert(T[s - 1] <= T[s]);
*j = ~s;
c0 = T[--s];
if((0 < s) && (T[s - 1] > c0)) { s = ~s; }
if(c0 != c2) {
if(0 <= c2) { BUCKET_B(c2, c1) = k - SA; }
k = SA + BUCKET_B(c2 = c0, c1);
}
assert(k < j);
*k-- = s;
} else {
assert(((s == 0) && (T[s] == c1)) || (s < 0));
*j = ~s;
}
}
}
}
/* Construct the suffix array by using
the sorted order of type B suffixes. */
k = SA + BUCKET_A(c2 = T[n - 1]);
*k++ = (T[n - 2] < c2) ? ~(n - 1) : (n - 1);
/* Scan the suffix array from left to right. */
for(i = SA, j = SA + n; i < j; ++i) {
if(0 < (s = *i)) {
assert(T[s - 1] >= T[s]);
c0 = T[--s];
if((s == 0) || (T[s - 1] < c0)) { s = ~s; }
if(c0 != c2) {
BUCKET_A(c2) = k - SA;
k = SA + BUCKET_A(c2 = c0);
}
assert(i < k);
*k++ = s;
} else {
assert(s < 0);
*i = ~s;
}
}
}
/* Constructs the burrows-wheeler transformed string directly
by using the sorted order of type B* suffixes. */
static
saidx_t
construct_BWT(const sauchar_t *T, saidx_t *SA,
saidx_t *bucket_A, saidx_t *bucket_B,
saidx_t n, saidx_t m) {
saidx_t *i, *j, *k, *orig;
saidx_t s;
saint_t c0, c1, c2;
if(0 < m) {
/* Construct the sorted order of type B suffixes by using
the sorted order of type B* suffixes. */
for(c1 = ALPHABET_SIZE - 2; 0 <= c1; --c1) {
/* Scan the suffix array from right to left. */
for(i = SA + BUCKET_BSTAR(c1, c1 + 1),
j = SA + BUCKET_A(c1 + 1) - 1, k = NULL, c2 = -1;
i <= j;
--j) {
if(0 < (s = *j)) {
assert(T[s] == c1);
assert(((s + 1) < n) && (T[s] <= T[s + 1]));
assert(T[s - 1] <= T[s]);
c0 = T[--s];
*j = ~((saidx_t)c0);
if((0 < s) && (T[s - 1] > c0)) { s = ~s; }
if(c0 != c2) {
if(0 <= c2) { BUCKET_B(c2, c1) = k - SA; }
k = SA + BUCKET_B(c2 = c0, c1);
}
assert(k < j);
*k-- = s;
} else if(s != 0) {
*j = ~s;
#ifndef NDEBUG
} else {
assert(T[s] == c1);
#endif
}
}
}
}
/* Construct the BWTed string by using
the sorted order of type B suffixes. */
k = SA + BUCKET_A(c2 = T[n - 1]);
*k++ = (T[n - 2] < c2) ? ~((saidx_t)T[n - 2]) : (n - 1);
/* Scan the suffix array from left to right. */
for(i = SA, j = SA + n, orig = SA; i < j; ++i) {
if(0 < (s = *i)) {
assert(T[s - 1] >= T[s]);
c0 = T[--s];
*i = c0;
if((0 < s) && (T[s - 1] < c0)) { s = ~((saidx_t)T[s - 1]); }
if(c0 != c2) {
BUCKET_A(c2) = k - SA;
k = SA + BUCKET_A(c2 = c0);
}
assert(i < k);
*k++ = s;
} else if(s != 0) {
*i = ~s;
} else {
orig = i;
}
}
return orig - SA;
}
/*---------------------------------------------------------------------------*/
/*- Function -*/
saint_t
divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n) {
saidx_t *bucket_A, *bucket_B;
saidx_t m;
saint_t err = 0;
/* Check arguments. */
if((T == NULL) || (SA == NULL) || (n < 0)) { return -1; }
else if(n == 0) { return 0; }
else if(n == 1) { SA[0] = 0; return 0; }
else if(n == 2) { m = (T[0] < T[1]); SA[m ^ 1] = 0, SA[m] = 1; return 0; }
bucket_A = (saidx_t *)malloc(BUCKET_A_SIZE * sizeof(saidx_t));
bucket_B = (saidx_t *)malloc(BUCKET_B_SIZE * sizeof(saidx_t));
/* Suffixsort. */
if((bucket_A != NULL) && (bucket_B != NULL)) {
m = sort_typeBstar(T, SA, bucket_A, bucket_B, n);
construct_SA(T, SA, bucket_A, bucket_B, n, m);
} else {
err = -2;
}
free(bucket_B);
free(bucket_A);
return err;
}
saidx_t
divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n) {
saidx_t *B;
saidx_t *bucket_A, *bucket_B;
saidx_t m, pidx, i;
/* Check arguments. */
if((T == NULL) || (U == NULL) || (n < 0)) { return -1; }
else if(n <= 1) { if(n == 1) { U[0] = T[0]; } return n; }
if((B = A) == NULL) { B = (saidx_t *)malloc((size_t)(n + 1) * sizeof(saidx_t)); }
bucket_A = (saidx_t *)malloc(BUCKET_A_SIZE * sizeof(saidx_t));
bucket_B = (saidx_t *)malloc(BUCKET_B_SIZE * sizeof(saidx_t));
/* Burrows-Wheeler Transform. */
if((B != NULL) && (bucket_A != NULL) && (bucket_B != NULL)) {
m = sort_typeBstar(T, B, bucket_A, bucket_B, n);
pidx = construct_BWT(T, B, bucket_A, bucket_B, n, m);
/* Copy to output string. */
U[0] = T[n - 1];
for(i = 0; i < pidx; ++i) { U[i + 1] = (sauchar_t)B[i]; }
for(i += 1; i < n; ++i) { U[i] = (sauchar_t)B[i]; }
pidx += 1;
} else {
pidx = -2;
}
free(bucket_B);
free(bucket_A);
if(A == NULL) { free(B); }
return pidx;
}
const char *
divsufsort_version(void) {
return PROJECT_VERSION_FULL;
}

180
dictBuilder/divsufsort.h Normal file
View File

@ -0,0 +1,180 @@
/*
* divsufsort.h for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _DIVSUFSORT_H
#define _DIVSUFSORT_H 1
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
#include <inttypes.h>
#ifndef DIVSUFSORT_API
# ifdef DIVSUFSORT_BUILD_DLL
# define DIVSUFSORT_API
# else
# define DIVSUFSORT_API
# endif
#endif
/*- Datatypes -*/
#ifndef SAUCHAR_T
#define SAUCHAR_T
typedef uint8_t sauchar_t;
#endif /* SAUCHAR_T */
#ifndef SAINT_T
#define SAINT_T
typedef int32_t saint_t;
#endif /* SAINT_T */
#ifndef SAIDX_T
#define SAIDX_T
typedef int32_t saidx_t;
#endif /* SAIDX_T */
#ifndef PRIdSAINT_T
#define PRIdSAINT_T PRId32
#endif /* PRIdSAINT_T */
#ifndef PRIdSAIDX_T
#define PRIdSAIDX_T PRId32
#endif /* PRIdSAIDX_T */
/*- Prototypes -*/
/**
* Constructs the suffix array of a given string.
* @param T[0..n-1] The input string.
* @param SA[0..n-1] The output array of suffixes.
* @param n The length of the given string.
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
DIVSUFSORT_API
saint_t
divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n);
/**
* Constructs the burrows-wheeler transformed string of a given string.
* @param T[0..n-1] The input string.
* @param U[0..n-1] The output string. (can be T)
* @param A[0..n-1] The temporary array. (can be NULL)
* @param n The length of the given string.
* @return The primary index if no error occurred, -1 or -2 otherwise.
*/
DIVSUFSORT_API
saidx_t
divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n);
/**
* Returns the version of the divsufsort library.
* @return The version number string.
*/
DIVSUFSORT_API
const char *
divsufsort_version(void);
/**
* Constructs the burrows-wheeler transformed string of a given string and suffix array.
* @param T[0..n-1] The input string.
* @param U[0..n-1] The output string. (can be T)
* @param SA[0..n-1] The suffix array. (can be NULL)
* @param n The length of the given string.
* @param idx The output primary index.
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
DIVSUFSORT_API
saint_t
bw_transform(const sauchar_t *T, sauchar_t *U,
saidx_t *SA /* can NULL */,
saidx_t n, saidx_t *idx);
/**
* Inverse BW-transforms a given BWTed string.
* @param T[0..n-1] The input string.
* @param U[0..n-1] The output string. (can be T)
* @param A[0..n-1] The temporary array. (can be NULL)
* @param n The length of the given string.
* @param idx The primary index.
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
DIVSUFSORT_API
saint_t
inverse_bw_transform(const sauchar_t *T, sauchar_t *U,
saidx_t *A /* can NULL */,
saidx_t n, saidx_t idx);
/**
* Checks the correctness of a given suffix array.
* @param T[0..n-1] The input string.
* @param SA[0..n-1] The input suffix array.
* @param n The length of the given string.
* @param verbose The verbose mode.
* @return 0 if no error occurred.
*/
DIVSUFSORT_API
saint_t
sufcheck(const sauchar_t *T, const saidx_t *SA, saidx_t n, saint_t verbose);
/**
* Search for the pattern P in the string T.
* @param T[0..Tsize-1] The input string.
* @param Tsize The length of the given string.
* @param P[0..Psize-1] The input pattern string.
* @param Psize The length of the given pattern string.
* @param SA[0..SAsize-1] The input suffix array.
* @param SAsize The length of the given suffix array.
* @param idx The output index.
* @return The count of matches if no error occurred, -1 otherwise.
*/
DIVSUFSORT_API
saidx_t
sa_search(const sauchar_t *T, saidx_t Tsize,
const sauchar_t *P, saidx_t Psize,
const saidx_t *SA, saidx_t SAsize,
saidx_t *left);
/**
* Search for the character c in the string T.
* @param T[0..Tsize-1] The input string.
* @param Tsize The length of the given string.
* @param SA[0..SAsize-1] The input suffix array.
* @param SAsize The length of the given suffix array.
* @param c The input character.
* @param idx The output index.
* @return The count of matches if no error occurred, -1 otherwise.
*/
DIVSUFSORT_API
saidx_t
sa_simplesearch(const sauchar_t *T, saidx_t Tsize,
const saidx_t *SA, saidx_t SAsize,
saint_t c, saidx_t *left);
#ifdef __cplusplus
} /* extern "C" */
#endif /* __cplusplus */
#endif /* _DIVSUFSORT_H */

View File

@ -0,0 +1,212 @@
/*
* divsufsort_private.h for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _DIVSUFSORT_PRIVATE_H
#define _DIVSUFSORT_PRIVATE_H 1
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
/* *************************
* Includes
***************************/
#include <assert.h>
#include <stdlib.h> /* unconditional */
#include <stdio.h>
#include "config.h" /* unconditional */
#if HAVE_STRING_H
# include <string.h>
#endif
#if HAVE_MEMORY_H
# include <memory.h>
#endif
#if HAVE_STDDEF_H
# include <stddef.h>
#endif
#if HAVE_STRINGS_H
# ifdef _WIN32
# include <string.h>
# else
# include <strings.h>
# endif
#endif
#if HAVE_INTTYPES_H
# include <inttypes.h>
#else
# if HAVE_STDINT_H
# include <stdint.h>
# endif
#endif
#if defined(BUILD_DIVSUFSORT64)
# include "divsufsort64.h"
# ifndef SAIDX_T
# define SAIDX_T
# define saidx_t saidx64_t
# endif /* SAIDX_T */
# ifndef PRIdSAIDX_T
# define PRIdSAIDX_T PRIdSAIDX64_T
# endif /* PRIdSAIDX_T */
# define divsufsort divsufsort64
# define divbwt divbwt64
# define divsufsort_version divsufsort64_version
# define bw_transform bw_transform64
# define inverse_bw_transform inverse_bw_transform64
# define sufcheck sufcheck64
# define sa_search sa_search64
# define sa_simplesearch sa_simplesearch64
# define sssort sssort64
# define trsort trsort64
#else
# include "divsufsort.h"
#endif
/*- Constants -*/
#if !defined(UINT8_MAX)
# define UINT8_MAX (255)
#endif /* UINT8_MAX */
#if defined(ALPHABET_SIZE) && (ALPHABET_SIZE < 1)
# undef ALPHABET_SIZE
#endif
#if !defined(ALPHABET_SIZE)
# define ALPHABET_SIZE (UINT8_MAX + 1)
#endif
/* for divsufsort.c */
#define BUCKET_A_SIZE (ALPHABET_SIZE)
#define BUCKET_B_SIZE (ALPHABET_SIZE * ALPHABET_SIZE)
/* for sssort.c */
#if defined(SS_INSERTIONSORT_THRESHOLD)
# if SS_INSERTIONSORT_THRESHOLD < 1
# undef SS_INSERTIONSORT_THRESHOLD
# define SS_INSERTIONSORT_THRESHOLD (1)
# endif
#else
# define SS_INSERTIONSORT_THRESHOLD (8)
#endif
#if defined(SS_BLOCKSIZE)
# if SS_BLOCKSIZE < 0
# undef SS_BLOCKSIZE
# define SS_BLOCKSIZE (0)
# elif 32768 <= SS_BLOCKSIZE
# undef SS_BLOCKSIZE
# define SS_BLOCKSIZE (32767)
# endif
#else
# define SS_BLOCKSIZE (1024)
#endif
/* minstacksize = log(SS_BLOCKSIZE) / log(3) * 2 */
#if SS_BLOCKSIZE == 0
# if defined(BUILD_DIVSUFSORT64)
# define SS_MISORT_STACKSIZE (96)
# else
# define SS_MISORT_STACKSIZE (64)
# endif
#elif SS_BLOCKSIZE <= 4096
# define SS_MISORT_STACKSIZE (16)
#else
# define SS_MISORT_STACKSIZE (24)
#endif
#if defined(BUILD_DIVSUFSORT64)
# define SS_SMERGE_STACKSIZE (64)
#else
# define SS_SMERGE_STACKSIZE (32)
#endif
/* for trsort.c */
#define TR_INSERTIONSORT_THRESHOLD (8)
#if defined(BUILD_DIVSUFSORT64)
# define TR_STACKSIZE (96)
#else
# define TR_STACKSIZE (64)
#endif
/*- Macros -*/
#ifndef SWAP
# define SWAP(_a, _b) do { t = (_a); (_a) = (_b); (_b) = t; } while(0)
#endif /* SWAP */
#ifndef MIN
# define MIN(_a, _b) (((_a) < (_b)) ? (_a) : (_b))
#endif /* MIN */
#ifndef MAX
# define MAX(_a, _b) (((_a) > (_b)) ? (_a) : (_b))
#endif /* MAX */
#define STACK_PUSH(_a, _b, _c, _d)\
do {\
assert(ssize < STACK_SIZE);\
stack[ssize].a = (_a), stack[ssize].b = (_b),\
stack[ssize].c = (_c), stack[ssize++].d = (_d);\
} while(0)
#define STACK_PUSH5(_a, _b, _c, _d, _e)\
do {\
assert(ssize < STACK_SIZE);\
stack[ssize].a = (_a), stack[ssize].b = (_b),\
stack[ssize].c = (_c), stack[ssize].d = (_d), stack[ssize++].e = (_e);\
} while(0)
#define STACK_POP(_a, _b, _c, _d)\
do {\
assert(0 <= ssize);\
if(ssize == 0) { return; }\
(_a) = stack[--ssize].a, (_b) = stack[ssize].b,\
(_c) = stack[ssize].c, (_d) = stack[ssize].d;\
} while(0)
#define STACK_POP5(_a, _b, _c, _d, _e)\
do {\
assert(0 <= ssize);\
if(ssize == 0) { return; }\
(_a) = stack[--ssize].a, (_b) = stack[ssize].b,\
(_c) = stack[ssize].c, (_d) = stack[ssize].d, (_e) = stack[ssize].e;\
} while(0)
/* for divsufsort.c */
#define BUCKET_A(_c0) bucket_A[(_c0)]
#if ALPHABET_SIZE == 256
#define BUCKET_B(_c0, _c1) (bucket_B[((_c1) << 8) | (_c0)])
#define BUCKET_BSTAR(_c0, _c1) (bucket_B[((_c0) << 8) | (_c1)])
#else
#define BUCKET_B(_c0, _c1) (bucket_B[(_c1) * ALPHABET_SIZE + (_c0)])
#define BUCKET_BSTAR(_c0, _c1) (bucket_B[(_c0) * ALPHABET_SIZE + (_c1)])
#endif
/*- Private Prototypes -*/
/* sssort.c */
void
sssort(const sauchar_t *Td, const saidx_t *PA,
saidx_t *first, saidx_t *last,
saidx_t *buf, saidx_t bufsize,
saidx_t depth, saidx_t n, saint_t lastsuffix);
/* trsort.c */
void
trsort(saidx_t *ISA, saidx_t *SA, saidx_t n, saidx_t depth);
#ifdef __cplusplus
} /* extern "C" */
#endif /* __cplusplus */
#endif /* _DIVSUFSORT_PRIVATE_H */

56
dictBuilder/lfs.h Normal file
View File

@ -0,0 +1,56 @@
/*
* lfs.h for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _LFS_H
#define _LFS_H 1
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
#ifndef __STRICT_ANSI__
# define LFS_OFF_T off_t
# define LFS_FOPEN fopen
# define LFS_FTELL ftello
# define LFS_FSEEK fseeko
# define LFS_PRId PRIdMAX
#else
# define LFS_OFF_T long
# define LFS_FOPEN fopen
# define LFS_FTELL ftell
# define LFS_FSEEK fseek
# define LFS_PRId "ld"
#endif
#ifndef PRIdOFF_T
# define PRIdOFF_T LFS_PRId
#endif
#ifdef __cplusplus
} /* extern "C" */
#endif /* __cplusplus */
#endif /* _LFS_H */

844
dictBuilder/sssort.c Normal file
View File

@ -0,0 +1,844 @@
/*
* sssort.c for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
/*- Compiler specifics -*/
#ifdef __clang__
#pragma clang diagnostic ignored "-Wshorten-64-to-32"
#endif
#if defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
/* inline is defined */
#elif defined(_MSC_VER)
# define inline __inline
#else
# define inline /* disable inline */
#endif
#ifdef _MSC_VER /* Visual Studio */
# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
# define FORCE_INLINE static __forceinline
#else
# if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
# ifdef __GNUC__
# define FORCE_INLINE static inline __attribute__((always_inline))
# else
# define FORCE_INLINE static inline
# endif
# else
# define FORCE_INLINE static
# endif /* __STDC_VERSION__ */
#endif
/*- Dependencies -*/
#include "divsufsort_private.h"
/*- Private Functions -*/
static const saint_t lg_table[256]= {
-1,0,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7
};
#if (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE)
static INLINE
saint_t
ss_ilg(saidx_t n) {
#if SS_BLOCKSIZE == 0
# if defined(BUILD_DIVSUFSORT64)
return (n >> 32) ?
((n >> 48) ?
((n >> 56) ?
56 + lg_table[(n >> 56) & 0xff] :
48 + lg_table[(n >> 48) & 0xff]) :
((n >> 40) ?
40 + lg_table[(n >> 40) & 0xff] :
32 + lg_table[(n >> 32) & 0xff])) :
((n & 0xffff0000) ?
((n & 0xff000000) ?
24 + lg_table[(n >> 24) & 0xff] :
16 + lg_table[(n >> 16) & 0xff]) :
((n & 0x0000ff00) ?
8 + lg_table[(n >> 8) & 0xff] :
0 + lg_table[(n >> 0) & 0xff]));
# else
return (n & 0xffff0000) ?
((n & 0xff000000) ?
24 + lg_table[(n >> 24) & 0xff] :
16 + lg_table[(n >> 16) & 0xff]) :
((n & 0x0000ff00) ?
8 + lg_table[(n >> 8) & 0xff] :
0 + lg_table[(n >> 0) & 0xff]);
# endif
#elif SS_BLOCKSIZE < 256
return lg_table[n];
#else
return (n & 0xff00) ?
8 + lg_table[(n >> 8) & 0xff] :
0 + lg_table[(n >> 0) & 0xff];
#endif
}
#endif /* (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE) */
#if SS_BLOCKSIZE != 0
static const saint_t sqq_table[256] = {
0, 16, 22, 27, 32, 35, 39, 42, 45, 48, 50, 53, 55, 57, 59, 61,
64, 65, 67, 69, 71, 73, 75, 76, 78, 80, 81, 83, 84, 86, 87, 89,
90, 91, 93, 94, 96, 97, 98, 99, 101, 102, 103, 104, 106, 107, 108, 109,
110, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
128, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
143, 144, 144, 145, 146, 147, 148, 149, 150, 150, 151, 152, 153, 154, 155, 155,
156, 157, 158, 159, 160, 160, 161, 162, 163, 163, 164, 165, 166, 167, 167, 168,
169, 170, 170, 171, 172, 173, 173, 174, 175, 176, 176, 177, 178, 178, 179, 180,
181, 181, 182, 183, 183, 184, 185, 185, 186, 187, 187, 188, 189, 189, 190, 191,
192, 192, 193, 193, 194, 195, 195, 196, 197, 197, 198, 199, 199, 200, 201, 201,
202, 203, 203, 204, 204, 205, 206, 206, 207, 208, 208, 209, 209, 210, 211, 211,
212, 212, 213, 214, 214, 215, 215, 216, 217, 217, 218, 218, 219, 219, 220, 221,
221, 222, 222, 223, 224, 224, 225, 225, 226, 226, 227, 227, 228, 229, 229, 230,
230, 231, 231, 232, 232, 233, 234, 234, 235, 235, 236, 236, 237, 237, 238, 238,
239, 240, 240, 241, 241, 242, 242, 243, 243, 244, 244, 245, 245, 246, 246, 247,
247, 248, 248, 249, 249, 250, 250, 251, 251, 252, 252, 253, 253, 254, 254, 255
};
static INLINE
saidx_t
ss_isqrt(saidx_t x) {
saidx_t y, e;
if(x >= (SS_BLOCKSIZE * SS_BLOCKSIZE)) { return SS_BLOCKSIZE; }
e = (x & 0xffff0000) ?
((x & 0xff000000) ?
24 + lg_table[(x >> 24) & 0xff] :
16 + lg_table[(x >> 16) & 0xff]) :
((x & 0x0000ff00) ?
8 + lg_table[(x >> 8) & 0xff] :
0 + lg_table[(x >> 0) & 0xff]);
if(e >= 16) {
y = sqq_table[x >> ((e - 6) - (e & 1))] << ((e >> 1) - 7);
if(e >= 24) { y = (y + 1 + x / y) >> 1; }
y = (y + 1 + x / y) >> 1;
} else if(e >= 8) {
y = (sqq_table[x >> ((e - 6) - (e & 1))] >> (7 - (e >> 1))) + 1;
} else {
return sqq_table[x] >> 4;
}
return (x < (y * y)) ? y - 1 : y;
}
#endif /* SS_BLOCKSIZE != 0 */
/*---------------------------------------------------------------------------*/
/* Compares two suffixes. */
static INLINE
saint_t
ss_compare(const sauchar_t *T,
const saidx_t *p1, const saidx_t *p2,
saidx_t depth) {
const sauchar_t *U1, *U2, *U1n, *U2n;
for(U1 = T + depth + *p1,
U2 = T + depth + *p2,
U1n = T + *(p1 + 1) + 2,
U2n = T + *(p2 + 1) + 2;
(U1 < U1n) && (U2 < U2n) && (*U1 == *U2);
++U1, ++U2) {
}
return U1 < U1n ?
(U2 < U2n ? *U1 - *U2 : 1) :
(U2 < U2n ? -1 : 0);
}
/*---------------------------------------------------------------------------*/
#if (SS_BLOCKSIZE != 1) && (SS_INSERTIONSORT_THRESHOLD != 1)
/* Insertionsort for small size groups */
static
void
ss_insertionsort(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *last, saidx_t depth) {
saidx_t *i, *j;
saidx_t t;
saint_t r;
for(i = last - 2; first <= i; --i) {
for(t = *i, j = i + 1; 0 < (r = ss_compare(T, PA + t, PA + *j, depth));) {
do { *(j - 1) = *j; } while((++j < last) && (*j < 0));
if(last <= j) { break; }
}
if(r == 0) { *j = ~*j; }
*(j - 1) = t;
}
}
#endif /* (SS_BLOCKSIZE != 1) && (SS_INSERTIONSORT_THRESHOLD != 1) */
/*---------------------------------------------------------------------------*/
#if (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE)
static INLINE
void
ss_fixdown(const sauchar_t *Td, const saidx_t *PA,
saidx_t *SA, saidx_t i, saidx_t size) {
saidx_t j, k;
saidx_t v;
saint_t c, d, e;
for(v = SA[i], c = Td[PA[v]]; (j = 2 * i + 1) < size; SA[i] = SA[k], i = k) {
d = Td[PA[SA[k = j++]]];
if(d < (e = Td[PA[SA[j]]])) { k = j; d = e; }
if(d <= c) { break; }
}
SA[i] = v;
}
/* Simple top-down heapsort. */
static
void
ss_heapsort(const sauchar_t *Td, const saidx_t *PA, saidx_t *SA, saidx_t size) {
saidx_t i, m;
saidx_t t;
m = size;
if((size % 2) == 0) {
m--;
if(Td[PA[SA[m / 2]]] < Td[PA[SA[m]]]) { SWAP(SA[m], SA[m / 2]); }
}
for(i = m / 2 - 1; 0 <= i; --i) { ss_fixdown(Td, PA, SA, i, m); }
if((size % 2) == 0) { SWAP(SA[0], SA[m]); ss_fixdown(Td, PA, SA, 0, m); }
for(i = m - 1; 0 < i; --i) {
t = SA[0], SA[0] = SA[i];
ss_fixdown(Td, PA, SA, 0, i);
SA[i] = t;
}
}
/*---------------------------------------------------------------------------*/
/* Returns the median of three elements. */
static INLINE
saidx_t *
ss_median3(const sauchar_t *Td, const saidx_t *PA,
saidx_t *v1, saidx_t *v2, saidx_t *v3) {
saidx_t *t;
if(Td[PA[*v1]] > Td[PA[*v2]]) { SWAP(v1, v2); }
if(Td[PA[*v2]] > Td[PA[*v3]]) {
if(Td[PA[*v1]] > Td[PA[*v3]]) { return v1; }
else { return v3; }
}
return v2;
}
/* Returns the median of five elements. */
static INLINE
saidx_t *
ss_median5(const sauchar_t *Td, const saidx_t *PA,
saidx_t *v1, saidx_t *v2, saidx_t *v3, saidx_t *v4, saidx_t *v5) {
saidx_t *t;
if(Td[PA[*v2]] > Td[PA[*v3]]) { SWAP(v2, v3); }
if(Td[PA[*v4]] > Td[PA[*v5]]) { SWAP(v4, v5); }
if(Td[PA[*v2]] > Td[PA[*v4]]) { SWAP(v2, v4); SWAP(v3, v5); }
if(Td[PA[*v1]] > Td[PA[*v3]]) { SWAP(v1, v3); }
if(Td[PA[*v1]] > Td[PA[*v4]]) { SWAP(v1, v4); SWAP(v3, v5); }
if(Td[PA[*v3]] > Td[PA[*v4]]) { return v4; }
return v3;
}
/* Returns the pivot element. */
static INLINE
saidx_t *
ss_pivot(const sauchar_t *Td, const saidx_t *PA, saidx_t *first, saidx_t *last) {
saidx_t *middle;
saidx_t t;
t = last - first;
middle = first + t / 2;
if(t <= 512) {
if(t <= 32) {
return ss_median3(Td, PA, first, middle, last - 1);
} else {
t >>= 2;
return ss_median5(Td, PA, first, first + t, middle, last - 1 - t, last - 1);
}
}
t >>= 3;
first = ss_median3(Td, PA, first, first + t, first + (t << 1));
middle = ss_median3(Td, PA, middle - t, middle, middle + t);
last = ss_median3(Td, PA, last - 1 - (t << 1), last - 1 - t, last - 1);
return ss_median3(Td, PA, first, middle, last);
}
/*---------------------------------------------------------------------------*/
/* Binary partition for substrings. */
static INLINE
saidx_t *
ss_partition(const saidx_t *PA,
saidx_t *first, saidx_t *last, saidx_t depth) {
saidx_t *a, *b;
saidx_t t;
for(a = first - 1, b = last;;) {
for(; (++a < b) && ((PA[*a] + depth) >= (PA[*a + 1] + 1));) { *a = ~*a; }
for(; (a < --b) && ((PA[*b] + depth) < (PA[*b + 1] + 1));) { }
if(b <= a) { break; }
t = ~*b;
*b = *a;
*a = t;
}
if(first < a) { *first = ~*first; }
return a;
}
/* Multikey introsort for medium size groups. */
static
void
ss_mintrosort(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *last,
saidx_t depth) {
#define STACK_SIZE SS_MISORT_STACKSIZE
struct { saidx_t *a, *b, c; saint_t d; } stack[STACK_SIZE];
const sauchar_t *Td;
saidx_t *a, *b, *c, *d, *e, *f;
saidx_t s, t;
saint_t ssize;
saint_t limit;
saint_t v, x = 0;
for(ssize = 0, limit = ss_ilg(last - first);;) {
if((last - first) <= SS_INSERTIONSORT_THRESHOLD) {
#if 1 < SS_INSERTIONSORT_THRESHOLD
if(1 < (last - first)) { ss_insertionsort(T, PA, first, last, depth); }
#endif
STACK_POP(first, last, depth, limit);
continue;
}
Td = T + depth;
if(limit-- == 0) { ss_heapsort(Td, PA, first, last - first); }
if(limit < 0) {
for(a = first + 1, v = Td[PA[*first]]; a < last; ++a) {
if((x = Td[PA[*a]]) != v) {
if(1 < (a - first)) { break; }
v = x;
first = a;
}
}
if(Td[PA[*first] - 1] < v) {
first = ss_partition(PA, first, a, depth);
}
if((a - first) <= (last - a)) {
if(1 < (a - first)) {
STACK_PUSH(a, last, depth, -1);
last = a, depth += 1, limit = ss_ilg(a - first);
} else {
first = a, limit = -1;
}
} else {
if(1 < (last - a)) {
STACK_PUSH(first, a, depth + 1, ss_ilg(a - first));
first = a, limit = -1;
} else {
last = a, depth += 1, limit = ss_ilg(a - first);
}
}
continue;
}
/* choose pivot */
a = ss_pivot(Td, PA, first, last);
v = Td[PA[*a]];
SWAP(*first, *a);
/* partition */
for(b = first; (++b < last) && ((x = Td[PA[*b]]) == v);) { }
if(((a = b) < last) && (x < v)) {
for(; (++b < last) && ((x = Td[PA[*b]]) <= v);) {
if(x == v) { SWAP(*b, *a); ++a; }
}
}
for(c = last; (b < --c) && ((x = Td[PA[*c]]) == v);) { }
if((b < (d = c)) && (x > v)) {
for(; (b < --c) && ((x = Td[PA[*c]]) >= v);) {
if(x == v) { SWAP(*c, *d); --d; }
}
}
for(; b < c;) {
SWAP(*b, *c);
for(; (++b < c) && ((x = Td[PA[*b]]) <= v);) {
if(x == v) { SWAP(*b, *a); ++a; }
}
for(; (b < --c) && ((x = Td[PA[*c]]) >= v);) {
if(x == v) { SWAP(*c, *d); --d; }
}
}
if(a <= d) {
c = b - 1;
if((s = a - first) > (t = b - a)) { s = t; }
for(e = first, f = b - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
if((s = d - c) > (t = last - d - 1)) { s = t; }
for(e = b, f = last - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
a = first + (b - a), c = last - (d - c);
b = (v <= Td[PA[*a] - 1]) ? a : ss_partition(PA, a, c, depth);
if((a - first) <= (last - c)) {
if((last - c) <= (c - b)) {
STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
STACK_PUSH(c, last, depth, limit);
last = a;
} else if((a - first) <= (c - b)) {
STACK_PUSH(c, last, depth, limit);
STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
last = a;
} else {
STACK_PUSH(c, last, depth, limit);
STACK_PUSH(first, a, depth, limit);
first = b, last = c, depth += 1, limit = ss_ilg(c - b);
}
} else {
if((a - first) <= (c - b)) {
STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
STACK_PUSH(first, a, depth, limit);
first = c;
} else if((last - c) <= (c - b)) {
STACK_PUSH(first, a, depth, limit);
STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
first = c;
} else {
STACK_PUSH(first, a, depth, limit);
STACK_PUSH(c, last, depth, limit);
first = b, last = c, depth += 1, limit = ss_ilg(c - b);
}
}
} else {
limit += 1;
if(Td[PA[*first] - 1] < v) {
first = ss_partition(PA, first, last, depth);
limit = ss_ilg(last - first);
}
depth += 1;
}
}
#undef STACK_SIZE
}
#endif /* (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE) */
/*---------------------------------------------------------------------------*/
#if SS_BLOCKSIZE != 0
static INLINE
void
ss_blockswap(saidx_t *a, saidx_t *b, saidx_t n) {
saidx_t t;
for(; 0 < n; --n, ++a, ++b) {
t = *a, *a = *b, *b = t;
}
}
static INLINE
void
ss_rotate(saidx_t *first, saidx_t *middle, saidx_t *last) {
saidx_t *a, *b, t;
saidx_t l, r;
l = middle - first, r = last - middle;
for(; (0 < l) && (0 < r);) {
if(l == r) { ss_blockswap(first, middle, l); break; }
if(l < r) {
a = last - 1, b = middle - 1;
t = *a;
do {
*a-- = *b, *b-- = *a;
if(b < first) {
*a = t;
last = a;
if((r -= l + 1) <= l) { break; }
a -= 1, b = middle - 1;
t = *a;
}
} while(1);
} else {
a = first, b = middle;
t = *a;
do {
*a++ = *b, *b++ = *a;
if(last <= b) {
*a = t;
first = a + 1;
if((l -= r + 1) <= r) { break; }
a += 1, b = middle;
t = *a;
}
} while(1);
}
}
}
/*---------------------------------------------------------------------------*/
static
void
ss_inplacemerge(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *middle, saidx_t *last,
saidx_t depth) {
const saidx_t *p;
saidx_t *a, *b;
saidx_t len, half;
saint_t q, r;
saint_t x;
for(;;) {
if(*(last - 1) < 0) { x = 1; p = PA + ~*(last - 1); }
else { x = 0; p = PA + *(last - 1); }
for(a = first, len = middle - first, half = len >> 1, r = -1;
0 < len;
len = half, half >>= 1) {
b = a + half;
q = ss_compare(T, PA + ((0 <= *b) ? *b : ~*b), p, depth);
if(q < 0) {
a = b + 1;
half -= (len & 1) ^ 1;
} else {
r = q;
}
}
if(a < middle) {
if(r == 0) { *a = ~*a; }
ss_rotate(a, middle, last);
last -= middle - a;
middle = a;
if(first == middle) { break; }
}
--last;
if(x != 0) { while(*--last < 0) { } }
if(middle == last) { break; }
}
}
/*---------------------------------------------------------------------------*/
/* Merge-forward with internal buffer. */
static
void
ss_mergeforward(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *middle, saidx_t *last,
saidx_t *buf, saidx_t depth) {
saidx_t *a, *b, *c, *bufend;
saidx_t t;
saint_t r;
bufend = buf + (middle - first) - 1;
ss_blockswap(buf, first, middle - first);
for(t = *(a = first), b = buf, c = middle;;) {
r = ss_compare(T, PA + *b, PA + *c, depth);
if(r < 0) {
do {
*a++ = *b;
if(bufend <= b) { *bufend = t; return; }
*b++ = *a;
} while(*b < 0);
} else if(r > 0) {
do {
*a++ = *c, *c++ = *a;
if(last <= c) {
while(b < bufend) { *a++ = *b, *b++ = *a; }
*a = *b, *b = t;
return;
}
} while(*c < 0);
} else {
*c = ~*c;
do {
*a++ = *b;
if(bufend <= b) { *bufend = t; return; }
*b++ = *a;
} while(*b < 0);
do {
*a++ = *c, *c++ = *a;
if(last <= c) {
while(b < bufend) { *a++ = *b, *b++ = *a; }
*a = *b, *b = t;
return;
}
} while(*c < 0);
}
}
}
/* Merge-backward with internal buffer. */
static
void
ss_mergebackward(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *middle, saidx_t *last,
saidx_t *buf, saidx_t depth) {
const saidx_t *p1, *p2;
saidx_t *a, *b, *c, *bufend;
saidx_t t;
saint_t r;
saint_t x;
bufend = buf + (last - middle) - 1;
ss_blockswap(buf, middle, last - middle);
x = 0;
if(*bufend < 0) { p1 = PA + ~*bufend; x |= 1; }
else { p1 = PA + *bufend; }
if(*(middle - 1) < 0) { p2 = PA + ~*(middle - 1); x |= 2; }
else { p2 = PA + *(middle - 1); }
for(t = *(a = last - 1), b = bufend, c = middle - 1;;) {
r = ss_compare(T, p1, p2, depth);
if(0 < r) {
if(x & 1) { do { *a-- = *b, *b-- = *a; } while(*b < 0); x ^= 1; }
*a-- = *b;
if(b <= buf) { *buf = t; break; }
*b-- = *a;
if(*b < 0) { p1 = PA + ~*b; x |= 1; }
else { p1 = PA + *b; }
} else if(r < 0) {
if(x & 2) { do { *a-- = *c, *c-- = *a; } while(*c < 0); x ^= 2; }
*a-- = *c, *c-- = *a;
if(c < first) {
while(buf < b) { *a-- = *b, *b-- = *a; }
*a = *b, *b = t;
break;
}
if(*c < 0) { p2 = PA + ~*c; x |= 2; }
else { p2 = PA + *c; }
} else {
if(x & 1) { do { *a-- = *b, *b-- = *a; } while(*b < 0); x ^= 1; }
*a-- = ~*b;
if(b <= buf) { *buf = t; break; }
*b-- = *a;
if(x & 2) { do { *a-- = *c, *c-- = *a; } while(*c < 0); x ^= 2; }
*a-- = *c, *c-- = *a;
if(c < first) {
while(buf < b) { *a-- = *b, *b-- = *a; }
*a = *b, *b = t;
break;
}
if(*b < 0) { p1 = PA + ~*b; x |= 1; }
else { p1 = PA + *b; }
if(*c < 0) { p2 = PA + ~*c; x |= 2; }
else { p2 = PA + *c; }
}
}
}
/* D&C based merge. */
static
void
ss_swapmerge(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *middle, saidx_t *last,
saidx_t *buf, saidx_t bufsize, saidx_t depth) {
#define STACK_SIZE SS_SMERGE_STACKSIZE
#define GETIDX(a) ((0 <= (a)) ? (a) : (~(a)))
#define MERGE_CHECK(a, b, c)\
do {\
if(((c) & 1) ||\
(((c) & 2) && (ss_compare(T, PA + GETIDX(*((a) - 1)), PA + *(a), depth) == 0))) {\
*(a) = ~*(a);\
}\
if(((c) & 4) && ((ss_compare(T, PA + GETIDX(*((b) - 1)), PA + *(b), depth) == 0))) {\
*(b) = ~*(b);\
}\
} while(0)
struct { saidx_t *a, *b, *c; saint_t d; } stack[STACK_SIZE];
saidx_t *l, *r, *lm, *rm;
saidx_t m, len, half;
saint_t ssize;
saint_t check, next;
for(check = 0, ssize = 0;;) {
if((last - middle) <= bufsize) {
if((first < middle) && (middle < last)) {
ss_mergebackward(T, PA, first, middle, last, buf, depth);
}
MERGE_CHECK(first, last, check);
STACK_POP(first, middle, last, check);
continue;
}
if((middle - first) <= bufsize) {
if(first < middle) {
ss_mergeforward(T, PA, first, middle, last, buf, depth);
}
MERGE_CHECK(first, last, check);
STACK_POP(first, middle, last, check);
continue;
}
for(m = 0, len = MIN(middle - first, last - middle), half = len >> 1;
0 < len;
len = half, half >>= 1) {
if(ss_compare(T, PA + GETIDX(*(middle + m + half)),
PA + GETIDX(*(middle - m - half - 1)), depth) < 0) {
m += half + 1;
half -= (len & 1) ^ 1;
}
}
if(0 < m) {
lm = middle - m, rm = middle + m;
ss_blockswap(lm, middle, m);
l = r = middle, next = 0;
if(rm < last) {
if(*rm < 0) {
*rm = ~*rm;
if(first < lm) { for(; *--l < 0;) { } next |= 4; }
next |= 1;
} else if(first < lm) {
for(; *r < 0; ++r) { }
next |= 2;
}
}
if((l - first) <= (last - r)) {
STACK_PUSH(r, rm, last, (next & 3) | (check & 4));
middle = lm, last = l, check = (check & 3) | (next & 4);
} else {
if((next & 2) && (r == middle)) { next ^= 6; }
STACK_PUSH(first, lm, l, (check & 3) | (next & 4));
first = r, middle = rm, check = (next & 3) | (check & 4);
}
} else {
if(ss_compare(T, PA + GETIDX(*(middle - 1)), PA + *middle, depth) == 0) {
*middle = ~*middle;
}
MERGE_CHECK(first, last, check);
STACK_POP(first, middle, last, check);
}
}
#undef STACK_SIZE
}
#endif /* SS_BLOCKSIZE != 0 */
/*---------------------------------------------------------------------------*/
/*- Function -*/
/* Substring sort */
void
sssort(const sauchar_t *T, const saidx_t *PA,
saidx_t *first, saidx_t *last,
saidx_t *buf, saidx_t bufsize,
saidx_t depth, saidx_t n, saint_t lastsuffix) {
saidx_t *a;
#if SS_BLOCKSIZE != 0
saidx_t *b, *middle, *curbuf;
saidx_t j, k, curbufsize, limit;
#endif
saidx_t i;
if(lastsuffix != 0) { ++first; }
#if SS_BLOCKSIZE == 0
ss_mintrosort(T, PA, first, last, depth);
#else
if((bufsize < SS_BLOCKSIZE) &&
(bufsize < (last - first)) &&
(bufsize < (limit = ss_isqrt(last - first)))) {
if(SS_BLOCKSIZE < limit) { limit = SS_BLOCKSIZE; }
buf = middle = last - limit, bufsize = limit;
} else {
middle = last, limit = 0;
}
for(a = first, i = 0; SS_BLOCKSIZE < (middle - a); a += SS_BLOCKSIZE, ++i) {
#if SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE
ss_mintrosort(T, PA, a, a + SS_BLOCKSIZE, depth);
#elif 1 < SS_BLOCKSIZE
ss_insertionsort(T, PA, a, a + SS_BLOCKSIZE, depth);
#endif
curbufsize = last - (a + SS_BLOCKSIZE);
curbuf = a + SS_BLOCKSIZE;
if(curbufsize <= bufsize) { curbufsize = bufsize, curbuf = buf; }
for(b = a, k = SS_BLOCKSIZE, j = i; j & 1; b -= k, k <<= 1, j >>= 1) {
ss_swapmerge(T, PA, b - k, b, b + k, curbuf, curbufsize, depth);
}
}
#if SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE
ss_mintrosort(T, PA, a, middle, depth);
#elif 1 < SS_BLOCKSIZE
ss_insertionsort(T, PA, a, middle, depth);
#endif
for(k = SS_BLOCKSIZE; i != 0; k <<= 1, i >>= 1) {
if(i & 1) {
ss_swapmerge(T, PA, a - k, a, middle, buf, bufsize, depth);
a -= k;
}
}
if(limit != 0) {
#if SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE
ss_mintrosort(T, PA, middle, last, depth);
#elif 1 < SS_BLOCKSIZE
ss_insertionsort(T, PA, middle, last, depth);
#endif
ss_inplacemerge(T, PA, first, middle, last, depth);
}
#endif
if(lastsuffix != 0) {
/* Insert last type B* suffix. */
saidx_t PAi[2]; PAi[0] = PA[*(first - 1)], PAi[1] = n - 2;
for(a = first, i = *(first - 1);
(a < last) && ((*a < 0) || (0 < ss_compare(T, &(PAi[0]), PA + *a, depth)));
++a) {
*(a - 1) = *a;
}
*(a - 1) = i;
}
}

615
dictBuilder/trsort.c Normal file
View File

@ -0,0 +1,615 @@
/*
* trsort.c for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
/*- Compiler specifics -*/
#ifdef __clang__
#pragma clang diagnostic ignored "-Wshorten-64-to-32"
#endif
#if defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
/* inline is defined */
#elif defined(_MSC_VER)
# define inline __inline
#else
# define inline /* disable inline */
#endif
#ifdef _MSC_VER /* Visual Studio */
# pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
# define FORCE_INLINE static __forceinline
#else
# if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
# ifdef __GNUC__
# define FORCE_INLINE static inline __attribute__((always_inline))
# else
# define FORCE_INLINE static inline
# endif
# else
# define FORCE_INLINE static
# endif /* __STDC_VERSION__ */
#endif
/*- Dependencies -*/
#include "divsufsort_private.h"
/*- Private Functions -*/
static const saint_t lg_table[256]= {
-1,0,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7
};
static INLINE
saint_t
tr_ilg(saidx_t n) {
#if defined(BUILD_DIVSUFSORT64)
return (n >> 32) ?
((n >> 48) ?
((n >> 56) ?
56 + lg_table[(n >> 56) & 0xff] :
48 + lg_table[(n >> 48) & 0xff]) :
((n >> 40) ?
40 + lg_table[(n >> 40) & 0xff] :
32 + lg_table[(n >> 32) & 0xff])) :
((n & 0xffff0000) ?
((n & 0xff000000) ?
24 + lg_table[(n >> 24) & 0xff] :
16 + lg_table[(n >> 16) & 0xff]) :
((n & 0x0000ff00) ?
8 + lg_table[(n >> 8) & 0xff] :
0 + lg_table[(n >> 0) & 0xff]));
#else
return (n & 0xffff0000) ?
((n & 0xff000000) ?
24 + lg_table[(n >> 24) & 0xff] :
16 + lg_table[(n >> 16) & 0xff]) :
((n & 0x0000ff00) ?
8 + lg_table[(n >> 8) & 0xff] :
0 + lg_table[(n >> 0) & 0xff]);
#endif
}
/*---------------------------------------------------------------------------*/
/* Simple insertionsort for small size groups. */
static
void
tr_insertionsort(const saidx_t *ISAd, saidx_t *first, saidx_t *last) {
saidx_t *a, *b;
saidx_t t, r;
for(a = first + 1; a < last; ++a) {
for(t = *a, b = a - 1; 0 > (r = ISAd[t] - ISAd[*b]);) {
do { *(b + 1) = *b; } while((first <= --b) && (*b < 0));
if(b < first) { break; }
}
if(r == 0) { *b = ~*b; }
*(b + 1) = t;
}
}
/*---------------------------------------------------------------------------*/
static INLINE
void
tr_fixdown(const saidx_t *ISAd, saidx_t *SA, saidx_t i, saidx_t size) {
saidx_t j, k;
saidx_t v;
saidx_t c, d, e;
for(v = SA[i], c = ISAd[v]; (j = 2 * i + 1) < size; SA[i] = SA[k], i = k) {
d = ISAd[SA[k = j++]];
if(d < (e = ISAd[SA[j]])) { k = j; d = e; }
if(d <= c) { break; }
}
SA[i] = v;
}
/* Simple top-down heapsort. */
static
void
tr_heapsort(const saidx_t *ISAd, saidx_t *SA, saidx_t size) {
saidx_t i, m;
saidx_t t;
m = size;
if((size % 2) == 0) {
m--;
if(ISAd[SA[m / 2]] < ISAd[SA[m]]) { SWAP(SA[m], SA[m / 2]); }
}
for(i = m / 2 - 1; 0 <= i; --i) { tr_fixdown(ISAd, SA, i, m); }
if((size % 2) == 0) { SWAP(SA[0], SA[m]); tr_fixdown(ISAd, SA, 0, m); }
for(i = m - 1; 0 < i; --i) {
t = SA[0], SA[0] = SA[i];
tr_fixdown(ISAd, SA, 0, i);
SA[i] = t;
}
}
/*---------------------------------------------------------------------------*/
/* Returns the median of three elements. */
static INLINE
saidx_t *
tr_median3(const saidx_t *ISAd, saidx_t *v1, saidx_t *v2, saidx_t *v3) {
saidx_t *t;
if(ISAd[*v1] > ISAd[*v2]) { SWAP(v1, v2); }
if(ISAd[*v2] > ISAd[*v3]) {
if(ISAd[*v1] > ISAd[*v3]) { return v1; }
else { return v3; }
}
return v2;
}
/* Returns the median of five elements. */
static INLINE
saidx_t *
tr_median5(const saidx_t *ISAd,
saidx_t *v1, saidx_t *v2, saidx_t *v3, saidx_t *v4, saidx_t *v5) {
saidx_t *t;
if(ISAd[*v2] > ISAd[*v3]) { SWAP(v2, v3); }
if(ISAd[*v4] > ISAd[*v5]) { SWAP(v4, v5); }
if(ISAd[*v2] > ISAd[*v4]) { SWAP(v2, v4); SWAP(v3, v5); }
if(ISAd[*v1] > ISAd[*v3]) { SWAP(v1, v3); }
if(ISAd[*v1] > ISAd[*v4]) { SWAP(v1, v4); SWAP(v3, v5); }
if(ISAd[*v3] > ISAd[*v4]) { return v4; }
return v3;
}
/* Returns the pivot element. */
static INLINE
saidx_t *
tr_pivot(const saidx_t *ISAd, saidx_t *first, saidx_t *last) {
saidx_t *middle;
saidx_t t;
t = last - first;
middle = first + t / 2;
if(t <= 512) {
if(t <= 32) {
return tr_median3(ISAd, first, middle, last - 1);
} else {
t >>= 2;
return tr_median5(ISAd, first, first + t, middle, last - 1 - t, last - 1);
}
}
t >>= 3;
first = tr_median3(ISAd, first, first + t, first + (t << 1));
middle = tr_median3(ISAd, middle - t, middle, middle + t);
last = tr_median3(ISAd, last - 1 - (t << 1), last - 1 - t, last - 1);
return tr_median3(ISAd, first, middle, last);
}
/*---------------------------------------------------------------------------*/
typedef struct _trbudget_t trbudget_t;
struct _trbudget_t {
saidx_t chance;
saidx_t remain;
saidx_t incval;
saidx_t count;
};
static INLINE
void
trbudget_init(trbudget_t *budget, saidx_t chance, saidx_t incval) {
budget->chance = chance;
budget->remain = budget->incval = incval;
}
static INLINE
saint_t
trbudget_check(trbudget_t *budget, saidx_t size) {
if(size <= budget->remain) { budget->remain -= size; return 1; }
if(budget->chance == 0) { budget->count += size; return 0; }
budget->remain += budget->incval - size;
budget->chance -= 1;
return 1;
}
/*---------------------------------------------------------------------------*/
static INLINE
void
tr_partition(const saidx_t *ISAd,
saidx_t *first, saidx_t *middle, saidx_t *last,
saidx_t **pa, saidx_t **pb, saidx_t v) {
saidx_t *a, *b, *c, *d, *e, *f;
saidx_t t, s;
saidx_t x = 0;
for(b = middle - 1; (++b < last) && ((x = ISAd[*b]) == v);) { }
if(((a = b) < last) && (x < v)) {
for(; (++b < last) && ((x = ISAd[*b]) <= v);) {
if(x == v) { SWAP(*b, *a); ++a; }
}
}
for(c = last; (b < --c) && ((x = ISAd[*c]) == v);) { }
if((b < (d = c)) && (x > v)) {
for(; (b < --c) && ((x = ISAd[*c]) >= v);) {
if(x == v) { SWAP(*c, *d); --d; }
}
}
for(; b < c;) {
SWAP(*b, *c);
for(; (++b < c) && ((x = ISAd[*b]) <= v);) {
if(x == v) { SWAP(*b, *a); ++a; }
}
for(; (b < --c) && ((x = ISAd[*c]) >= v);) {
if(x == v) { SWAP(*c, *d); --d; }
}
}
if(a <= d) {
c = b - 1;
if((s = a - first) > (t = b - a)) { s = t; }
for(e = first, f = b - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
if((s = d - c) > (t = last - d - 1)) { s = t; }
for(e = b, f = last - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
first += (b - a), last -= (d - c);
}
*pa = first, *pb = last;
}
static
void
tr_copy(saidx_t *ISA, const saidx_t *SA,
saidx_t *first, saidx_t *a, saidx_t *b, saidx_t *last,
saidx_t depth) {
/* sort suffixes of middle partition
by using sorted order of suffixes of left and right partition. */
saidx_t *c, *d, *e;
saidx_t s, v;
v = b - SA - 1;
for(c = first, d = a - 1; c <= d; ++c) {
if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
*++d = s;
ISA[s] = d - SA;
}
}
for(c = last - 1, e = d + 1, d = b; e < d; --c) {
if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
*--d = s;
ISA[s] = d - SA;
}
}
}
static
void
tr_partialcopy(saidx_t *ISA, const saidx_t *SA,
saidx_t *first, saidx_t *a, saidx_t *b, saidx_t *last,
saidx_t depth) {
saidx_t *c, *d, *e;
saidx_t s, v;
saidx_t rank, lastrank, newrank = -1;
v = b - SA - 1;
lastrank = -1;
for(c = first, d = a - 1; c <= d; ++c) {
if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
*++d = s;
rank = ISA[s + depth];
if(lastrank != rank) { lastrank = rank; newrank = d - SA; }
ISA[s] = newrank;
}
}
lastrank = -1;
for(e = d; first <= e; --e) {
rank = ISA[*e];
if(lastrank != rank) { lastrank = rank; newrank = e - SA; }
if(newrank != rank) { ISA[*e] = newrank; }
}
lastrank = -1;
for(c = last - 1, e = d + 1, d = b; e < d; --c) {
if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
*--d = s;
rank = ISA[s + depth];
if(lastrank != rank) { lastrank = rank; newrank = d - SA; }
ISA[s] = newrank;
}
}
}
static
void
tr_introsort(saidx_t *ISA, const saidx_t *ISAd,
saidx_t *SA, saidx_t *first, saidx_t *last,
trbudget_t *budget) {
#define STACK_SIZE TR_STACKSIZE
struct { const saidx_t *a; saidx_t *b, *c; saint_t d, e; }stack[STACK_SIZE];
saidx_t *a, *b, *c;
saidx_t t;
saidx_t v, x = 0;
saidx_t incr = ISAd - ISA;
saint_t limit, next;
saint_t ssize, trlink = -1;
for(ssize = 0, limit = tr_ilg(last - first);;) {
if(limit < 0) {
if(limit == -1) {
/* tandem repeat partition */
tr_partition(ISAd - incr, first, first, last, &a, &b, last - SA - 1);
/* update ranks */
if(a < last) {
for(c = first, v = a - SA - 1; c < a; ++c) { ISA[*c] = v; }
}
if(b < last) {
for(c = a, v = b - SA - 1; c < b; ++c) { ISA[*c] = v; }
}
/* push */
if(1 < (b - a)) {
STACK_PUSH5(NULL, a, b, 0, 0);
STACK_PUSH5(ISAd - incr, first, last, -2, trlink);
trlink = ssize - 2;
}
if((a - first) <= (last - b)) {
if(1 < (a - first)) {
STACK_PUSH5(ISAd, b, last, tr_ilg(last - b), trlink);
last = a, limit = tr_ilg(a - first);
} else if(1 < (last - b)) {
first = b, limit = tr_ilg(last - b);
} else {
STACK_POP5(ISAd, first, last, limit, trlink);
}
} else {
if(1 < (last - b)) {
STACK_PUSH5(ISAd, first, a, tr_ilg(a - first), trlink);
first = b, limit = tr_ilg(last - b);
} else if(1 < (a - first)) {
last = a, limit = tr_ilg(a - first);
} else {
STACK_POP5(ISAd, first, last, limit, trlink);
}
}
} else if(limit == -2) {
/* tandem repeat copy */
a = stack[--ssize].b, b = stack[ssize].c;
if(stack[ssize].d == 0) {
tr_copy(ISA, SA, first, a, b, last, ISAd - ISA);
} else {
if(0 <= trlink) { stack[trlink].d = -1; }
tr_partialcopy(ISA, SA, first, a, b, last, ISAd - ISA);
}
STACK_POP5(ISAd, first, last, limit, trlink);
} else {
/* sorted partition */
if(0 <= *first) {
a = first;
do { ISA[*a] = a - SA; } while((++a < last) && (0 <= *a));
first = a;
}
if(first < last) {
a = first; do { *a = ~*a; } while(*++a < 0);
next = (ISA[*a] != ISAd[*a]) ? tr_ilg(a - first + 1) : -1;
if(++a < last) { for(b = first, v = a - SA - 1; b < a; ++b) { ISA[*b] = v; } }
/* push */
if(trbudget_check(budget, a - first)) {
if((a - first) <= (last - a)) {
STACK_PUSH5(ISAd, a, last, -3, trlink);
ISAd += incr, last = a, limit = next;
} else {
if(1 < (last - a)) {
STACK_PUSH5(ISAd + incr, first, a, next, trlink);
first = a, limit = -3;
} else {
ISAd += incr, last = a, limit = next;
}
}
} else {
if(0 <= trlink) { stack[trlink].d = -1; }
if(1 < (last - a)) {
first = a, limit = -3;
} else {
STACK_POP5(ISAd, first, last, limit, trlink);
}
}
} else {
STACK_POP5(ISAd, first, last, limit, trlink);
}
}
continue;
}
if((last - first) <= TR_INSERTIONSORT_THRESHOLD) {
tr_insertionsort(ISAd, first, last);
limit = -3;
continue;
}
if(limit-- == 0) {
tr_heapsort(ISAd, first, last - first);
for(a = last - 1; first < a; a = b) {
for(x = ISAd[*a], b = a - 1; (first <= b) && (ISAd[*b] == x); --b) { *b = ~*b; }
}
limit = -3;
continue;
}
/* choose pivot */
a = tr_pivot(ISAd, first, last);
SWAP(*first, *a);
v = ISAd[*first];
/* partition */
tr_partition(ISAd, first, first + 1, last, &a, &b, v);
if((last - first) != (b - a)) {
next = (ISA[*a] != v) ? tr_ilg(b - a) : -1;
/* update ranks */
for(c = first, v = a - SA - 1; c < a; ++c) { ISA[*c] = v; }
if(b < last) { for(c = a, v = b - SA - 1; c < b; ++c) { ISA[*c] = v; } }
/* push */
if((1 < (b - a)) && (trbudget_check(budget, b - a))) {
if((a - first) <= (last - b)) {
if((last - b) <= (b - a)) {
if(1 < (a - first)) {
STACK_PUSH5(ISAd + incr, a, b, next, trlink);
STACK_PUSH5(ISAd, b, last, limit, trlink);
last = a;
} else if(1 < (last - b)) {
STACK_PUSH5(ISAd + incr, a, b, next, trlink);
first = b;
} else {
ISAd += incr, first = a, last = b, limit = next;
}
} else if((a - first) <= (b - a)) {
if(1 < (a - first)) {
STACK_PUSH5(ISAd, b, last, limit, trlink);
STACK_PUSH5(ISAd + incr, a, b, next, trlink);
last = a;
} else {
STACK_PUSH5(ISAd, b, last, limit, trlink);
ISAd += incr, first = a, last = b, limit = next;
}
} else {
STACK_PUSH5(ISAd, b, last, limit, trlink);
STACK_PUSH5(ISAd, first, a, limit, trlink);
ISAd += incr, first = a, last = b, limit = next;
}
} else {
if((a - first) <= (b - a)) {
if(1 < (last - b)) {
STACK_PUSH5(ISAd + incr, a, b, next, trlink);
STACK_PUSH5(ISAd, first, a, limit, trlink);
first = b;
} else if(1 < (a - first)) {
STACK_PUSH5(ISAd + incr, a, b, next, trlink);
last = a;
} else {
ISAd += incr, first = a, last = b, limit = next;
}
} else if((last - b) <= (b - a)) {
if(1 < (last - b)) {
STACK_PUSH5(ISAd, first, a, limit, trlink);
STACK_PUSH5(ISAd + incr, a, b, next, trlink);
first = b;
} else {
STACK_PUSH5(ISAd, first, a, limit, trlink);
ISAd += incr, first = a, last = b, limit = next;
}
} else {
STACK_PUSH5(ISAd, first, a, limit, trlink);
STACK_PUSH5(ISAd, b, last, limit, trlink);
ISAd += incr, first = a, last = b, limit = next;
}
}
} else {
if((1 < (b - a)) && (0 <= trlink)) { stack[trlink].d = -1; }
if((a - first) <= (last - b)) {
if(1 < (a - first)) {
STACK_PUSH5(ISAd, b, last, limit, trlink);
last = a;
} else if(1 < (last - b)) {
first = b;
} else {
STACK_POP5(ISAd, first, last, limit, trlink);
}
} else {
if(1 < (last - b)) {
STACK_PUSH5(ISAd, first, a, limit, trlink);
first = b;
} else if(1 < (a - first)) {
last = a;
} else {
STACK_POP5(ISAd, first, last, limit, trlink);
}
}
}
} else {
if(trbudget_check(budget, last - first)) {
limit = tr_ilg(last - first), ISAd += incr;
} else {
if(0 <= trlink) { stack[trlink].d = -1; }
STACK_POP5(ISAd, first, last, limit, trlink);
}
}
}
#undef STACK_SIZE
}
/*---------------------------------------------------------------------------*/
/*- Function -*/
/* Tandem repeat sort */
void
trsort(saidx_t *ISA, saidx_t *SA, saidx_t n, saidx_t depth) {
saidx_t *ISAd;
saidx_t *first, *last;
trbudget_t budget;
saidx_t t, skip, unsorted;
trbudget_init(&budget, tr_ilg(n) * 2 / 3, n);
/* trbudget_init(&budget, tr_ilg(n) * 3 / 4, n); */
for(ISAd = ISA + depth; -n < *SA; ISAd += ISAd - ISA) {
first = SA;
skip = 0;
unsorted = 0;
do {
if((t = *first) < 0) { first -= t; skip += t; }
else {
if(skip != 0) { *(first + skip) = skip; skip = 0; }
last = SA + ISA[t] + 1;
if(1 < (last - first)) {
budget.count = 0;
tr_introsort(ISA, ISAd, SA, first, last, &budget);
if(budget.count != 0) { unsorted += budget.count; }
else { skip = first - last; }
} else if((last - first) == 1) {
skip = -1;
}
first = last;
}
} while(first < (SA + n));
if(skip != 0) { *(first + skip) = skip; }
if(unsorted == 0) { break; }
}
}

381
dictBuilder/utils.c Normal file
View File

@ -0,0 +1,381 @@
/*
* utils.c for libdivsufsort
* Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include "divsufsort_private.h"
/*- Private Function -*/
/* Binary search for inverse bwt. */
static
saidx_t
binarysearch_lower(const saidx_t *A, saidx_t size, saidx_t value) {
saidx_t half, i;
for(i = 0, half = size >> 1;
0 < size;
size = half, half >>= 1) {
if(A[i + half] < value) {
i += half + 1;
half -= (size & 1) ^ 1;
}
}
return i;
}
/*- Functions -*/
/* Burrows-Wheeler transform. */
saint_t
bw_transform(const sauchar_t *T, sauchar_t *U, saidx_t *SA,
saidx_t n, saidx_t *idx) {
saidx_t *A, i, j, p, t;
saint_t c;
/* Check arguments. */
if((T == NULL) || (U == NULL) || (n < 0) || (idx == NULL)) { return -1; }
if(n <= 1) {
if(n == 1) { U[0] = T[0]; }
*idx = n;
return 0;
}
if((A = SA) == NULL) {
i = divbwt(T, U, NULL, n);
if(0 <= i) { *idx = i; i = 0; }
return (saint_t)i;
}
/* BW transform. */
if(T == U) {
t = n;
for(i = 0, j = 0; i < n; ++i) {
p = t - 1;
t = A[i];
if(0 <= p) {
c = T[j];
U[j] = (j <= p) ? T[p] : (sauchar_t)A[p];
A[j] = c;
j++;
} else {
*idx = i;
}
}
p = t - 1;
if(0 <= p) {
c = T[j];
U[j] = (j <= p) ? T[p] : (sauchar_t)A[p];
A[j] = c;
} else {
*idx = i;
}
} else {
U[0] = T[n - 1];
for(i = 0; A[i] != 0; ++i) { U[i + 1] = T[A[i] - 1]; }
*idx = i + 1;
for(++i; i < n; ++i) { U[i] = T[A[i] - 1]; }
}
if(SA == NULL) {
/* Deallocate memory. */
free(A);
}
return 0;
}
/* Inverse Burrows-Wheeler transform. */
saint_t
inverse_bw_transform(const sauchar_t *T, sauchar_t *U, saidx_t *A,
saidx_t n, saidx_t idx) {
saidx_t C[ALPHABET_SIZE];
sauchar_t D[ALPHABET_SIZE];
saidx_t *B;
saidx_t i, p;
saint_t c, d;
/* Check arguments. */
if((T == NULL) || (U == NULL) || (n < 0) || (idx < 0) ||
(n < idx) || ((0 < n) && (idx == 0))) {
return -1;
}
if(n <= 1) { return 0; }
if((B = A) == NULL) {
/* Allocate n*sizeof(saidx_t) bytes of memory. */
if((B = (saidx_t *)malloc((size_t)n * sizeof(saidx_t))) == NULL) { return -2; }
}
/* Inverse BW transform. */
for(c = 0; c < ALPHABET_SIZE; ++c) { C[c] = 0; }
for(i = 0; i < n; ++i) { ++C[T[i]]; }
for(c = 0, d = 0, i = 0; c < ALPHABET_SIZE; ++c) {
p = C[c];
if(0 < p) {
C[c] = i;
D[d++] = (sauchar_t)c;
i += p;
}
}
for(i = 0; i < idx; ++i) { B[C[T[i]]++] = i; }
for( ; i < n; ++i) { B[C[T[i]]++] = i + 1; }
for(c = 0; c < d; ++c) { C[c] = C[D[c]]; }
for(i = 0, p = idx; i < n; ++i) {
U[i] = D[binarysearch_lower(C, d, p)];
p = B[p - 1];
}
if(A == NULL) {
/* Deallocate memory. */
free(B);
}
return 0;
}
/* Checks the suffix array SA of the string T. */
saint_t
sufcheck(const sauchar_t *T, const saidx_t *SA,
saidx_t n, saint_t verbose) {
saidx_t C[ALPHABET_SIZE];
saidx_t i, p, q, t;
saint_t c;
if(verbose) { fprintf(stderr, "sufcheck: "); }
/* Check arguments. */
if((T == NULL) || (SA == NULL) || (n < 0)) {
if(verbose) { fprintf(stderr, "Invalid arguments.\n"); }
return -1;
}
if(n == 0) {
if(verbose) { fprintf(stderr, "Done.\n"); }
return 0;
}
/* check range: [0..n-1] */
for(i = 0; i < n; ++i) {
if((SA[i] < 0) || (n <= SA[i])) {
if(verbose) {
fprintf(stderr, "Out of the range [0,%" PRIdSAIDX_T "].\n"
" SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "\n",
n - 1, i, SA[i]);
}
return -2;
}
}
/* check first characters. */
for(i = 1; i < n; ++i) {
if(T[SA[i - 1]] > T[SA[i]]) {
if(verbose) {
fprintf(stderr, "Suffixes in wrong order.\n"
" T[SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "]=%d"
" > T[SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "]=%d\n",
i - 1, SA[i - 1], T[SA[i - 1]], i, SA[i], T[SA[i]]);
}
return -3;
}
}
/* check suffixes. */
for(i = 0; i < ALPHABET_SIZE; ++i) { C[i] = 0; }
for(i = 0; i < n; ++i) { ++C[T[i]]; }
for(i = 0, p = 0; i < ALPHABET_SIZE; ++i) {
t = C[i];
C[i] = p;
p += t;
}
q = C[T[n - 1]];
C[T[n - 1]] += 1;
for(i = 0; i < n; ++i) {
p = SA[i];
if(0 < p) {
c = T[--p];
t = C[c];
} else {
c = T[p = n - 1];
t = q;
}
if((t < 0) || (p != SA[t])) {
if(verbose) {
fprintf(stderr, "Suffix in wrong position.\n"
" SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T " or\n"
" SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "\n",
t, (0 <= t) ? SA[t] : -1, i, SA[i]);
}
return -4;
}
if(t != q) {
++C[c];
if((n <= C[c]) || (T[SA[C[c]]] != c)) { C[c] = -1; }
}
}
if(1 <= verbose) { fprintf(stderr, "Done.\n"); }
return 0;
}
static
int
_compare(const sauchar_t *T, saidx_t Tsize,
const sauchar_t *P, saidx_t Psize,
saidx_t suf, saidx_t *match) {
saidx_t i, j;
saint_t r;
for(i = suf + *match, j = *match, r = 0;
(i < Tsize) && (j < Psize) && ((r = T[i] - P[j]) == 0); ++i, ++j) { }
*match = j;
return (r == 0) ? -(j != Psize) : r;
}
/* Search for the pattern P in the string T. */
saidx_t
sa_search(const sauchar_t *T, saidx_t Tsize,
const sauchar_t *P, saidx_t Psize,
const saidx_t *SA, saidx_t SAsize,
saidx_t *idx) {
saidx_t size, lsize, rsize, half;
saidx_t match, lmatch, rmatch;
saidx_t llmatch, lrmatch, rlmatch, rrmatch;
saidx_t i, j, k;
saint_t r;
if(idx != NULL) { *idx = -1; }
if((T == NULL) || (P == NULL) || (SA == NULL) ||
(Tsize < 0) || (Psize < 0) || (SAsize < 0)) { return -1; }
if((Tsize == 0) || (SAsize == 0)) { return 0; }
if(Psize == 0) { if(idx != NULL) { *idx = 0; } return SAsize; }
for(i = j = k = 0, lmatch = rmatch = 0, size = SAsize, half = size >> 1;
0 < size;
size = half, half >>= 1) {
match = MIN(lmatch, rmatch);
r = _compare(T, Tsize, P, Psize, SA[i + half], &match);
if(r < 0) {
i += half + 1;
half -= (size & 1) ^ 1;
lmatch = match;
} else if(r > 0) {
rmatch = match;
} else {
lsize = half, j = i, rsize = size - half - 1, k = i + half + 1;
/* left part */
for(llmatch = lmatch, lrmatch = match, half = lsize >> 1;
0 < lsize;
lsize = half, half >>= 1) {
lmatch = MIN(llmatch, lrmatch);
r = _compare(T, Tsize, P, Psize, SA[j + half], &lmatch);
if(r < 0) {
j += half + 1;
half -= (lsize & 1) ^ 1;
llmatch = lmatch;
} else {
lrmatch = lmatch;
}
}
/* right part */
for(rlmatch = match, rrmatch = rmatch, half = rsize >> 1;
0 < rsize;
rsize = half, half >>= 1) {
rmatch = MIN(rlmatch, rrmatch);
r = _compare(T, Tsize, P, Psize, SA[k + half], &rmatch);
if(r <= 0) {
k += half + 1;
half -= (rsize & 1) ^ 1;
rlmatch = rmatch;
} else {
rrmatch = rmatch;
}
}
break;
}
}
if(idx != NULL) { *idx = (0 < (k - j)) ? j : i; }
return k - j;
}
/* Search for the character c in the string T. */
saidx_t
sa_simplesearch(const sauchar_t *T, saidx_t Tsize,
const saidx_t *SA, saidx_t SAsize,
saint_t c, saidx_t *idx) {
saidx_t size, lsize, rsize, half;
saidx_t i, j, k, p;
saint_t r;
if(idx != NULL) { *idx = -1; }
if((T == NULL) || (SA == NULL) || (Tsize < 0) || (SAsize < 0)) { return -1; }
if((Tsize == 0) || (SAsize == 0)) { return 0; }
for(i = j = k = 0, size = SAsize, half = size >> 1;
0 < size;
size = half, half >>= 1) {
p = SA[i + half];
r = (p < Tsize) ? T[p] - c : -1;
if(r < 0) {
i += half + 1;
half -= (size & 1) ^ 1;
} else if(r == 0) {
lsize = half, j = i, rsize = size - half - 1, k = i + half + 1;
/* left part */
for(half = lsize >> 1;
0 < lsize;
lsize = half, half >>= 1) {
p = SA[j + half];
r = (p < Tsize) ? T[p] - c : -1;
if(r < 0) {
j += half + 1;
half -= (lsize & 1) ^ 1;
}
}
/* right part */
for(half = rsize >> 1;
0 < rsize;
rsize = half, half >>= 1) {
p = SA[k + half];
r = (p < Tsize) ? T[p] - c : -1;
if(r <= 0) {
k += half + 1;
half -= (rsize & 1) ^ 1;
}
}
break;
}
}
if(idx != NULL) { *idx = (0 < (k - j)) ? j : i; }
return k - j;
}

View File

@ -53,7 +53,7 @@ LIBDIR ?= $(PREFIX)/lib
INCLUDEDIR=$(PREFIX)/include
ZSTD_FILES := zstd_compress.c zstd_decompress.c fse.c huff0.c
ZSTD_LEGACY:= legacy/zstd_v01.c legacy/zstd_v02.c legacy/zstd_v03.c
ZSTD_LEGACY:= legacy/zstd_v01.c legacy/zstd_v02.c legacy/zstd_v03.c legacy/zstd_v04.c
ifeq ($(ZSTD_LEGACY_SUPPORT), 0)
CPPFLAGS += -DZSTD_LEGACY_SUPPORT=0

View File

@ -1,8 +1,8 @@
/* ******************************************************************
bitstream
Part of NewGen Entropy library
Part of FSE library
header file (to include)
Copyright (C) 2013-2015, Yann Collet.
Copyright (C) 2013-2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -31,7 +31,6 @@
You can contact the author at :
- Source repository : https://github.com/Cyan4973/FiniteStateEntropy
- Public forum : https://groups.google.com/forum/#!forum/lz4c
****************************************************************** */
#ifndef BITSTREAM_H_MODULE
#define BITSTREAM_H_MODULE
@ -47,17 +46,17 @@ extern "C" {
* these functions are defined into a .h to be included.
*/
/******************************************
* Includes
/*-****************************************
* Dependencies
******************************************/
#include "mem.h" /* unaligned access routines */
#include "error_private.h" /* error codes and messages */
/********************************************
* bitStream compression API (write forward)
/*-******************************************
* bitStream encoding API (write forward)
********************************************/
/*
/*!
* bitStream can mix input from multiple sources.
* A critical property of these streams is that they encode and decode in **reverse** direction.
* So the first bit sequence you add will be the last to be read, like a LIFO stack.
@ -71,32 +70,32 @@ typedef struct
char* endPtr;
} BIT_CStream_t;
MEM_STATIC size_t BIT_initCStream(BIT_CStream_t* bitC, void* dstBuffer, size_t maxDstSize);
MEM_STATIC size_t BIT_initCStream(BIT_CStream_t* bitC, void* dstBuffer, size_t dstCapacity);
MEM_STATIC void BIT_addBits(BIT_CStream_t* bitC, size_t value, unsigned nbBits);
MEM_STATIC void BIT_flushBits(BIT_CStream_t* bitC);
MEM_STATIC size_t BIT_closeCStream(BIT_CStream_t* bitC);
/*
* Start by initCStream, providing the maximum size of write buffer to write into.
/*!
* Start by initCStream, providing the size of buffer to write into.
* bitStream will never write outside of this buffer.
* buffer must be at least as large as a size_t, otherwise function result will be an error code.
* @dstCapacity must be >= sizeof(size_t), otherwise @return will be an error code.
*
* bits are first added to a local register.
* Local register is size_t, hence 64-bits on 64-bits systems, or 32-bits on 32-bits systems.
* Writing data into memory is a manual operation, performed by the flushBits function.
* Writing data into memory is an explicit operation, performed by the flushBits function.
* Hence keep track how many bits are potentially stored into local register to avoid register overflow.
* After a flushBits, a maximum of 7 bits might still be stored into local register.
*
* Avoid storing elements of more than 25 bits if you want compatibility with 32-bits bitstream readers.
* Avoid storing elements of more than 24 bits if you want compatibility with 32-bits bitstream readers.
*
* Last operation is to close the bitStream.
* The function returns the final size of CStream in bytes.
* If data couldn't fit into dstBuffer, it will return a 0 ( == not storable)
* If data couldn't fit into @dstBuffer, it will return a 0 ( == not storable)
*/
/**********************************************
* bitStream decompression API (read backward)
/*-********************************************
* bitStream decoding API (read backward)
**********************************************/
typedef struct
{
@ -118,19 +117,19 @@ MEM_STATIC BIT_DStream_status BIT_reloadDStream(BIT_DStream_t* bitD);
MEM_STATIC unsigned BIT_endOfDStream(const BIT_DStream_t* bitD);
/*
/*!
* Start by invoking BIT_initDStream().
* A chunk of the bitStream is then stored into a local register.
* Local register size is 64-bits on 64-bits systems, 32-bits on 32-bits systems (size_t).
* You can then retrieve bitFields stored into the local register, **in reverse order**.
* Local register is manually filled from memory by the BIT_reloadDStream() method.
* Local register is explicitly reloaded from memory by the BIT_reloadDStream() method.
* A reload guarantee a minimum of ((8*sizeof(size_t))-7) bits when its result is BIT_DStream_unfinished.
* Otherwise, it can be less than that, so proceed accordingly.
* Checking if DStream has reached its end can be performed with BIT_endOfDStream()
*/
/******************************************
/*-****************************************
* unsafe API
******************************************/
MEM_STATIC void BIT_addBitsFast(BIT_CStream_t* bitC, size_t value, unsigned nbBits);
@ -144,7 +143,7 @@ MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, unsigned nbBits);
/****************************************************************
/*-**************************************************************
* Helper functions
****************************************************************/
MEM_STATIC unsigned BIT_highbit32 (register U32 val)
@ -170,10 +169,9 @@ MEM_STATIC unsigned BIT_highbit32 (register U32 val)
}
/****************************************************************
/*-**************************************************************
* bitStream encoding
****************************************************************/
MEM_STATIC size_t BIT_initCStream(BIT_CStream_t* bitC, void* startPtr, size_t maxSize)
{
bitC->bitContainer = 0;
@ -240,10 +238,9 @@ MEM_STATIC size_t BIT_closeCStream(BIT_CStream_t* bitC)
}
/**********************************************************
/*-********************************************************
* bitStream decoding
**********************************************************/
/*!BIT_initDStream
* Initialize a BIT_DStream_t.
* @bitD : a pointer to an already allocated BIT_DStream_t structure
@ -255,8 +252,7 @@ MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, si
{
if (srcSize < 1) { memset(bitD, 0, sizeof(*bitD)); return ERROR(srcSize_wrong); }
if (srcSize >= sizeof(size_t)) /* normal case */
{
if (srcSize >= sizeof(size_t)) { /* normal case */
U32 contain32;
bitD->start = (const char*)srcBuffer;
bitD->ptr = (const char*)srcBuffer + srcSize - sizeof(size_t);
@ -264,9 +260,7 @@ MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, si
contain32 = ((const BYTE*)srcBuffer)[srcSize-1];
if (contain32 == 0) return ERROR(GENERIC); /* endMark not present */
bitD->bitsConsumed = 8 - BIT_highbit32(contain32);
}
else
{
} else {
U32 contain32;
bitD->start = (const char*)srcBuffer;
bitD->ptr = bitD->start;
@ -342,23 +336,20 @@ MEM_STATIC BIT_DStream_status BIT_reloadDStream(BIT_DStream_t* bitD)
if (bitD->bitsConsumed > (sizeof(bitD->bitContainer)*8)) /* should never happen */
return BIT_DStream_overflow;
if (bitD->ptr >= bitD->start + sizeof(bitD->bitContainer))
{
if (bitD->ptr >= bitD->start + sizeof(bitD->bitContainer)) {
bitD->ptr -= bitD->bitsConsumed >> 3;
bitD->bitsConsumed &= 7;
bitD->bitContainer = MEM_readLEST(bitD->ptr);
return BIT_DStream_unfinished;
}
if (bitD->ptr == bitD->start)
{
if (bitD->ptr == bitD->start) {
if (bitD->bitsConsumed < sizeof(bitD->bitContainer)*8) return BIT_DStream_endOfBuffer;
return BIT_DStream_completed;
}
{
U32 nbBytes = bitD->bitsConsumed >> 3;
BIT_DStream_status result = BIT_DStream_unfinished;
if (bitD->ptr - nbBytes < bitD->start)
{
if (bitD->ptr - nbBytes < bitD->start) {
nbBytes = (U32)(bitD->ptr - bitD->start); /* ptr > start */
result = BIT_DStream_endOfBuffer;
}

View File

@ -40,14 +40,14 @@ extern "C" {
#endif
/* *****************************************
* Includes
/* ****************************************
* Dependencies
******************************************/
#include <stddef.h> /* size_t, ptrdiff_t */
#include <stddef.h> /* size_t */
#include "error_public.h" /* enum list */
/* *****************************************
/* ****************************************
* Compiler-specific
******************************************/
#if defined(__GNUC__)
@ -61,43 +61,52 @@ extern "C" {
#endif
/* *****************************************
* Error Codes
/*-****************************************
* Customization
******************************************/
typedef ZSTD_ErrorCode ERR_enum;
#define PREFIX(name) ZSTD_error_##name
/*-****************************************
* Error codes handling
******************************************/
#ifdef ERROR
# undef ERROR /* reported already defined on VS 2015 by Rich Geldreich */
# undef ERROR /* reported already defined on VS 2015 (Rich Geldreich) */
#endif
#define ERROR(name) (size_t)-PREFIX(name)
ERR_STATIC unsigned ERR_isError(size_t code) { return (code > ERROR(maxCode)); }
ERR_STATIC ERR_enum ERR_getError(size_t code) { if (!ERR_isError(code)) return (ERR_enum)0; return (ERR_enum) (0-code); }
/* *****************************************
/*-****************************************
* Error Strings
******************************************/
ERR_STATIC const char* ERR_getErrorName(size_t code)
{
static const char* codeError = "Unspecified error code";
switch( (size_t)(0-code) )
static const char* notErrorCode = "Unspecified error code";
switch( ERR_getError(code) )
{
case ZSTD_error_No_Error: return "No error detected";
case ZSTD_error_GENERIC: return "Error (generic)";
case ZSTD_error_prefix_unknown: return "Unknown frame descriptor";
case ZSTD_error_frameParameter_unsupported: return "Unsupported frame parameter";
case ZSTD_error_frameParameter_unsupportedBy32bitsImplementation: return "Frame parameter unsupported in 32-bits mode";
case ZSTD_error_init_missing: return "Context should be init first";
case ZSTD_error_memory_allocation: return "Allocation error : not enough memory";
case ZSTD_error_dstSize_tooSmall: return "Destination buffer is too small";
case ZSTD_error_srcSize_wrong: return "Src size incorrect";
case ZSTD_error_corruption_detected: return "Corrupted block detected";
case ZSTD_error_tableLog_tooLarge: return "tableLog requires too much memory";
case ZSTD_error_maxSymbolValue_tooLarge: return "Unsupported max possible Symbol Value : too large";
case ZSTD_error_maxSymbolValue_tooSmall: return "Specified maxSymbolValue is too small";
case ZSTD_error_maxCode:
default: return codeError;
case PREFIX(no_error): return "No error detected";
case PREFIX(GENERIC): return "Error (generic)";
case PREFIX(prefix_unknown): return "Unknown frame descriptor";
case PREFIX(frameParameter_unsupported): return "Unsupported frame parameter";
case PREFIX(frameParameter_unsupportedBy32bits): return "Frame parameter unsupported in 32-bits mode";
case PREFIX(init_missing): return "Context should be init first";
case PREFIX(memory_allocation): return "Allocation error : not enough memory";
case PREFIX(stage_wrong): return "Operation not authorized at current processing stage";
case PREFIX(dstSize_tooSmall): return "Destination buffer is too small";
case PREFIX(srcSize_wrong): return "Src size incorrect";
case PREFIX(corruption_detected): return "Corrupted block detected";
case PREFIX(tableLog_tooLarge): return "tableLog requires too much memory";
case PREFIX(maxSymbolValue_tooLarge): return "Unsupported max possible Symbol Value : too large";
case PREFIX(maxSymbolValue_tooSmall): return "Specified maxSymbolValue is too small";
case PREFIX(dictionary_corrupted): return "Dictionary is corrupted";
case PREFIX(maxCode):
default: return notErrorCode; /* should be impossible, due to ERR_getError() */
}
}

View File

@ -39,14 +39,14 @@ extern "C" {
/* ****************************************
* error list
* error codes list
******************************************/
enum {
ZSTD_error_No_Error,
typedef enum {
ZSTD_error_no_error,
ZSTD_error_GENERIC,
ZSTD_error_prefix_unknown,
ZSTD_error_frameParameter_unsupported,
ZSTD_error_frameParameter_unsupportedBy32bitsImplementation,
ZSTD_error_frameParameter_unsupportedBy32bits,
ZSTD_error_init_missing,
ZSTD_error_memory_allocation,
ZSTD_error_stage_wrong,
@ -56,8 +56,9 @@ enum {
ZSTD_error_tableLog_tooLarge,
ZSTD_error_maxSymbolValue_tooLarge,
ZSTD_error_maxSymbolValue_tooSmall,
ZSTD_error_dictionary_corrupted,
ZSTD_error_maxCode
};
} ZSTD_ErrorCode;
/* note : functions provide error codes in reverse negative order,
so compare with (size_t)(0-enum) */

478
lib/fse.c
View File

@ -141,102 +141,6 @@ typedef U32 DTable_max_t[FSE_DTABLE_SIZE_U32(FSE_MAX_TABLELOG)];
/* Function templates */
size_t FSE_count_generic(unsigned* count, unsigned* maxSymbolValuePtr, const FSE_FUNCTION_TYPE* source, size_t sourceSize, unsigned safe)
{
const FSE_FUNCTION_TYPE* ip = source;
const FSE_FUNCTION_TYPE* const iend = ip+sourceSize;
unsigned maxSymbolValue = *maxSymbolValuePtr;
unsigned max=0;
int s;
U32 Counting1[FSE_MAX_SYMBOL_VALUE+1] = { 0 };
U32 Counting2[FSE_MAX_SYMBOL_VALUE+1] = { 0 };
U32 Counting3[FSE_MAX_SYMBOL_VALUE+1] = { 0 };
U32 Counting4[FSE_MAX_SYMBOL_VALUE+1] = { 0 };
/* safety checks */
if (!sourceSize)
{
memset(count, 0, (maxSymbolValue + 1) * sizeof(FSE_FUNCTION_TYPE));
*maxSymbolValuePtr = 0;
return 0;
}
if (maxSymbolValue > FSE_MAX_SYMBOL_VALUE) return ERROR(GENERIC); /* maxSymbolValue too large : unsupported */
if (!maxSymbolValue) maxSymbolValue = FSE_MAX_SYMBOL_VALUE; /* 0 == default */
if ((safe) || (sizeof(FSE_FUNCTION_TYPE)>1))
{
/* check input values, to avoid count table overflow */
while (ip < iend-3)
{
if (*ip>maxSymbolValue) return ERROR(GENERIC); Counting1[*ip++]++;
if (*ip>maxSymbolValue) return ERROR(GENERIC); Counting2[*ip++]++;
if (*ip>maxSymbolValue) return ERROR(GENERIC); Counting3[*ip++]++;
if (*ip>maxSymbolValue) return ERROR(GENERIC); Counting4[*ip++]++;
}
}
else
{
U32 cached = MEM_read32(ip); ip += 4;
while (ip < iend-15)
{
U32 c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
}
ip-=4;
}
/* finish last symbols */
while (ip<iend) { if ((safe) && (*ip>maxSymbolValue)) return ERROR(GENERIC); Counting1[*ip++]++; }
for (s=0; s<=(int)maxSymbolValue; s++)
{
count[s] = Counting1[s] + Counting2[s] + Counting3[s] + Counting4[s];
if (count[s] > max) max = count[s];
}
while (!count[maxSymbolValue]) maxSymbolValue--;
*maxSymbolValuePtr = maxSymbolValue;
return (size_t)max;
}
/* hidden fast variant (unsafe) */
size_t FSE_FUNCTION_NAME(FSE_countFast, FSE_FUNCTION_EXTENSION)
(unsigned* count, unsigned* maxSymbolValuePtr, const FSE_FUNCTION_TYPE* source, size_t sourceSize)
{
return FSE_count_generic(count, maxSymbolValuePtr, source, sourceSize, 0);
}
size_t FSE_FUNCTION_NAME(FSE_count, FSE_FUNCTION_EXTENSION)
(unsigned* count, unsigned* maxSymbolValuePtr, const FSE_FUNCTION_TYPE* source, size_t sourceSize)
{
if ((sizeof(FSE_FUNCTION_TYPE)==1) && (*maxSymbolValuePtr >= 255))
{
*maxSymbolValuePtr = 255;
return FSE_count_generic(count, maxSymbolValuePtr, source, sourceSize, 0);
}
return FSE_count_generic(count, maxSymbolValuePtr, source, sourceSize, 1);
}
static U32 FSE_tableStep(U32 tableSize) { return (tableSize>>1) + (tableSize>>3) + 3; }
size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog)
@ -264,35 +168,28 @@ size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned
/* symbol start positions */
cumul[0] = 0;
for (i=1; i<=maxSymbolValue+1; i++)
{
if (normalizedCounter[i-1]==-1) /* Low proba symbol */
{
for (i=1; i<=maxSymbolValue+1; i++) {
if (normalizedCounter[i-1]==-1) { /* Low proba symbol */
cumul[i] = cumul[i-1] + 1;
tableSymbol[highThreshold--] = (FSE_FUNCTION_TYPE)(i-1);
}
else
} else {
cumul[i] = cumul[i-1] + normalizedCounter[i-1];
}
} }
cumul[maxSymbolValue+1] = tableSize+1;
/* Spread symbols */
for (symbol=0; symbol<=maxSymbolValue; symbol++)
{
for (symbol=0; symbol<=maxSymbolValue; symbol++) {
int nbOccurences;
for (nbOccurences=0; nbOccurences<normalizedCounter[symbol]; nbOccurences++)
{
for (nbOccurences=0; nbOccurences<normalizedCounter[symbol]; nbOccurences++) {
tableSymbol[position] = (FSE_FUNCTION_TYPE)symbol;
position = (position + step) & tableMask;
while (position > highThreshold) position = (position + step) & tableMask; /* Low proba area */
}
}
} }
if (position!=0) return ERROR(GENERIC); /* Must have gone through all positions */
/* Build table */
for (i=0; i<tableSize; i++)
{
for (i=0; i<tableSize; i++) {
FSE_FUNCTION_TYPE s = tableSymbol[i]; /* note : static analyzer may not understand tableSymbol is properly initialized */
tableU16[cumul[s]++] = (U16) (tableSize+i); /* TableU16 : sorted by symbol order; gives next state value */
}
@ -301,15 +198,14 @@ size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned
{
unsigned s;
unsigned total = 0;
for (s=0; s<=maxSymbolValue; s++)
{
for (s=0; s<=maxSymbolValue; s++) {
switch (normalizedCounter[s])
{
case 0:
break;
case -1:
case 1:
symbolTT[s].deltaNbBits = tableLog << 16;
symbolTT[s].deltaNbBits = (tableLog << 16) - (1<<tableLog);
symbolTT[s].deltaFindState = total - 1;
total ++;
break;
@ -320,10 +216,7 @@ size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned
symbolTT[s].deltaNbBits = (maxBitsOut << 16) - minStatePlus;
symbolTT[s].deltaFindState = total - normalizedCounter[s];
total += normalizedCounter[s];
}
}
}
}
} } } }
return 0;
}
@ -361,45 +254,35 @@ size_t FSE_buildDTable(FSE_DTable* dt, const short* normalizedCounter, unsigned
/* Init, lay down lowprob symbols */
DTableH.tableLog = (U16)tableLog;
for (s=0; s<=maxSymbolValue; s++)
{
if (normalizedCounter[s]==-1)
{
for (s=0; s<=maxSymbolValue; s++) {
if (normalizedCounter[s]==-1) {
tableDecode[highThreshold--].symbol = (FSE_FUNCTION_TYPE)s;
symbolNext[s] = 1;
}
else
{
} else {
if (normalizedCounter[s] >= largeLimit) noLarge=0;
symbolNext[s] = normalizedCounter[s];
}
}
} }
/* Spread symbols */
for (s=0; s<=maxSymbolValue; s++)
{
for (s=0; s<=maxSymbolValue; s++) {
int i;
for (i=0; i<normalizedCounter[s]; i++)
{
for (i=0; i<normalizedCounter[s]; i++) {
tableDecode[position].symbol = (FSE_FUNCTION_TYPE)s;
position = (position + step) & tableMask;
while (position > highThreshold) position = (position + step) & tableMask; /* lowprob area */
}
}
} }
if (position!=0) return ERROR(GENERIC); /* position must reach all cells once, otherwise normalizedCounter is incorrect */
/* Build Decoding table */
{
U32 i;
for (i=0; i<tableSize; i++)
{
for (i=0; i<tableSize; i++) {
FSE_FUNCTION_TYPE symbol = (FSE_FUNCTION_TYPE)(tableDecode[i].symbol);
U16 nextState = symbolNext[symbol]++;
tableDecode[i].nbBits = (BYTE) (tableLog - BIT_highbit32 ((U32)nextState) );
tableDecode[i].newState = (U16) ( (nextState << tableDecode[i].nbBits) - tableSize);
}
}
} }
DTableH.fastMode = (U16)noLarge;
memcpy(dt, &DTableH, sizeof(DTableH));
@ -408,7 +291,7 @@ size_t FSE_buildDTable(FSE_DTable* dt, const short* normalizedCounter, unsigned
#ifndef FSE_COMMONDEFS_ONLY
/******************************************
/*-****************************************
* FSE helper functions
******************************************/
unsigned FSE_isError(size_t code) { return ERR_isError(code); }
@ -416,7 +299,7 @@ unsigned FSE_isError(size_t code) { return ERR_isError(code); }
const char* FSE_getErrorName(size_t code) { return ERR_getErrorName(code); }
/****************************************************************
/*-**************************************************************
* FSE NCount encoding-decoding
****************************************************************/
size_t FSE_NCountWriteBound(unsigned maxSymbolValue, unsigned tableLog)
@ -425,10 +308,7 @@ size_t FSE_NCountWriteBound(unsigned maxSymbolValue, unsigned tableLog)
return maxSymbolValue ? maxHeaderSize : FSE_NCOUNTBOUND; /* maxSymbolValue==0 ? use default */
}
static short FSE_abs(short a)
{
return a<0 ? -a : a;
}
static short FSE_abs(short a) { return a<0 ? -a : a; }
static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog,
@ -457,14 +337,11 @@ static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
threshold = tableSize;
nbBits = tableLog+1;
while (remaining>1) /* stops at 1 */
{
if (previous0)
{
while (remaining>1) { /* stops at 1 */
if (previous0) {
unsigned start = charnum;
while (!normalizedCounter[charnum]) charnum++;
while (charnum >= start+24)
{
while (charnum >= start+24) {
start+=24;
bitStream += 0xFFFFU << bitCount;
if ((!writeIsSafe) && (out > oend-2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
@ -473,24 +350,21 @@ static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
out+=2;
bitStream>>=16;
}
while (charnum >= start+3)
{
while (charnum >= start+3) {
start+=3;
bitStream += 3 << bitCount;
bitCount += 2;
}
bitStream += (charnum-start) << bitCount;
bitCount += 2;
if (bitCount>16)
{
if (bitCount>16) {
if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
out[0] = (BYTE)bitStream;
out[1] = (BYTE)(bitStream>>8);
out += 2;
bitStream >>= 16;
bitCount -= 16;
}
}
} }
{
short count = normalizedCounter[charnum++];
const short max = (short)((2*threshold-1)-remaining);
@ -504,16 +378,14 @@ static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
previous0 = (count==1);
while (remaining<threshold) nbBits--, threshold>>=1;
}
if (bitCount>16)
{
if (bitCount>16) {
if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
out[0] = (BYTE)bitStream;
out[1] = (BYTE)(bitStream>>8);
out += 2;
bitStream >>= 16;
bitCount -= 16;
}
}
} }
/* flush remaining bitStream */
if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
@ -564,27 +436,19 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
threshold = 1<<nbBits;
nbBits++;
while ((remaining>1) && (charnum<=*maxSVPtr))
{
if (previous0)
{
while ((remaining>1) && (charnum<=*maxSVPtr)) {
if (previous0) {
unsigned n0 = charnum;
while ((bitStream & 0xFFFF) == 0xFFFF)
{
while ((bitStream & 0xFFFF) == 0xFFFF) {
n0+=24;
if (ip < iend-5)
{
if (ip < iend-5) {
ip+=2;
bitStream = MEM_readLE32(ip) >> bitCount;
}
else
{
} else {
bitStream >>= 16;
bitCount+=16;
}
}
while ((bitStream & 3) == 3)
{
} }
while ((bitStream & 3) == 3) {
n0+=3;
bitStream>>=2;
bitCount+=2;
@ -593,8 +457,7 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
bitCount += 2;
if (n0 > *maxSVPtr) return ERROR(maxSymbolValue_tooSmall);
while (charnum < n0) normalizedCounter[charnum++] = 0;
if ((ip <= iend-7) || (ip + (bitCount>>3) <= iend-4))
{
if ((ip <= iend-7) || (ip + (bitCount>>3) <= iend-4)) {
ip += bitCount>>3;
bitCount &= 7;
bitStream = MEM_readLE32(ip) >> bitCount;
@ -606,13 +469,10 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
const short max = (short)((2*threshold-1)-remaining);
short count;
if ((bitStream & (threshold-1)) < (U32)max)
{
if ((bitStream & (threshold-1)) < (U32)max) {
count = (short)(bitStream & (threshold-1));
bitCount += nbBits-1;
}
else
{
} else {
count = (short)(bitStream & (2*threshold-1));
if (count >= threshold) count -= max;
bitCount += nbBits;
@ -622,27 +482,20 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
remaining -= FSE_abs(count);
normalizedCounter[charnum++] = count;
previous0 = !count;
while (remaining < threshold)
{
while (remaining < threshold) {
nbBits--;
threshold >>= 1;
}
{
if ((ip <= iend-7) || (ip + (bitCount>>3) <= iend-4))
{
ip += bitCount>>3;
bitCount &= 7;
}
else
{
bitCount -= (int)(8 * (iend - 4 - ip));
ip = iend - 4;
}
bitStream = MEM_readLE32(ip) >> (bitCount & 31);
if ((ip <= iend-7) || (ip + (bitCount>>3) <= iend-4)) {
ip += bitCount>>3;
bitCount &= 7;
} else {
bitCount -= (int)(8 * (iend - 4 - ip));
ip = iend - 4;
}
}
}
bitStream = MEM_readLE32(ip) >> (bitCount & 31);
} }
if (remaining != 1) return ERROR(GENERIC);
*maxSVPtr = charnum-1;
@ -652,10 +505,130 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
}
/****************************************************************
/*-**************************************************************
* Counting histogram
****************************************************************/
/*! FSE_count_simple
This function just counts byte values within @src,
and store the histogram into @count.
This function is unsafe : it doesn't check that all values within @src can fit into @count.
For this reason, prefer using a table @count with 256 elements.
@return : highest count for a single element
*/
static size_t FSE_count_simple(unsigned* count, unsigned* maxSymbolValuePtr,
const void* src, size_t srcSize)
{
const BYTE* ip = (const BYTE*)src;
const BYTE* const end = ip + srcSize;
unsigned maxSymbolValue = *maxSymbolValuePtr;
unsigned max=0;
U32 s;
memset(count, 0, (maxSymbolValue+1)*sizeof(*count));
if (srcSize==0) { *maxSymbolValuePtr = 0; return 0; }
while (ip<end) count[*ip++]++;
while (!count[maxSymbolValue]) maxSymbolValue--;
*maxSymbolValuePtr = maxSymbolValue;
for (s=0; s<=maxSymbolValue; s++) if (count[s] > max) max = count[s];
return (size_t)max;
}
static size_t FSE_count_parallel(unsigned* count, unsigned* maxSymbolValuePtr,
const void* source, size_t sourceSize,
unsigned checkMax)
{
const BYTE* ip = (const BYTE*)source;
const BYTE* const iend = ip+sourceSize;
unsigned maxSymbolValue = *maxSymbolValuePtr;
unsigned max=0;
U32 s;
U32 Counting1[256] = { 0 };
U32 Counting2[256] = { 0 };
U32 Counting3[256] = { 0 };
U32 Counting4[256] = { 0 };
/* safety checks */
if (!sourceSize) {
memset(count, 0, maxSymbolValue + 1);
*maxSymbolValuePtr = 0;
return 0;
}
if (!maxSymbolValue) maxSymbolValue = 255; /* 0 == default */
{ /* by stripes of 16 bytes */
U32 cached = MEM_read32(ip); ip += 4;
while (ip < iend-15) {
U32 c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
c = cached; cached = MEM_read32(ip); ip += 4;
Counting1[(BYTE) c ]++;
Counting2[(BYTE)(c>>8) ]++;
Counting3[(BYTE)(c>>16)]++;
Counting4[ c>>24 ]++;
}
ip-=4;
}
/* finish last symbols */
while (ip<iend) Counting1[*ip++]++;
if (checkMax) { /* verify stats will fit into destination table */
for (s=255; s>maxSymbolValue; s--) {
Counting1[s] += Counting2[s] + Counting3[s] + Counting4[s];
if (Counting1[s]) return ERROR(maxSymbolValue_tooSmall);
} }
for (s=0; s<=maxSymbolValue; s++) {
count[s] = Counting1[s] + Counting2[s] + Counting3[s] + Counting4[s];
if (count[s] > max) max = count[s];
}
while (!count[maxSymbolValue]) maxSymbolValue--;
*maxSymbolValuePtr = maxSymbolValue;
return (size_t)max;
}
/* fast variant (unsafe : won't check if src contains values beyond count[] limit) */
size_t FSE_countFast(unsigned* count, unsigned* maxSymbolValuePtr,
const void* source, size_t sourceSize)
{
if (sourceSize < 1500) return FSE_count_simple(count, maxSymbolValuePtr, source, sourceSize);
return FSE_count_parallel(count, maxSymbolValuePtr, source, sourceSize, 0);
}
size_t FSE_count(unsigned* count, unsigned* maxSymbolValuePtr,
const void* source, size_t sourceSize)
{
if (*maxSymbolValuePtr <255)
return FSE_count_parallel(count, maxSymbolValuePtr, source, sourceSize, 1);
*maxSymbolValuePtr = 255;
return FSE_countFast(count, maxSymbolValuePtr, source, sourceSize);
}
/*-**************************************************************
* FSE Compression Code
****************************************************************/
/*
/*!
FSE_CTable is a variable size structure which contains :
U16 tableLog;
U16 maxSymbolValue;
@ -686,7 +659,6 @@ void FSE_freeCTable (FSE_CTable* ct)
free(ct);
}
/* provides the minimum logSize to safely represent a distribution */
static unsigned FSE_minTableLog(size_t srcSize, unsigned maxSymbolValue)
{
@ -723,22 +695,18 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count,
U32 lowThreshold = (U32)(total >> tableLog);
U32 lowOne = (U32)((total * 3) >> (tableLog + 1));
for (s=0; s<=maxSymbolValue; s++)
{
if (count[s] == 0)
{
for (s=0; s<=maxSymbolValue; s++) {
if (count[s] == 0) {
norm[s]=0;
continue;
}
if (count[s] <= lowThreshold)
{
if (count[s] <= lowThreshold) {
norm[s] = -1;
distributed++;
total -= count[s];
continue;
}
if (count[s] <= lowOne)
{
if (count[s] <= lowOne) {
norm[s] = 1;
distributed++;
total -= count[s];
@ -748,25 +716,20 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count,
}
ToDistribute = (1 << tableLog) - distributed;
if ((total / ToDistribute) > lowOne)
{
if ((total / ToDistribute) > lowOne) {
/* risk of rounding to zero */
lowOne = (U32)((total * 3) / (ToDistribute * 2));
for (s=0; s<=maxSymbolValue; s++)
{
if ((norm[s] == -2) && (count[s] <= lowOne))
{
for (s=0; s<=maxSymbolValue; s++) {
if ((norm[s] == -2) && (count[s] <= lowOne)) {
norm[s] = 1;
distributed++;
total -= count[s];
continue;
}
}
} }
ToDistribute = (1 << tableLog) - distributed;
}
if (distributed == maxSymbolValue+1)
{
if (distributed == maxSymbolValue+1) {
/* all values are pretty poor;
probably incompressible data (should have already been detected);
find max, then give all remaining points to max */
@ -782,10 +745,8 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count,
U64 const mid = (1ULL << (vStepLog-1)) - 1;
U64 const rStep = ((((U64)1<<vStepLog) * ToDistribute) + mid) / total; /* scale on remaining */
U64 tmpTotal = mid;
for (s=0; s<=maxSymbolValue; s++)
{
if (norm[s]==-2)
{
for (s=0; s<=maxSymbolValue; s++) {
if (norm[s]==-2) {
U64 end = tmpTotal + (count[s] * rStep);
U32 sStart = (U32)(tmpTotal >> vStepLog);
U32 sEnd = (U32)(end >> vStepLog);
@ -794,9 +755,7 @@ static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count,
return ERROR(GENERIC);
norm[s] = (short)weight;
tmpTotal = end;
}
}
}
} } }
return 0;
}
@ -809,7 +768,7 @@ size_t FSE_normalizeCount (short* normalizedCounter, unsigned tableLog,
/* Sanity checks */
if (tableLog==0) tableLog = FSE_DEFAULT_TABLELOG;
if (tableLog < FSE_MIN_TABLELOG) return ERROR(GENERIC); /* Unsupported size */
if (tableLog > FSE_MAX_TABLELOG) return ERROR(GENERIC); /* Unsupported size */
if (tableLog > FSE_MAX_TABLELOG) return ERROR(tableLog_tooLarge); /* Unsupported size */
if (tableLog < FSE_minTableLog(total, maxSymbolValue)) return ERROR(GENERIC); /* Too small tableLog, compression potentially impossible */
{
@ -823,38 +782,23 @@ size_t FSE_normalizeCount (short* normalizedCounter, unsigned tableLog,
short largestP=0;
U32 lowThreshold = (U32)(total >> tableLog);
for (s=0; s<=maxSymbolValue; s++)
{
if (count[s] == total) return 0;
if (count[s] == 0)
{
normalizedCounter[s]=0;
continue;
}
if (count[s] <= lowThreshold)
{
for (s=0; s<=maxSymbolValue; s++) {
if (count[s] == total) return 0; /* rle special case */
if (count[s] == 0) { normalizedCounter[s]=0; continue; }
if (count[s] <= lowThreshold) {
normalizedCounter[s] = -1;
stillToDistribute--;
}
else
{
} else {
short proba = (short)((count[s]*step) >> scale);
if (proba<8)
{
if (proba<8) {
U64 restToBeat = vStep * rtbTable[proba];
proba += (count[s]*step) - ((U64)proba<<scale) > restToBeat;
}
if (proba > largestP)
{
largestP=proba;
largest=s;
}
if (proba > largestP) largestP=proba, largest=s;
normalizedCounter[s] = proba;
stillToDistribute -= proba;
}
}
if (-stillToDistribute >= (normalizedCounter[largest] >> 1))
{
} }
if (-stillToDistribute >= (normalizedCounter[largest] >> 1)) {
/* corner case, need another normalization method */
size_t errorCode = FSE_normalizeM2(normalizedCounter, tableLog, count, total, maxSymbolValue);
if (FSE_isError(errorCode)) return errorCode;
@ -904,10 +848,12 @@ size_t FSE_buildCTable_raw (FSE_CTable* ct, unsigned nbBits)
tableU16[s] = (U16)(tableSize + s);
/* Build Symbol Transformation Table */
for (s=0; s<=maxSymbolValue; s++)
{
symbolTT[s].deltaNbBits = nbBits << 16;
symbolTT[s].deltaFindState = s-1;
const U32 deltaNbBits = (nbBits << 16) - (1 << nbBits);
for (s=0; s<=maxSymbolValue; s++) {
symbolTT[s].deltaNbBits = deltaNbBits;
symbolTT[s].deltaFindState = s-1;
}
}
return 0;
@ -930,10 +876,8 @@ size_t FSE_buildCTable_rle (FSE_CTable* ct, BYTE symbolValue)
tableU16[1] = 0; /* just in case */
/* Build Symbol Transformation Table */
{
symbolTT[symbolValue].deltaNbBits = 0;
symbolTT[symbolValue].deltaFindState = 0;
}
symbolTT[symbolValue].deltaNbBits = 0;
symbolTT[symbolValue].deltaFindState = 0;
return 0;
}
@ -963,15 +907,13 @@ static size_t FSE_compress_usingCTable_generic (void* dst, size_t dstSize,
#define FSE_FLUSHBITS(s) (fast ? BIT_flushBitsFast(s) : BIT_flushBits(s))
/* join to even */
if (srcSize & 1)
{
if (srcSize & 1) {
FSE_encodeSymbol(&bitC, &CState1, *--ip);
FSE_FLUSHBITS(&bitC);
}
/* join to mod 4 */
if ((sizeof(bitC.bitContainer)*8 > FSE_MAX_TABLELOG*4+7 ) && (srcSize & 2)) /* test bit 2 */
{
if ((sizeof(bitC.bitContainer)*8 > FSE_MAX_TABLELOG*4+7 ) && (srcSize & 2)) { /* test bit 2 */
FSE_encodeSymbol(&bitC, &CState2, *--ip);
FSE_encodeSymbol(&bitC, &CState1, *--ip);
FSE_FLUSHBITS(&bitC);
@ -987,8 +929,7 @@ static size_t FSE_compress_usingCTable_generic (void* dst, size_t dstSize,
FSE_encodeSymbol(&bitC, &CState1, *--ip);
if (sizeof(bitC.bitContainer)*8 > FSE_MAX_TABLELOG*4+7 ) /* this test must be static */
{
if (sizeof(bitC.bitContainer)*8 > FSE_MAX_TABLELOG*4+7 ) { /* this test must be static */
FSE_encodeSymbol(&bitC, &CState2, *--ip);
FSE_encodeSymbol(&bitC, &CState1, *--ip);
}
@ -1071,7 +1012,7 @@ size_t FSE_compress (void* dst, size_t dstSize, const void* src, size_t srcSize)
}
/*********************************************************
/*-*******************************************************
* Decompression (Byte symbols)
*********************************************************/
size_t FSE_buildDTable_rle (FSE_DTable* dt, BYTE symbolValue)
@ -1109,8 +1050,7 @@ size_t FSE_buildDTable_raw (FSE_DTable* dt, unsigned nbBits)
/* Build Decoding Table */
DTableH->tableLog = (U16)nbBits;
DTableH->fastMode = 1;
for (s=0; s<=maxSymbolValue; s++)
{
for (s=0; s<=maxSymbolValue; s++) {
dinfo[s].newState = 0;
dinfo[s].symbol = (BYTE)s;
dinfo[s].nbBits = (BYTE)nbBits;
@ -1144,8 +1084,7 @@ FORCE_INLINE size_t FSE_decompress_usingDTable_generic(
#define FSE_GETSYMBOL(statePtr) fast ? FSE_decodeSymbolFast(statePtr, &bitD) : FSE_decodeSymbol(statePtr, &bitD)
/* 4 symbols per loop */
for ( ; (BIT_reloadDStream(&bitD)==BIT_DStream_unfinished) && (op<olimit) ; op+=4)
{
for ( ; (BIT_reloadDStream(&bitD)==BIT_DStream_unfinished) && (op<olimit) ; op+=4) {
op[0] = FSE_GETSYMBOL(&state1);
if (FSE_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8) /* This test must be static */
@ -1166,8 +1105,7 @@ FORCE_INLINE size_t FSE_decompress_usingDTable_generic(
/* tail */
/* note : BIT_reloadDStream(&bitD) >= FSE_DStream_partiallyFilled; Ends at exactly BIT_DStream_completed */
while (1)
{
while (1) {
if ( (BIT_reloadDStream(&bitD)>BIT_DStream_completed) || (op==omax) || (BIT_endOfDStream(&bitD) && (fast || FSE_endOfDState(&state1))) )
break;

View File

@ -46,7 +46,7 @@ extern "C" {
#include <stddef.h> /* size_t, ptrdiff_t */
/* *****************************************
/*-****************************************
* FSE simple functions
******************************************/
size_t FSE_compress(void* dst, size_t maxDstSize,
@ -124,13 +124,13 @@ or to save and provide normalized distribution using external method.
/*!
FSE_count():
Provides the precise count of each symbol within a table 'count'
'count' is a table of unsigned int, of minimum size (maxSymbolValuePtr[0]+1).
maxSymbolValuePtr[0] will be updated if detected smaller than initially expected
return : the count of the most frequent symbol (which is not identified)
if return == srcSize, there is only one symbol.
if FSE_isError(return), it's an error code. */
size_t FSE_count(unsigned* count, unsigned* maxSymbolValuePtr, const unsigned char* src, size_t srcSize);
Provides the precise count of each byte within a table 'count'
'count' is a table of unsigned int, of minimum size (*maxSymbolValuePtr+1).
*maxSymbolValuePtr will be updated if detected smaller than initial value.
@return : the count of the most frequent symbol (which is not identified)
if return == srcSize, there is only one symbol.
Can also return an error code, which can be tested with FSE_isError() */
size_t FSE_count(unsigned* count, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize);
/*!
FSE_optimalTableLog():
@ -170,18 +170,18 @@ void FSE_freeCTable (FSE_CTable* ct);
/*!
FSE_buildCTable():
Builds 'ct', which must be already allocated, using FSE_createCTable()
Builds @ct, which must be already allocated, using FSE_createCTable()
return : 0
or an errorCode, which can be tested using FSE_isError() */
size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog);
/*!
FSE_compress_usingCTable():
Compress 'src' using 'ct' into 'dst' which must be already allocated
return : size of compressed data (<= maxDstSize)
or 0 if compressed data could not fit into 'dst'
Compress @src using @ct into @dst which must be already allocated
return : size of compressed data (<= @dstCapacity)
or 0 if compressed data could not fit into @dst
or an errorCode, which can be tested using FSE_isError() */
size_t FSE_compress_usingCTable (void* dst, size_t maxDstSize, const void* src, size_t srcSize, const FSE_CTable* ct);
size_t FSE_compress_usingCTable (void* dst, size_t dstCapacity, const void* src, size_t srcSize, const FSE_CTable* ct);
/*!
Tutorial :
@ -221,7 +221,7 @@ If there is an error, both functions will return an ErrorCode (which can be test
'CTable' can then be used to compress 'src', with FSE_compress_usingCTable().
Similar to FSE_count(), the convention is that 'src' is assumed to be a table of char of size 'srcSize'
The function returns the size of compressed data (without header), necessarily <= maxDstSize.
The function returns the size of compressed data (without header), necessarily <= @dstCapacity.
If it returns '0', compressed data could not fit into 'dst'.
If there is an error, the function will return an ErrorCode (which can be tested using FSE_isError()).
*/
@ -253,11 +253,11 @@ size_t FSE_buildDTable (FSE_DTable* dt, const short* normalizedCounter, unsigned
/*!
FSE_decompress_usingDTable():
Decompress compressed source 'cSrc' of size 'cSrcSize' using 'dt'
into 'dst' which must be already allocated.
return : size of regenerated data (necessarily <= maxDstSize)
Decompress compressed source @cSrc of size @cSrcSize using @dt
into @dst which must be already allocated.
return : size of regenerated data (necessarily <= @dstCapacity)
or an errorCode, which can be tested using FSE_isError() */
size_t FSE_decompress_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const FSE_DTable* dt);
size_t FSE_decompress_usingDTable(void* dst, size_t dstCapacity, const void* cSrc, size_t cSrcSize, const FSE_DTable* dt);
/*!
Tutorial :

View File

@ -63,8 +63,8 @@ extern "C" {
/* *****************************************
* FSE advanced API
*******************************************/
size_t FSE_countFast(unsigned* count, unsigned* maxSymbolValuePtr, const unsigned char* src, size_t srcSize);
/* same as FSE_count(), but blindly trust that all values within src are <= *maxSymbolValuePtr */
size_t FSE_countFast(unsigned* count, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize);
/* same as FSE_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr */
size_t FSE_buildCTable_raw (FSE_CTable* ct, unsigned nbBits);
/* build a fake FSE_CTable, designed to not compress an input, where each symbol uses nbBits */
@ -223,8 +223,7 @@ static unsigned char FSE_decodeSymbolFast(FSE_DState_t* DStatePtr, BIT_DStream_t
/* *****************************************
* Implementation of inlined functions
*******************************************/
typedef struct
{
typedef struct {
int deltaFindState;
U32 deltaNbBits;
} FSE_symbolCompressionTransform; /* total 8 bytes */
@ -240,6 +239,19 @@ MEM_STATIC void FSE_initCState(FSE_CState_t* statePtr, const FSE_CTable* ct)
statePtr->stateLog = tableLog;
}
MEM_STATIC void FSE_initCState2(FSE_CState_t* statePtr, const FSE_CTable* ct, U32 symbol)
{
FSE_initCState(statePtr, ct);
{
const FSE_symbolCompressionTransform symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol];
const U16* stateTable = (const U16*)(statePtr->stateTable);
U32 nbBitsOut = (U32)((symbolTT.deltaNbBits + (1<<15)) >> 16);
statePtr->value = (nbBitsOut << 16) - symbolTT.deltaNbBits;
statePtr->value = stateTable[(statePtr->value >> nbBitsOut) + symbolTT.deltaFindState];
}
}
MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, U32 symbol)
{
const FSE_symbolCompressionTransform symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol];
@ -278,6 +290,17 @@ MEM_STATIC void FSE_initDState(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD, con
DStatePtr->table = dt + 1;
}
MEM_STATIC size_t FSE_getStateValue(FSE_DState_t* DStatePtr)
{
return DStatePtr->state;
}
MEM_STATIC BYTE FSE_peakSymbol(FSE_DState_t* DStatePtr)
{
const FSE_decode_t DInfo = ((const FSE_decode_t*)(DStatePtr->table))[DStatePtr->state];
return DInfo.symbol;
}
MEM_STATIC BYTE FSE_decodeSymbol(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD)
{
const FSE_decode_t DInfo = ((const FSE_decode_t*)(DStatePtr->table))[DStatePtr->state];

File diff suppressed because it is too large Load Diff

View File

@ -1,7 +1,7 @@
/* ******************************************************************
Huff0 : Huffman coder, part of New Generation Entropy library
header file
Copyright (C) 2013-2015, Yann Collet.
Copyright (C) 2013-2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -30,7 +30,6 @@
You can contact the author at :
- Source repository : https://github.com/Cyan4973/FiniteStateEntropy
- Public forum : https://groups.google.com/forum/#!forum/lz4c
****************************************************************** */
#ifndef HUFF0_H
#define HUFF0_H
@ -66,8 +65,10 @@ HUF_compress():
HUF_decompress():
Decompress Huff0 data from buffer 'cSrc', of size 'cSrcSize',
into already allocated destination buffer 'dst', of size 'dstSize'.
'dstSize' must be the exact size of original (uncompressed) data.
Note : in contrast with FSE, HUF_decompress can regenerate RLE (cSrcSize==1) and uncompressed (cSrcSize==dstSize) data, because it knows size to regenerate.
@dstSize : must be the **exact** size of original (uncompressed) data.
Note : in contrast with FSE, HUF_decompress can regenerate
RLE (cSrcSize==1) and uncompressed (cSrcSize==dstSize) data,
because it knows size to regenerate.
@return : size of regenerated data (== dstSize)
or an error code, which can be tested using HUF_isError()
*/

View File

@ -1,7 +1,7 @@
/* ******************************************************************
Huff0 : Huffman coder, part of New Generation Entropy library
header file for static linking (only)
Copyright (C) 2013-2015, Yann Collet
Huff0 : Huffman codec, part of New Generation Entropy library
header file, for static linking only
Copyright (C) 2013-2016, Yann Collet
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -30,7 +30,6 @@
You can contact the author at :
- Source repository : https://github.com/Cyan4973/FiniteStateEntropy
- Public forum : https://groups.google.com/forum/#!forum/lz4c
****************************************************************** */
#ifndef HUFF0_STATIC_H
#define HUFF0_STATIC_H
@ -47,15 +46,21 @@ extern "C" {
/* ****************************************
* Static allocation macros
* Static allocation
******************************************/
/* Huff0 buffer bounds */
#define HUF_CTABLEBOUND 129
#define HUF_BLOCKBOUND(size) (size + (size>>8) + 8) /* only true if incompressible pre-filtered with fast heuristic */
#define HUF_COMPRESSBOUND(size) (HUF_CTABLEBOUND + HUF_BLOCKBOUND(size)) /* Macro version, useful for static allocation */
/* static allocation of Huff0's Compression Table */
#define HUF_CREATE_STATIC_CTABLE(name, maxSymbolValue) \
U32 name##hb[maxSymbolValue+1]; \
void* name##hv = &(name##hb); \
HUF_CElt* name = (HUF_CElt*)(name##hv) /* no final ; */
/* static allocation of Huff0's DTable */
#define HUF_DTABLE_SIZE(maxTableLog) (1 + (1<<maxTableLog)) /* nb Cells; use unsigned short for X2, unsigned int for X4 */
#define HUF_DTABLE_SIZE(maxTableLog) (1 + (1<<maxTableLog))
#define HUF_CREATE_STATIC_DTABLEX2(DTable, maxTableLog) \
unsigned short DTable[HUF_DTABLE_SIZE(maxTableLog)] = { maxTableLog }
#define HUF_CREATE_STATIC_DTABLEX4(DTable, maxTableLog) \
@ -86,13 +91,11 @@ The following API allows targeting specific sub-functions for advanced tasks.
For example, it's possible to compress several blocks using the same 'CTable',
or to save and regenerate 'CTable' using external methods.
*/
/* FSE_count() : find it within "fse.h" */
typedef struct HUF_CElt_s HUF_CElt; /* incomplete type */
size_t HUF_buildCTable (HUF_CElt* tree, const unsigned* count, unsigned maxSymbolValue, unsigned maxNbBits);
size_t HUF_writeCTable (void* dst, size_t maxDstSize, const HUF_CElt* tree, unsigned maxSymbolValue, unsigned huffLog);
size_t HUF_compress_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable);
size_t HUF_buildCTable (HUF_CElt* CTable, const unsigned* count, unsigned maxSymbolValue, unsigned maxNbBits);
size_t HUF_writeCTable (void* dst, size_t maxDstSize, const HUF_CElt* CTable, unsigned maxSymbolValue, unsigned huffLog);
size_t HUF_compress4X_into4Segments(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable);
/*!
@ -100,19 +103,33 @@ HUF_decompress() does the following:
1. select the decompression algorithm (X2, X4, X6) based on pre-computed heuristics
2. build Huffman table from save, using HUF_readDTableXn()
3. decode 1 or 4 segments in parallel using HUF_decompressSXn_usingDTable
*/
size_t HUF_readDTableX2 (unsigned short* DTable, const void* src, size_t srcSize);
size_t HUF_readDTableX4 (unsigned* DTable, const void* src, size_t srcSize);
size_t HUF_readDTableX6 (unsigned* DTable, const void* src, size_t srcSize);
size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned short* DTable);
size_t HUF_decompress4X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
size_t HUF_decompress4X6_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
/* single stream variants */
size_t HUF_compress1X (void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog);
size_t HUF_compress1X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable);
size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* single-symbol decoder */
size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* double-symbol decoder */
size_t HUF_decompress1X6 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* quad-symbol decoder */
size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned short* DTable);
size_t HUF_decompress4X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
size_t HUF_decompress4X6_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned short* DTable);
size_t HUF_decompress1X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
size_t HUF_decompress1X6_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
/* Loading a CTable saved with HUF_writeCTable() */
size_t HUF_readCTable (HUF_CElt* CTable, unsigned maxSymbolValue, const void* src, size_t srcSize);
#if defined (__cplusplus)

View File

@ -45,6 +45,7 @@ extern "C" {
#include "zstd_v01.h"
#include "zstd_v02.h"
#include "zstd_v03.h"
#include "zstd_v04.h"
MEM_STATIC unsigned ZSTD_isLegacy (U32 magicNumberLE)
{
@ -52,7 +53,8 @@ MEM_STATIC unsigned ZSTD_isLegacy (U32 magicNumberLE)
{
case ZSTDv01_magicNumberLE :
case ZSTDv02_magicNumber :
case ZSTDv03_magicNumber : return 1;
case ZSTDv03_magicNumber :
case ZSTDv04_magicNumber : return 1;
default : return 0;
}
}
@ -71,6 +73,8 @@ MEM_STATIC size_t ZSTD_decompressLegacy(
return ZSTDv02_decompress(dst, maxOriginalSize, src, compressedSize);
case ZSTDv03_magicNumber :
return ZSTDv03_decompress(dst, maxOriginalSize, src, compressedSize);
case ZSTDv04_magicNumber :
return ZSTDv04_decompress(dst, maxOriginalSize, src, compressedSize);
default :
return ERROR(prefix_unknown);
}

4431
lib/legacy/zstd_v04.c Normal file

File diff suppressed because it is too large Load Diff

148
lib/legacy/zstd_v04.h Normal file
View File

@ -0,0 +1,148 @@
/*
zstd_v04 - decoder for 0.4 format
Header File
Copyright (C) 2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
You can contact the author at :
- zstd source repository : https://github.com/Cyan4973/zstd
- ztsd public forum : https://groups.google.com/forum/#!forum/lz4c
*/
#pragma once
#if defined (__cplusplus)
extern "C" {
#endif
/* *************************************
* Includes
***************************************/
#include <stddef.h> /* size_t */
/* *************************************
* Simple one-step function
***************************************/
/**
ZSTDv04_decompress() : decompress ZSTD frames compliant with v0.4.x format
compressedSize : is the exact source size
maxOriginalSize : is the size of the 'dst' buffer, which must be already allocated.
It must be equal or larger than originalSize, otherwise decompression will fail.
return : the number of bytes decompressed into destination buffer (originalSize)
or an errorCode if it fails (which can be tested using ZSTDv01_isError())
*/
size_t ZSTDv04_decompress( void* dst, size_t maxOriginalSize,
const void* src, size_t compressedSize);
/**
ZSTDv04_isError() : tells if the result of ZSTDv04_decompress() is an error
*/
unsigned ZSTDv04_isError(size_t code);
/* *************************************
* Advanced functions
***************************************/
typedef struct ZSTDv04_Dctx_s ZSTDv04_Dctx;
ZSTDv04_Dctx* ZSTDv04_createDCtx(void);
size_t ZSTDv04_freeDCtx(ZSTDv04_Dctx* dctx);
size_t ZSTDv04_decompressDCtx(ZSTDv04_Dctx* dctx,
void* dst, size_t maxOriginalSize,
const void* src, size_t compressedSize);
/* *************************************
* Direct Streaming
***************************************/
size_t ZSTDv04_resetDCtx(ZSTDv04_Dctx* dctx);
size_t ZSTDv04_nextSrcSizeToDecompress(ZSTDv04_Dctx* dctx);
size_t ZSTDv04_decompressContinue(ZSTDv04_Dctx* dctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize);
/**
Use above functions alternatively.
ZSTD_nextSrcSizeToDecompress() tells how much bytes to provide as 'srcSize' to ZSTD_decompressContinue().
ZSTD_decompressContinue() will use previous data blocks to improve compression if they are located prior to current block.
Result is the number of bytes regenerated within 'dst'.
It can be zero, which is not an error; it just means ZSTD_decompressContinue() has decoded some header.
*/
/* *************************************
* Buffered Streaming
***************************************/
typedef struct ZBUFFv04_DCtx_s ZBUFFv04_DCtx;
ZBUFFv04_DCtx* ZBUFFv04_createDCtx(void);
size_t ZBUFFv04_freeDCtx(ZBUFFv04_DCtx* dctx);
size_t ZBUFFv04_decompressInit(ZBUFFv04_DCtx* dctx);
size_t ZBUFFv04_decompressWithDictionary(ZBUFFv04_DCtx* dctx, const void* dict, size_t dictSize);
size_t ZBUFFv04_decompressContinue(ZBUFFv04_DCtx* dctx, void* dst, size_t* maxDstSizePtr, const void* src, size_t* srcSizePtr);
/** ************************************************
* Streaming decompression
*
* A ZBUFF_DCtx object is required to track streaming operation.
* Use ZBUFF_createDCtx() and ZBUFF_freeDCtx() to create/release resources.
* Use ZBUFF_decompressInit() to start a new decompression operation.
* ZBUFF_DCtx objects can be reused multiple times.
*
* Optionally, a reference to a static dictionary can be set, using ZBUFF_decompressWithDictionary()
* It must be the same content as the one set during compression phase.
* Dictionary content must remain accessible during the decompression process.
*
* Use ZBUFF_decompressContinue() repetitively to consume your input.
* *srcSizePtr and *maxDstSizePtr can be any size.
* The function will report how many bytes were read or written by modifying *srcSizePtr and *maxDstSizePtr.
* Note that it may not consume the entire input, in which case it's up to the caller to present remaining input again.
* The content of dst will be overwritten (up to *maxDstSizePtr) at each function call, so save its content if it matters or change dst.
* @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to improve latency)
* or 0 when a frame is completely decoded
* or an error code, which can be tested using ZBUFF_isError().
*
* Hint : recommended buffer sizes (not compulsory) : ZBUFF_recommendedDInSize / ZBUFF_recommendedDOutSize
* output : ZBUFF_recommendedDOutSize==128 KB block size is the internal unit, it ensures it's always possible to write a full block when it's decoded.
* input : ZBUFF_recommendedDInSize==128Kb+3; just follow indications from ZBUFF_decompressContinue() to minimize latency. It should always be <= 128 KB + 3 .
* **************************************************/
unsigned ZBUFFv04_isError(size_t errorCode);
const char* ZBUFFv04_getErrorName(size_t errorCode);
/** The below functions provide recommended buffer sizes for Compression or Decompression operations.
* These sizes are not compulsory, they just tend to offer better latency */
size_t ZBUFFv04_recommendedDInSize(void);
size_t ZBUFFv04_recommendedDOutSize(void);
/* *************************************
* Prefix - version detection
***************************************/
#define ZSTDv04_magicNumber 0xFD2FB524 /* v0.4 */
#if defined (__cplusplus)
}
#endif

View File

@ -39,15 +39,15 @@
extern "C" {
#endif
/******************************************
* Includes
/*-****************************************
* Dependencies
******************************************/
#include <stddef.h> /* size_t, ptrdiff_t */
#include <string.h> /* memcpy */
/******************************************
* Compiler-specific
/*-****************************************
* Compiler specifics
******************************************/
#if defined(__GNUC__)
# define MEM_STATIC static __attribute__((unused))
@ -60,7 +60,7 @@ extern "C" {
#endif
/****************************************************************
/*-**************************************************************
* Basic Types
*****************************************************************/
#if defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
@ -83,10 +83,10 @@ extern "C" {
#endif
/****************************************************************
/*-**************************************************************
* Memory I/O
*****************************************************************/
/* MEM_FORCE_MEMORY_ACCESS
/*!MEM_FORCE_MEMORY_ACCESS
* By default, access to unaligned memory is controlled by `memcpy()`, which is safe and portable.
* Unfortunately, on some target/compiler combinations, the generated assembly is sub-optimal.
* The below switch allow to select different access method for improved performance.
@ -94,8 +94,8 @@ extern "C" {
* Method 1 : `__packed` statement. It depends on compiler extension (ie, not portable).
* This method is safe if your compiler supports it, and *generally* as fast or faster than `memcpy`.
* Method 2 : direct access. This method is portable but violate C standard.
* It can generate buggy code on targets generating assembly depending on alignment.
* But in some circumstances, it's the only known way to get the most performance (ie GCC + ARMv6)
* It can generate buggy code on targets depending on alignment.
* In some circumstances, it's the only known way to get the most performance (ie GCC + ARMv6)
* See http://fastcompression.blogspot.fr/2015/08/accessing-unaligned-memory.html for details.
* Prefer these methods in priority order (0 > 1 > 2)
*/
@ -185,8 +185,7 @@ MEM_STATIC U16 MEM_readLE16(const void* memPtr)
{
if (MEM_isLittleEndian())
return MEM_read16(memPtr);
else
{
else {
const BYTE* p = (const BYTE*)memPtr;
return (U16)(p[0] + (p[1]<<8));
}
@ -194,12 +193,9 @@ MEM_STATIC U16 MEM_readLE16(const void* memPtr)
MEM_STATIC void MEM_writeLE16(void* memPtr, U16 val)
{
if (MEM_isLittleEndian())
{
if (MEM_isLittleEndian()) {
MEM_write16(memPtr, val);
}
else
{
} else {
BYTE* p = (BYTE*)memPtr;
p[0] = (BYTE)val;
p[1] = (BYTE)(val>>8);
@ -210,8 +206,7 @@ MEM_STATIC U32 MEM_readLE32(const void* memPtr)
{
if (MEM_isLittleEndian())
return MEM_read32(memPtr);
else
{
else {
const BYTE* p = (const BYTE*)memPtr;
return (U32)((U32)p[0] + ((U32)p[1]<<8) + ((U32)p[2]<<16) + ((U32)p[3]<<24));
}
@ -219,12 +214,9 @@ MEM_STATIC U32 MEM_readLE32(const void* memPtr)
MEM_STATIC void MEM_writeLE32(void* memPtr, U32 val32)
{
if (MEM_isLittleEndian())
{
if (MEM_isLittleEndian()) {
MEM_write32(memPtr, val32);
}
else
{
} else {
BYTE* p = (BYTE*)memPtr;
p[0] = (BYTE)val32;
p[1] = (BYTE)(val32>>8);
@ -237,8 +229,7 @@ MEM_STATIC U64 MEM_readLE64(const void* memPtr)
{
if (MEM_isLittleEndian())
return MEM_read64(memPtr);
else
{
else {
const BYTE* p = (const BYTE*)memPtr;
return (U64)((U64)p[0] + ((U64)p[1]<<8) + ((U64)p[2]<<16) + ((U64)p[3]<<24)
+ ((U64)p[4]<<32) + ((U64)p[5]<<40) + ((U64)p[6]<<48) + ((U64)p[7]<<56));
@ -247,12 +238,9 @@ MEM_STATIC U64 MEM_readLE64(const void* memPtr)
MEM_STATIC void MEM_writeLE64(void* memPtr, U64 val64)
{
if (MEM_isLittleEndian())
{
if (MEM_isLittleEndian()) {
MEM_write64(memPtr, val64);
}
else
{
} else {
BYTE* p = (BYTE*)memPtr;
p[0] = (BYTE)val64;
p[1] = (BYTE)(val64>>8);

View File

@ -1,7 +1,7 @@
/*
zstd - standard compression library
Header File
Copyright (C) 2014-2015, Yann Collet.
Copyright (C) 2014-2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -28,7 +28,6 @@
You can contact the author at :
- zstd source repository : https://github.com/Cyan4973/zstd
- ztsd public forum : https://groups.google.com/forum/#!forum/lz4c
*/
#ifndef ZSTD_H
#define ZSTD_H
@ -37,13 +36,13 @@
extern "C" {
#endif
/* *************************************
* Includes
/*-*************************************
* Dependencies
***************************************/
#include <stddef.h> /* size_t */
/* ***************************************************************
/*-***************************************************************
* Export parameters
*****************************************************************/
/*!
@ -61,8 +60,8 @@ extern "C" {
* Version
***************************************/
#define ZSTD_VERSION_MAJOR 0 /* for breaking interface changes */
#define ZSTD_VERSION_MINOR 4 /* for new (non-breaking) interface capabilities */
#define ZSTD_VERSION_RELEASE 7 /* for tweaks, bug-fixes, or development */
#define ZSTD_VERSION_MINOR 5 /* for new (non-breaking) interface capabilities */
#define ZSTD_VERSION_RELEASE 0 /* for tweaks, bug-fixes, or development */
#define ZSTD_VERSION_NUMBER (ZSTD_VERSION_MAJOR *100*100 + ZSTD_VERSION_MINOR *100 + ZSTD_VERSION_RELEASE)
ZSTDLIB_API unsigned ZSTD_versionNumber (void);
@ -70,60 +69,77 @@ ZSTDLIB_API unsigned ZSTD_versionNumber (void);
/* *************************************
* Simple functions
***************************************/
ZSTDLIB_API size_t ZSTD_compress( void* dst, size_t maxDstSize,
/*! ZSTD_compress() :
Compresses `srcSize` bytes from buffer `src` into buffer `dst` of size `dstCapacity`.
Destination buffer must be already allocated.
Compression runs faster if `dstCapacity` >= `ZSTD_compressBound(srcSize)`.
@return : the number of bytes written into `dst`,
or an error code if it fails (which can be tested using ZSTD_isError()) */
ZSTDLIB_API size_t ZSTD_compress( void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
int compressionLevel);
ZSTDLIB_API size_t ZSTD_decompress( void* dst, size_t maxOriginalSize,
/*! ZSTD_decompress() :
`compressedSize` : is the _exact_ size of the compressed blob, otherwise decompression will fail.
`dstCapacity` must be large enough, equal or larger than originalSize.
@return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
or an errorCode if it fails (which can be tested using ZSTD_isError()) */
ZSTDLIB_API size_t ZSTD_decompress( void* dst, size_t dstCapacity,
const void* src, size_t compressedSize);
/**
ZSTD_compress() :
Compresses 'srcSize' bytes from buffer 'src' into buffer 'dst', of maximum size 'dstSize'.
Destination buffer must be already allocated.
Compression runs faster if maxDstSize >= ZSTD_compressBound(srcSize).
return : the number of bytes written into buffer 'dst'
or an error code if it fails (which can be tested using ZSTD_isError())
ZSTD_decompress() :
compressedSize : is the exact source size
maxOriginalSize : is the size of the 'dst' buffer, which must be already allocated.
It must be equal or larger than originalSize, otherwise decompression will fail.
return : the number of bytes decompressed into destination buffer (<= maxOriginalSize)
or an errorCode if it fails (which can be tested using ZSTD_isError())
*/
/* *************************************
* Tool functions
* Helper functions
***************************************/
ZSTDLIB_API size_t ZSTD_compressBound(size_t srcSize); /** maximum compressed size (worst case scenario) */
ZSTDLIB_API size_t ZSTD_compressBound(size_t srcSize); /*!< maximum compressed size (worst case scenario) */
/* Error Management */
ZSTDLIB_API unsigned ZSTD_isError(size_t code); /** tells if a return value is an error code */
ZSTDLIB_API const char* ZSTD_getErrorName(size_t code); /** provides error code string */
ZSTDLIB_API unsigned ZSTD_isError(size_t code); /*!< tells if a `size_t` function result is an error code */
ZSTDLIB_API const char* ZSTD_getErrorName(size_t code); /*!< provides readable string for an error code */
/* *************************************
* Advanced functions
* Explicit memory management
***************************************/
/** Compression context management */
typedef struct ZSTD_CCtx_s ZSTD_CCtx; /* incomplete type */
/** Compression context */
typedef struct ZSTD_CCtx_s ZSTD_CCtx; /*< incomplete type */
ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx(void);
ZSTDLIB_API size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx);
ZSTDLIB_API size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx); /*!< @return : errorCode */
/** ZSTD_compressCCtx() :
Same as ZSTD_compress(), but requires an already allocated ZSTD_CCtx (see ZSTD_createCCtx()) */
ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize, int compressionLevel);
ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel);
/** Decompression context management */
/** Decompression context */
typedef struct ZSTD_DCtx_s ZSTD_DCtx;
ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx(void);
ZSTDLIB_API size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
ZSTDLIB_API size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); /*!< @return : errorCode */
/** ZSTD_decompressDCtx
/** ZSTD_decompressDCtx() :
* Same as ZSTD_decompress(), but requires an already allocated ZSTD_DCtx (see ZSTD_createDCtx()) */
ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
/*-***********************
* Dictionary API
*************************/
/*! ZSTD_compress_usingDict() :
* Compression using a pre-defined Dictionary content (see dictBuilder).
* Note : dict can be NULL, in which case, it's equivalent to ZSTD_compressCCtx() */
ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const void* dict,size_t dictSize,
int compressionLevel);
/*! ZSTD_decompress_usingDict() :
* Decompression using a pre-defined Dictionary content (see dictBuilder).
* Dictionary must be identical to the one used during compression, otherwise regenerated data will be corrupted.
* Note : dict can be NULL, in which case, it's equivalent to ZSTD_decompressDCtx() */
ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const void* dict,size_t dictSize);
#if defined (__cplusplus)

View File

@ -1,6 +1,6 @@
/*
Buffered version of Zstd compression library
Copyright (C) 2015, Yann Collet.
Copyright (C) 2015-2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -36,7 +36,7 @@
*/
/* *************************************
* Includes
* Dependencies
***************************************/
#include <stdlib.h>
#include "error_private.h"
@ -44,6 +44,12 @@
#include "zstd_buffered_static.h"
/* *************************************
* Constants
***************************************/
static size_t ZBUFF_blockHeaderSize = 3;
static size_t ZBUFF_endFrameSize = 3;
/** ************************************************
* Streaming compression
*
@ -119,7 +125,7 @@ size_t ZBUFF_freeCCtx(ZBUFF_CCtx* zbc)
#define MIN(a,b) ( ((a)<(b)) ? (a) : (b) )
#define BLOCKSIZE (128 * 1024) /* a bit too "magic", should come from reference */
size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc, ZSTD_parameters params)
size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc, const void* dict, size_t dictSize, ZSTD_parameters params)
{
size_t neededInBuffSize;
@ -127,23 +133,21 @@ size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc, ZSTD_parameters params)
neededInBuffSize = (size_t)1 << params.windowLog;
/* allocate buffers */
if (zbc->inBuffSize < neededInBuffSize)
{
if (zbc->inBuffSize < neededInBuffSize) {
zbc->inBuffSize = neededInBuffSize;
free(zbc->inBuff); /* should not be necessary */
zbc->inBuff = (char*)malloc(neededInBuffSize);
if (zbc->inBuff == NULL) return ERROR(memory_allocation);
}
zbc->blockSize = MIN(BLOCKSIZE, zbc->inBuffSize);
if (zbc->outBuffSize < ZSTD_compressBound(zbc->blockSize)+1)
{
if (zbc->outBuffSize < ZSTD_compressBound(zbc->blockSize)+1) {
zbc->outBuffSize = ZSTD_compressBound(zbc->blockSize)+1;
free(zbc->outBuff); /* should not be necessary */
zbc->outBuff = (char*)malloc(zbc->outBuffSize);
if (zbc->outBuff == NULL) return ERROR(memory_allocation);
}
zbc->outBuffContentSize = ZSTD_compressBegin_advanced(zbc->zc, params);
zbc->outBuffContentSize = ZSTD_compressBegin_advanced(zbc->zc, dict, dictSize, params);
if (ZSTD_isError(zbc->outBuffContentSize)) return zbc->outBuffContentSize;
zbc->inToCompress = 0;
@ -156,14 +160,13 @@ size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc, ZSTD_parameters params)
size_t ZBUFF_compressInit(ZBUFF_CCtx* zbc, int compressionLevel)
{
return ZBUFF_compressInit_advanced(zbc, ZSTD_getParams(compressionLevel, 0));
return ZBUFF_compressInit_advanced(zbc, NULL, 0, ZSTD_getParams(compressionLevel, 0));
}
ZSTDLIB_API size_t ZBUFF_compressWithDictionary(ZBUFF_CCtx* zbc, const void* src, size_t srcSize)
ZSTDLIB_API size_t ZBUFF_compressInitDictionary(ZBUFF_CCtx* zbc, const void* dict, size_t dictSize, int compressionLevel)
{
ZSTD_compress_insertDictionary(zbc->zc, src, srcSize);
return 0;
return ZBUFF_compressInit_advanced(zbc, dict, dictSize, ZSTD_getParams(compressionLevel, 0));
}
@ -189,8 +192,7 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
char* op = ostart;
char* const oend = ostart + *maxDstSizePtr;
while (notDone)
{
while (notDone) {
switch(zbc->stage)
{
case ZBUFFcs_init: return ERROR(init_missing); /* call ZBUFF_compressInit() first ! */
@ -202,9 +204,9 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
size_t loaded = ZBUFF_limitCopy(zbc->inBuff + zbc->inBuffPos, toLoad, ip, iend-ip);
zbc->inBuffPos += loaded;
ip += loaded;
if ( (zbc->inBuffPos==zbc->inToCompress) || (!flush && (toLoad != loaded)) )
{ notDone = 0; break; } /* not enough input to get a full block : stop there, wait for more */
}
if ( (zbc->inBuffPos==zbc->inToCompress) || (!flush && (toLoad != loaded)) ) {
notDone = 0; break; /* not enough input to get a full block : stop there, wait for more */
} }
/* compress current block (note : this stage cannot be stopped in the middle) */
{
void* cDst;
@ -236,8 +238,7 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
size_t flushed = ZBUFF_limitCopy(op, oend-op, zbc->outBuff + zbc->outBuffFlushedSize, toFlush);
op += flushed;
zbc->outBuffFlushedSize += flushed;
if (toFlush!=flushed)
{ notDone = 0; break; } /* not enough space within dst to store compressed block : stop there */
if (toFlush!=flushed) { notDone = 0; break; } /* not enough space within dst to store compressed block : stop there */
zbc->outBuffContentSize = 0;
zbc->outBuffFlushedSize = 0;
zbc->stage = ZBUFFcs_load;
@ -260,7 +261,9 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
size_t ZBUFF_compressContinue(ZBUFF_CCtx* zbc,
void* dst, size_t* maxDstSizePtr,
const void* src, size_t* srcSizePtr)
{ return ZBUFF_compressContinue_generic(zbc, dst, maxDstSizePtr, src, srcSizePtr, 0); }
{
return ZBUFF_compressContinue_generic(zbc, dst, maxDstSizePtr, src, srcSizePtr, 0);
}
@ -335,8 +338,6 @@ struct ZBUFF_DCtx_s {
size_t outStart;
size_t outEnd;
size_t hPos;
const char* dict;
size_t dictSize;
ZBUFF_dStage stage;
unsigned char headerBuffer[ZSTD_frameHeaderSize_max];
}; /* typedef'd to ZBUFF_DCtx within "zstd_buffered.h" */
@ -365,19 +366,16 @@ size_t ZBUFF_freeDCtx(ZBUFF_DCtx* zbc)
/* *** Initialization *** */
size_t ZBUFF_decompressInit(ZBUFF_DCtx* zbc)
size_t ZBUFF_decompressInitDictionary(ZBUFF_DCtx* zbc, const void* dict, size_t dictSize)
{
zbc->stage = ZBUFFds_readHeader;
zbc->hPos = zbc->inPos = zbc->outStart = zbc->outEnd = zbc->dictSize = 0;
return ZSTD_resetDCtx(zbc->zc);
zbc->hPos = zbc->inPos = zbc->outStart = zbc->outEnd = 0;
return ZSTD_decompressBegin_usingDict(zbc->zc, dict, dictSize);
}
size_t ZBUFF_decompressWithDictionary(ZBUFF_DCtx* zbc, const void* src, size_t srcSize)
size_t ZBUFF_decompressInit(ZBUFF_DCtx* zbc)
{
zbc->dict = (const char*)src;
zbc->dictSize = srcSize;
return 0;
return ZBUFF_decompressInitDictionary(zbc, NULL, 0);
}
@ -393,11 +391,9 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
char* const oend = ostart + *maxDstSizePtr;
U32 notDone = 1;
while (notDone)
{
while (notDone) {
switch(zbc->stage)
{
case ZBUFFds_init :
return ERROR(init_missing);
@ -406,8 +402,7 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
{
size_t headerSize = ZSTD_getFrameParams(&(zbc->params), src, *srcSizePtr);
if (ZSTD_isError(headerSize)) return headerSize;
if (headerSize)
{
if (headerSize) {
/* not enough input to decode header : tell how many bytes would be necessary */
memcpy(zbc->headerBuffer+zbc->hPos, src, *srcSizePtr);
zbc->hPos += *srcSizePtr;
@ -429,8 +424,7 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
ip += headerSize;
headerSize = ZSTD_getFrameParams(&(zbc->params), zbc->headerBuffer, zbc->hPos);
if (ZSTD_isError(headerSize)) return headerSize;
if (headerSize)
{
if (headerSize) {
/* not enough input to decode header : tell how many bytes would be necessary */
*maxDstSizePtr = 0;
return headerSize - zbc->hPos;
@ -443,25 +437,19 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
{
size_t neededOutSize = (size_t)1 << zbc->params.windowLog;
size_t neededInSize = BLOCKSIZE; /* a block is never > BLOCKSIZE */
if (zbc->inBuffSize < neededInSize)
{
if (zbc->inBuffSize < neededInSize) {
free(zbc->inBuff);
zbc->inBuffSize = neededInSize;
zbc->inBuff = (char*)malloc(neededInSize);
if (zbc->inBuff == NULL) return ERROR(memory_allocation);
}
if (zbc->outBuffSize < neededOutSize)
{
if (zbc->outBuffSize < neededOutSize) {
free(zbc->outBuff);
zbc->outBuffSize = neededOutSize;
zbc->outBuff = (char*)malloc(neededOutSize);
if (zbc->outBuff == NULL) return ERROR(memory_allocation);
}
}
if (zbc->dictSize)
ZSTD_decompress_insertDictionary(zbc->zc, zbc->dict, zbc->dictSize);
if (zbc->hPos)
{
} }
if (zbc->hPos) {
/* some data already loaded into headerBuffer : transfer into inBuff */
memcpy(zbc->inBuff, zbc->headerBuffer, zbc->hPos);
zbc->inPos = zbc->hPos;
@ -474,14 +462,12 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
case ZBUFFds_read:
{
size_t neededInSize = ZSTD_nextSrcSizeToDecompress(zbc->zc);
if (neededInSize==0) /* end of frame */
{
if (neededInSize==0) { /* end of frame */
zbc->stage = ZBUFFds_init;
notDone = 0;
break;
}
if ((size_t)(iend-ip) >= neededInSize)
{
if ((size_t)(iend-ip) >= neededInSize) {
/* directly decode from src */
size_t decodedSize = ZSTD_decompressContinue(zbc->zc,
zbc->outBuff + zbc->outStart, zbc->outBuffSize - zbc->outStart,
@ -517,16 +503,14 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
zbc->outEnd = zbc->outStart + decodedSize;
zbc->stage = ZBUFFds_flush;
// break; /* ZBUFFds_flush follows */
}
}
} }
case ZBUFFds_flush:
{
size_t toFlushSize = zbc->outEnd - zbc->outStart;
size_t flushedSize = ZBUFF_limitCopy(op, oend-op, zbc->outBuff + zbc->outStart, toFlushSize);
op += flushedSize;
zbc->outStart += flushedSize;
if (flushedSize == toFlushSize)
{
if (flushedSize == toFlushSize) {
zbc->stage = ZBUFFds_read;
if (zbc->outStart + BLOCKSIZE > zbc->outBuffSize)
zbc->outStart = zbc->outEnd = 0;
@ -537,15 +521,14 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
break;
}
default: return ERROR(GENERIC); /* impossible */
}
}
} }
*srcSizePtr = ip-istart;
*maxDstSizePtr = op-ostart;
{
size_t nextSrcSizeHint = ZSTD_nextSrcSizeToDecompress(zbc->zc);
if (nextSrcSizeHint > 3) nextSrcSizeHint+= 3; /* get the next block header while at it */
if (nextSrcSizeHint > ZBUFF_blockHeaderSize) nextSrcSizeHint+= ZBUFF_blockHeaderSize; /* get next block header too */
nextSrcSizeHint -= zbc->inPos; /* already loaded*/
return nextSrcSizeHint;
}
@ -553,18 +536,13 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDstSizePt
/* *************************************
* Tool functions
***************************************/
unsigned ZBUFF_isError(size_t errorCode) { return ERR_isError(errorCode); }
const char* ZBUFF_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
size_t ZBUFF_recommendedCInSize() { return BLOCKSIZE; }
size_t ZBUFF_recommendedCOutSize() { return ZSTD_compressBound(BLOCKSIZE) + 6; }
size_t ZBUFF_recommendedDInSize() { return BLOCKSIZE + 3; }
size_t ZBUFF_recommendedDOutSize() { return BLOCKSIZE; }
size_t ZBUFF_recommendedCInSize(void) { return BLOCKSIZE; }
size_t ZBUFF_recommendedCOutSize(void) { return ZSTD_compressBound(BLOCKSIZE) + ZBUFF_blockHeaderSize + ZBUFF_endFrameSize; }
size_t ZBUFF_recommendedDInSize(void) { return BLOCKSIZE + ZBUFF_blockHeaderSize /* block header size*/ ; }
size_t ZBUFF_recommendedDOutSize(void) { return BLOCKSIZE; }

View File

@ -48,7 +48,7 @@ extern "C" {
/* ***************************************************************
* Tuning parameters
* Compiler specifics
*****************************************************************/
/*!
* ZSTD_DLL_EXPORT :
@ -69,48 +69,50 @@ ZSTDLIB_API ZBUFF_CCtx* ZBUFF_createCCtx(void);
ZSTDLIB_API size_t ZBUFF_freeCCtx(ZBUFF_CCtx* cctx);
ZSTDLIB_API size_t ZBUFF_compressInit(ZBUFF_CCtx* cctx, int compressionLevel);
ZSTDLIB_API size_t ZBUFF_compressWithDictionary(ZBUFF_CCtx* cctx, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZBUFF_compressContinue(ZBUFF_CCtx* cctx, void* dst, size_t* maxDstSizePtr, const void* src, size_t* srcSizePtr);
ZSTDLIB_API size_t ZBUFF_compressFlush(ZBUFF_CCtx* cctx, void* dst, size_t* maxDstSizePtr);
ZSTDLIB_API size_t ZBUFF_compressEnd(ZBUFF_CCtx* cctx, void* dst, size_t* maxDstSizePtr);
ZSTDLIB_API size_t ZBUFF_compressInitDictionary(ZBUFF_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel);
ZSTDLIB_API size_t ZBUFF_compressContinue(ZBUFF_CCtx* cctx, void* dst, size_t* dstCapacityPtr, const void* src, size_t* srcSizePtr);
ZSTDLIB_API size_t ZBUFF_compressFlush(ZBUFF_CCtx* cctx, void* dst, size_t* dstCapacityPtr);
ZSTDLIB_API size_t ZBUFF_compressEnd(ZBUFF_CCtx* cctx, void* dst, size_t* dstCapacityPtr);
/** ************************************************
* Streaming compression
*
* A ZBUFF_CCtx object is required to track streaming operation.
* Use ZBUFF_createCCtx() and ZBUFF_freeCCtx() to create/release resources.
* Use ZBUFF_compressInit() to start a new compression operation.
* ZBUFF_CCtx objects can be reused multiple times.
*
* Optionally, a reference to a static dictionary can be created with ZBUFF_compressWithDictionary()
* Note that the dictionary content must remain accessible during the compression process.
* Start by initializing ZBUF_CCtx.
* Use ZBUFF_compressInit() to start a new compression operation.
* Use ZBUFF_compressInitDictionary() for a compression which requires a dictionary.
*
* Use ZBUFF_compressContinue() repetitively to consume input stream.
* *srcSizePtr and *maxDstSizePtr can be any size.
* The function will report how many bytes were read or written within *srcSizePtr and *maxDstSizePtr.
* *srcSizePtr and *dstCapacityPtr can be any size.
* The function will report how many bytes were read or written within *srcSizePtr and *dstCapacityPtr.
* Note that it may not consume the entire input, in which case it's up to the caller to present again remaining data.
* The content of dst will be overwritten (up to *maxDstSizePtr) at each function call, so save its content if it matters or move dst .
* @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to improve latency)
* The content of @dst will be overwritten (up to *dstCapacityPtr) at each call, so save its content if it matters or change @dst .
* @return : a hint to preferred nb of bytes to use as input for next function call (it's just a hint, to improve latency)
* or an error code, which can be tested using ZBUFF_isError().
*
* ZBUFF_compressFlush() can be used to instruct ZBUFF to compress and output whatever remains within its buffer.
* Note that it will not output more than *maxDstSizePtr.
* Therefore, some content might still be left into its internal buffer if dst buffer is too small.
* At any moment, it's possible to flush whatever data remains within buffer, using ZBUFF_compressFlush().
* The nb of bytes written into @dst will be reported into *dstCapacityPtr.
* Note that the function cannot output more than *dstCapacityPtr,
* therefore, some content might still be left into internal buffer if *dstCapacityPtr is too small.
* @return : nb of bytes still present into internal buffer (0 if it's empty)
* or an error code, which can be tested using ZBUFF_isError().
*
* ZBUFF_compressEnd() instructs to finish a frame.
* It will perform a flush and write frame epilogue.
* Note that the epilogue is necessary for decoders to consider a frame completed.
* Similar to ZBUFF_compressFlush(), it may not be able to output the entire internal buffer content if *maxDstSizePtr is too small.
* The epilogue is required for decoders to consider a frame completed.
* Similar to ZBUFF_compressFlush(), it may not be able to output the entire internal buffer content if *dstCapacityPtr is too small.
* In which case, call again ZBUFF_compressFlush() to complete the flush.
* @return : nb of bytes still present into internal buffer (0 if it's empty)
* or an error code, which can be tested using ZBUFF_isError().
*
* Hint : recommended buffer sizes (not compulsory) : ZBUFF_recommendedCInSize / ZBUFF_recommendedCOutSize
* input : ZBUFF_recommendedCInSize==128 KB block size is the internal unit, it improves latency to use this value.
* input : ZBUFF_recommendedCInSize==128 KB block size is the internal unit, it improves latency to use this value (skipped buffering).
* output : ZBUFF_recommendedCOutSize==ZSTD_compressBound(128 KB) + 3 + 3 : ensures it's always possible to write/flush/end a full block. Skip some buffering.
* By using both, you ensure that input will be entirely consumed, and output will always contain the result.
* By using both, it ensures that input will be entirely consumed, and output will always contain the result, reducing intermediate buffering.
* **************************************************/
@ -119,28 +121,25 @@ ZSTDLIB_API ZBUFF_DCtx* ZBUFF_createDCtx(void);
ZSTDLIB_API size_t ZBUFF_freeDCtx(ZBUFF_DCtx* dctx);
ZSTDLIB_API size_t ZBUFF_decompressInit(ZBUFF_DCtx* dctx);
ZSTDLIB_API size_t ZBUFF_decompressWithDictionary(ZBUFF_DCtx* dctx, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZBUFF_decompressInitDictionary(ZBUFF_DCtx* dctx, const void* dict, size_t dictSize);
ZSTDLIB_API size_t ZBUFF_decompressContinue(ZBUFF_DCtx* dctx, void* dst, size_t* maxDstSizePtr, const void* src, size_t* srcSizePtr);
ZSTDLIB_API size_t ZBUFF_decompressContinue(ZBUFF_DCtx* dctx, void* dst, size_t* dstCapacityPtr, const void* src, size_t* srcSizePtr);
/** ************************************************
* Streaming decompression
*
* A ZBUFF_DCtx object is required to track streaming operation.
* Use ZBUFF_createDCtx() and ZBUFF_freeDCtx() to create/release resources.
* Use ZBUFF_decompressInit() to start a new decompression operation.
* ZBUFF_DCtx objects can be reused multiple times.
*
* Optionally, a reference to a static dictionary can be set, using ZBUFF_decompressWithDictionary()
* It must be the same content as the one set during compression phase.
* Dictionary content must remain accessible during the decompression process.
* Use ZBUFF_decompressInit() to start a new decompression operation,
* or ZBUFF_decompressInitDictionary() if decompression requires a dictionary.
* Note that ZBUFF_DCtx objects can be reused multiple times.
*
* Use ZBUFF_decompressContinue() repetitively to consume your input.
* *srcSizePtr and *maxDstSizePtr can be any size.
* The function will report how many bytes were read or written by modifying *srcSizePtr and *maxDstSizePtr.
* *srcSizePtr and *dstCapacityPtr can be any size.
* The function will report how many bytes were read or written by modifying *srcSizePtr and *dstCapacityPtr.
* Note that it may not consume the entire input, in which case it's up to the caller to present remaining input again.
* The content of dst will be overwritten (up to *maxDstSizePtr) at each function call, so save its content if it matters or change dst.
* @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to improve latency)
* The content of @dst will be overwritten (up to *dstCapacityPtr) at each function call, so save its content if it matters or change @dst.
* @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to help latency)
* or 0 when a frame is completely decoded
* or an error code, which can be tested using ZBUFF_isError().
*
@ -157,7 +156,7 @@ ZSTDLIB_API unsigned ZBUFF_isError(size_t errorCode);
ZSTDLIB_API const char* ZBUFF_getErrorName(size_t errorCode);
/** The below functions provide recommended buffer sizes for Compression or Decompression operations.
* These sizes are not compulsory, they just tend to offer better latency */
* These sizes are just hints, and tend to offer better latency */
ZSTDLIB_API size_t ZBUFF_recommendedCInSize(void);
ZSTDLIB_API size_t ZBUFF_recommendedCOutSize(void);
ZSTDLIB_API size_t ZBUFF_recommendedDInSize(void);

View File

@ -45,14 +45,14 @@ extern "C" {
/* *************************************
* Includes
***************************************/
#include "zstd_static.h"
#include "zstd_static.h" /* ZSTD_parameters */
#include "zstd_buffered.h"
/* *************************************
* Advanced Streaming functions
***************************************/
ZSTDLIB_API size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* cctx, ZSTD_parameters params);
ZSTDLIB_API size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_parameters params);
#if defined (__cplusplus)

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -1,7 +1,7 @@
/*
zstd_internal - common functions to include
Header File for include
Copyright (C) 2014-2015, Yann Collet.
Copyright (C) 2014-2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -28,7 +28,6 @@
You can contact the author at :
- zstd source repository : https://github.com/Cyan4973/zstd
- ztsd public forum : https://groups.google.com/forum/#!forum/lz4c
*/
#ifndef ZSTD_CCOMMON_H_MODULE
#define ZSTD_CCOMMON_H_MODULE
@ -38,10 +37,11 @@ extern "C" {
#endif
/* *************************************
* Includes
* Dependencies
***************************************/
#include "mem.h"
#include "error_private.h"
#include "zstd_static.h"
/* *************************************
@ -54,7 +54,8 @@ extern "C" {
/* *************************************
* Common constants
***************************************/
#define ZSTD_MAGICNUMBER 0xFD2FB524 /* v0.4 */
#define ZSTD_MAGICNUMBER 0xFD2FB525 /* v0.5 */
#define ZSTD_DICT_MAGIC 0xEC30A435
#define KB *(1 <<10)
#define MB *(1 <<20)
@ -73,11 +74,13 @@ static const size_t ZSTD_frameHeaderSize_min = 5;
#define BIT1 2
#define BIT0 1
#define IS_RAW BIT0
#define IS_RLE BIT1
#define IS_HUF 0
#define IS_PCH 1
#define IS_RAW 2
#define IS_RLE 3
#define MINMATCH 4
#define REPCODE_STARTVALUE 4
#define REPCODE_STARTVALUE 1
#define MLbits 7
#define LLbits 6
@ -90,21 +93,32 @@ static const size_t ZSTD_frameHeaderSize_min = 5;
#define OffFSELog 9
#define MaxSeq MAX(MaxLL, MaxML)
#define MIN_SEQUENCES_SIZE (2 /*seqNb*/ + 2 /*dumps*/ + 3 /*seqTables*/ + 1 /*bitStream*/)
#define MIN_CBLOCK_SIZE (3 /*litCSize*/ + MIN_SEQUENCES_SIZE)
#define FSE_ENCODING_RAW 0
#define FSE_ENCODING_RLE 1
#define FSE_ENCODING_STATIC 2
#define FSE_ENCODING_DYNAMIC 3
#define HufLog 12
#define MIN_SEQUENCES_SIZE 1 /* nbSeq==0 */
#define MIN_CBLOCK_SIZE (1 /*litCSize*/ + 1 /* RLE or RAW */ + MIN_SEQUENCES_SIZE /* nbSeq==0 */) /* for a non-null block */
#define WILDCOPY_OVERLENGTH 8
typedef enum { bt_compressed, bt_raw, bt_rle, bt_end } blockType_t;
/* ******************************************
/*-*******************************************
* Shared functions to include for inlining
********************************************/
*********************************************/
static void ZSTD_copy8(void* dst, const void* src) { memcpy(dst, src, 8); }
#define COPY8(d,s) { ZSTD_copy8(d,s); d+=8; s+=8; }
/*! ZSTD_wildcopy : custom version of memcpy(), can copy up to 7-8 bytes too many */
static void ZSTD_wildcopy(void* dst, const void* src, size_t length)
/*! ZSTD_wildcopy() :
* custom version of memcpy(), can copy up to 7 bytes too many (8 bytes if length==0) */
MEM_STATIC void ZSTD_wildcopy(void* dst, const void* src, size_t length)
{
const BYTE* ip = (const BYTE*)src;
BYTE* op = (BYTE*)dst;

View File

@ -1,7 +1,7 @@
/*
zstd - standard compression library
Header File for static linking only
Copyright (C) 2014-2015, Yann Collet.
Copyright (C) 2014-2016, Yann Collet.
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@ -28,7 +28,6 @@
You can contact the author at :
- zstd source repository : https://github.com/Cyan4973/zstd
- ztsd public forum : https://groups.google.com/forum/#!forum/lz4c
*/
#ifndef ZSTD_STATIC_H
#define ZSTD_STATIC_H
@ -42,14 +41,14 @@
extern "C" {
#endif
/* *************************************
* Includes
/*-*************************************
* Dependencies
***************************************/
#include "zstd.h"
#include "mem.h"
/* *************************************
/*-*************************************
* Types
***************************************/
#define ZSTD_WINDOWLOG_MAX 26
@ -82,72 +81,71 @@ typedef struct
/* *************************************
* Advanced functions
***************************************/
/** ZSTD_getParams
* return ZSTD_parameters structure for a selected compression level and srcSize.
* srcSizeHint value is optional, select 0 if not known */
#define ZSTD_MAX_CLEVEL 20
ZSTDLIB_API unsigned ZSTD_maxCLevel (void);
/*! ZSTD_getParams() :
* @return ZSTD_parameters structure for a selected compression level and srcSize.
* `srcSizeHint` value is optional, select 0 if not known */
ZSTDLIB_API ZSTD_parameters ZSTD_getParams(int compressionLevel, U64 srcSizeHint);
/** ZSTD_validateParams
/*! ZSTD_validateParams() :
* correct params value to remain within authorized range */
ZSTDLIB_API void ZSTD_validateParams(ZSTD_parameters* params);
/** ZSTD_compress_usingDict
* Same as ZSTD_compressCCtx(), using a Dictionary content as prefix
* Note : dict can be NULL, in which case, it's equivalent to ZSTD_compressCCtx() */
ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
void* dst, size_t maxDstSize,
const void* src, size_t srcSize,
const void* dict,size_t dictSize,
int compressionLevel);
/** ZSTD_compress_advanced
/*! ZSTD_compress_advanced() :
* Same as ZSTD_compress_usingDict(), with fine-tune control of each compression parameter */
ZSTDLIB_API size_t ZSTD_compress_advanced (ZSTD_CCtx* ctx,
void* dst, size_t maxDstSize,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const void* dict,size_t dictSize,
ZSTD_parameters params);
/** ZSTD_decompress_usingDict
* Same as ZSTD_decompressDCtx, using a Dictionary content as prefix
* Note : dict can be NULL, in which case, it's equivalent to ZSTD_decompressDCtx() */
ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* ctx,
void* dst, size_t maxDstSize,
const void* src, size_t srcSize,
const void* dict,size_t dictSize);
/*! ZSTD_compress_usingPreparedDCtx() :
* Same as ZSTD_compress_usingDict, but using a reference context `preparedCCtx`, where dictionary has been loaded.
* It avoids reloading the dictionary each time.
* `preparedCCtx` must have been properly initialized using ZSTD_compressBegin_usingDict() or ZSTD_compressBegin_advanced().
* Requires 2 contexts : 1 for reference, which will not be modified, and 1 to run the compression operation */
ZSTDLIB_API size_t ZSTD_compress_usingPreparedCCtx(
ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize);
/*- Advanced Decompression functions -*/
/*! ZSTD_decompress_usingPreparedDCtx() :
* Same as ZSTD_decompress_usingDict, but using a reference context `preparedDCtx`, where dictionary has been loaded.
* It avoids reloading the dictionary each time.
* `preparedDCtx` must have been properly initialized using ZSTD_decompressBegin_usingDict().
* Requires 2 contexts : 1 for reference, which will not be modified, and 1 to run the decompression operation */
ZSTDLIB_API size_t ZSTD_decompress_usingPreparedDCtx(
ZSTD_DCtx* dctx, const ZSTD_DCtx* preparedDCtx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize);
/* **************************************
* Streaming functions (direct mode)
****************************************/
ZSTDLIB_API size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel);
ZSTDLIB_API size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* ctx, ZSTD_parameters params);
ZSTDLIB_API size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict,size_t dictSize, int compressionLevel);
ZSTDLIB_API size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, const void* dict,size_t dictSize, ZSTD_parameters params);
ZSTDLIB_API size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx);
ZSTDLIB_API size_t ZSTD_compress_insertDictionary(ZSTD_CCtx* ctx, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_duplicateCCtx(ZSTD_CCtx* dstCCtx, const ZSTD_CCtx* srcCCtx);
ZSTDLIB_API size_t ZSTD_compressContinue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity);
ZSTDLIB_API size_t ZSTD_compressContinue(ZSTD_CCtx* cctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t maxDstSize);
/**
/*
Streaming compression, synchronous mode (bufferless)
A ZSTD_CCtx object is required to track streaming operations.
Use ZSTD_createCCtx() / ZSTD_freeCCtx() to manage it.
ZSTD_CCtx object can be re-used multiple times within successive compression operations.
First operation is to start a new frame.
Use ZSTD_compressBegin().
You may also prefer the advanced derivative ZSTD_compressBegin_advanced(), for finer parameter control.
It's then possible to add a dictionary with ZSTD_compress_insertDictionary()
Note that dictionary presence is a "hidden" information,
the decoder needs to be aware that it is required for proper decoding, or decoding will fail.
If you want to compress a lot of messages using same dictionary,
it can be beneficial to duplicate compression context rather than reloading dictionary each time.
In such case, use ZSTD_duplicateCCtx(), which will need an already created ZSTD_CCtx,
in order to duplicate compression context into it.
Start by initializing a context.
Use ZSTD_compressBegin(), or ZSTD_compressBegin_usingDict() for dictionary compression,
or ZSTD_compressBegin_advanced(), for finer parameter control.
It's also possible to duplicate a reference context which has been initialized, using ZSTD_copyCCtx()
Then, consume your input using ZSTD_compressContinue().
The interface is synchronous, so all input will be consumed and produce a compressed output.
@ -155,38 +153,39 @@ ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t maxDstSiz
Worst case evaluation is provided by ZSTD_compressBound().
Finish a frame with ZSTD_compressEnd(), which will write the epilogue.
Without it, the frame will be considered incomplete by decoders.
Without the epilogue, frames will be considered incomplete by decoder.
You can then reuse ZSTD_CCtx to compress some new frame.
*/
ZSTDLIB_API size_t ZSTD_resetDCtx(ZSTD_DCtx* dctx);
ZSTDLIB_API size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx);
ZSTDLIB_API size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t dictSize);
ZSTDLIB_API void ZSTD_copyDCtx(ZSTD_DCtx* dctx, const ZSTD_DCtx* preparedDCtx);
ZSTDLIB_API size_t ZSTD_getFrameParams(ZSTD_parameters* params, const void* src, size_t srcSize);
ZSTDLIB_API void ZSTD_decompress_insertDictionary(ZSTD_DCtx* ctx, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx);
ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
/**
Streaming decompression, bufferless mode
/*
Streaming decompression, direct mode (bufferless)
A ZSTD_DCtx object is required to track streaming operations.
Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it.
A ZSTD_DCtx object can be re-used multiple times. Use ZSTD_resetDCtx() to return to fresh status.
A ZSTD_DCtx object can be re-used multiple times.
First operation is to retrieve frame parameters, using ZSTD_getFrameParams().
This function doesn't consume its input. It needs enough input data to properly decode the frame header.
First typical operation is to retrieve frame parameters, using ZSTD_getFrameParams().
This operation is independent, and just needs enough input data to properly decode the frame header.
Objective is to retrieve *params.windowlog, to know minimum amount of memory required during decoding.
Result : 0 when successful, it means the ZSTD_parameters structure has been filled.
>0 : means there is not enough data into src. Provides the expected size to successfully decode header.
errorCode, which can be tested using ZSTD_isError() (For example, if it's not a ZSTD header)
errorCode, which can be tested using ZSTD_isError()
Then, you can optionally insert a dictionary.
This operation must mimic the compressor behavior, otherwise decompression will fail or be corrupted.
Start decompression, with ZSTD_decompressBegin() or ZSTD_decompressBegin_usingDict()
Alternatively, you can copy a prepared context, using ZSTD_copyDCtx()
Then it's possible to start decompression.
Use ZSTD_nextSrcSizeToDecompress() and ZSTD_decompressContinue() alternatively.
Then use ZSTD_nextSrcSizeToDecompress() and ZSTD_decompressContinue() alternatively.
ZSTD_nextSrcSizeToDecompress() tells how much bytes to provide as 'srcSize' to ZSTD_decompressContinue().
ZSTD_decompressContinue() requires this exact amount of bytes, or it will fail.
ZSTD_decompressContinue() needs previous data blocks during decompression, up to (1 << windowlog).
@ -203,138 +202,36 @@ ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t ma
/* **************************************
* Block functions
****************************************/
/*! Block functions produce and decode raw zstd blocks, without frame metadata.
User will have to save and regenerate necessary information to regenerate data, such as block sizes.
/*!Block functions produce and decode raw zstd blocks, without frame metadata.
It saves associated header sizes.
But user will have to save and regenerate fields required to regenerate data, such as block sizes.
A few rules to respect :
- Uncompressed block size must be <= 128 KB
- Compressing or decompressing require a context structure
+ Use ZSTD_createXCtx() to create them
- It is necessary to init context before starting
+ compression : ZSTD_compressBegin(), which allows selection of compression level or parameters
+ decompression : ZSTD_resetDCtx()
+ If you compress multiple blocks without resetting, next blocks will create references to previous ones
- Dictionary can optionally be inserted, using ZSTD_de/compress_insertDictionary()
- When a block is considered not compressible enough, ZSTD_compressBlock() result will be zero.
+ User must test for such outcome and be able to deal with uncompressed data
+ ZSTD_decompressBlock() doesn't accept uncompressed data as input
A few rules to respect :
- Uncompressed block size must be <= 128 KB
- Compressing or decompressing requires a context structure
+ Use ZSTD_createCCtx() and ZSTD_createDCtx()
- It is necessary to init context before starting
+ compression : ZSTD_compressBegin()
+ decompression : ZSTD_decompressBegin()
+ variants _usingDict() are also allowed
+ copyCCtx() and copyDCtx() work too
- When a block is considered not compressible enough, ZSTD_compressBlock() result will be zero.
In which case, nothing is produced into `dst`.
+ User must test for such outcome and deal directly with uncompressed data
+ ZSTD_decompressBlock() doesn't accept uncompressed data as input !!
*/
size_t ZSTD_compressBlock (ZSTD_CCtx* cctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize);
size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize);
/* *************************************
* Pre-defined compression levels
***************************************/
#define ZSTD_MAX_CLEVEL 20
ZSTDLIB_API unsigned ZSTD_maxCLevel (void);
static const ZSTD_parameters ZSTD_defaultParameters[4][ZSTD_MAX_CLEVEL+1] = {
{ /* "default" */
/* W, C, H, S, L, strat */
{ 0, 18, 12, 12, 1, 4, ZSTD_fast }, /* level 0 - never used */
{ 0, 19, 13, 14, 1, 7, ZSTD_fast }, /* level 1 */
{ 0, 19, 15, 16, 1, 6, ZSTD_fast }, /* level 2 */
{ 0, 20, 18, 20, 1, 6, ZSTD_fast }, /* level 3 */
{ 0, 21, 19, 21, 1, 6, ZSTD_fast }, /* level 4 */
{ 0, 20, 14, 18, 3, 5, ZSTD_greedy }, /* level 5 */
{ 0, 20, 18, 19, 3, 5, ZSTD_greedy }, /* level 6 */
{ 0, 21, 17, 20, 3, 5, ZSTD_lazy }, /* level 7 */
{ 0, 21, 19, 20, 3, 5, ZSTD_lazy }, /* level 8 */
{ 0, 21, 20, 20, 3, 5, ZSTD_lazy2 }, /* level 9 */
{ 0, 21, 19, 21, 4, 5, ZSTD_lazy2 }, /* level 10 */
{ 0, 22, 20, 22, 4, 5, ZSTD_lazy2 }, /* level 11 */
{ 0, 22, 20, 22, 5, 5, ZSTD_lazy2 }, /* level 12 */
{ 0, 22, 21, 22, 5, 5, ZSTD_lazy2 }, /* level 13 */
{ 0, 22, 22, 23, 5, 5, ZSTD_lazy2 }, /* level 14 */
{ 0, 23, 23, 23, 5, 5, ZSTD_lazy2 }, /* level 15 */
{ 0, 23, 21, 22, 5, 5, ZSTD_btlazy2 }, /* level 16 */
{ 0, 23, 24, 23, 4, 5, ZSTD_btlazy2 }, /* level 17 */
{ 0, 25, 24, 23, 5, 5, ZSTD_btlazy2 }, /* level 18 */
{ 0, 25, 26, 23, 5, 5, ZSTD_btlazy2 }, /* level 19 */
{ 0, 26, 27, 25, 9, 5, ZSTD_btlazy2 }, /* level 20 */
},
{ /* for srcSize <= 256 KB */
/* W, C, H, S, L, strat */
{ 0, 18, 13, 14, 1, 7, ZSTD_fast }, /* level 0 - never used */
{ 0, 18, 14, 15, 1, 6, ZSTD_fast }, /* level 1 */
{ 0, 18, 14, 15, 1, 5, ZSTD_fast }, /* level 2 */
{ 0, 18, 12, 15, 3, 4, ZSTD_greedy }, /* level 3 */
{ 0, 18, 13, 15, 4, 4, ZSTD_greedy }, /* level 4 */
{ 0, 18, 14, 15, 5, 4, ZSTD_greedy }, /* level 5 */
{ 0, 18, 13, 15, 4, 4, ZSTD_lazy }, /* level 6 */
{ 0, 18, 14, 16, 5, 4, ZSTD_lazy }, /* level 7 */
{ 0, 18, 15, 16, 6, 4, ZSTD_lazy }, /* level 8 */
{ 0, 18, 15, 15, 7, 4, ZSTD_lazy }, /* level 9 */
{ 0, 18, 16, 16, 7, 4, ZSTD_lazy }, /* level 10 */
{ 0, 18, 16, 16, 8, 4, ZSTD_lazy }, /* level 11 */
{ 0, 18, 17, 16, 8, 4, ZSTD_lazy }, /* level 12 */
{ 0, 18, 17, 16, 9, 4, ZSTD_lazy }, /* level 13 */
{ 0, 18, 18, 16, 9, 4, ZSTD_lazy }, /* level 14 */
{ 0, 18, 17, 17, 9, 4, ZSTD_lazy2 }, /* level 15 */
{ 0, 18, 18, 18, 9, 4, ZSTD_lazy2 }, /* level 16 */
{ 0, 18, 18, 18, 10, 4, ZSTD_lazy2 }, /* level 17 */
{ 0, 18, 18, 18, 11, 4, ZSTD_lazy2 }, /* level 18 */
{ 0, 18, 18, 18, 12, 4, ZSTD_lazy2 }, /* level 19 */
{ 0, 18, 18, 18, 13, 4, ZSTD_lazy2 }, /* level 20 */
},
{ /* for srcSize <= 128 KB */
/* W, C, H, S, L, strat */
{ 0, 17, 12, 12, 1, 4, ZSTD_fast }, /* level 0 - never used */
{ 0, 17, 12, 13, 1, 6, ZSTD_fast }, /* level 1 */
{ 0, 17, 14, 16, 1, 5, ZSTD_fast }, /* level 2 */
{ 0, 17, 15, 17, 1, 5, ZSTD_fast }, /* level 3 */
{ 0, 17, 13, 15, 2, 4, ZSTD_greedy }, /* level 4 */
{ 0, 17, 15, 17, 3, 4, ZSTD_greedy }, /* level 5 */
{ 0, 17, 14, 17, 3, 4, ZSTD_lazy }, /* level 6 */
{ 0, 17, 16, 17, 4, 4, ZSTD_lazy }, /* level 7 */
{ 0, 17, 16, 17, 4, 4, ZSTD_lazy2 }, /* level 8 */
{ 0, 17, 17, 16, 5, 4, ZSTD_lazy2 }, /* level 9 */
{ 0, 17, 17, 16, 6, 4, ZSTD_lazy2 }, /* level 10 */
{ 0, 17, 17, 16, 7, 4, ZSTD_lazy2 }, /* level 11 */
{ 0, 17, 17, 16, 8, 4, ZSTD_lazy2 }, /* level 12 */
{ 0, 17, 18, 16, 4, 4, ZSTD_btlazy2 }, /* level 13 */
{ 0, 17, 18, 16, 5, 4, ZSTD_btlazy2 }, /* level 14 */
{ 0, 17, 18, 16, 6, 4, ZSTD_btlazy2 }, /* level 15 */
{ 0, 17, 18, 16, 7, 4, ZSTD_btlazy2 }, /* level 16 */
{ 0, 17, 18, 16, 8, 4, ZSTD_btlazy2 }, /* level 17 */
{ 0, 17, 18, 16, 9, 4, ZSTD_btlazy2 }, /* level 18 */
{ 0, 17, 18, 16, 10, 4, ZSTD_btlazy2 }, /* level 19 */
{ 0, 17, 18, 18, 12, 4, ZSTD_btlazy2 }, /* level 20 */
},
{ /* for srcSize <= 16 KB */
/* W, C, H, S, L, strat */
{ 0, 0, 0, 0, 0, 0, ZSTD_fast }, /* level 0 - never used */
{ 0, 14, 14, 14, 1, 4, ZSTD_fast }, /* level 1 */
{ 0, 14, 14, 16, 1, 4, ZSTD_fast }, /* level 2 */
{ 0, 14, 14, 14, 5, 4, ZSTD_greedy }, /* level 3 */
{ 0, 14, 14, 14, 8, 4, ZSTD_greedy }, /* level 4 */
{ 0, 14, 11, 14, 6, 4, ZSTD_lazy }, /* level 5 */
{ 0, 14, 14, 13, 6, 5, ZSTD_lazy }, /* level 6 */
{ 0, 14, 14, 14, 7, 6, ZSTD_lazy }, /* level 7 */
{ 0, 14, 14, 14, 8, 4, ZSTD_lazy }, /* level 8 */
{ 0, 14, 14, 15, 9, 4, ZSTD_lazy }, /* level 9 */
{ 0, 14, 14, 15, 10, 4, ZSTD_lazy }, /* level 10 */
{ 0, 14, 15, 15, 6, 4, ZSTD_btlazy2 }, /* level 11 */
{ 0, 14, 15, 15, 7, 4, ZSTD_btlazy2 }, /* level 12 */
{ 0, 14, 15, 15, 8, 4, ZSTD_btlazy2 }, /* level 13 */
{ 0, 14, 15, 15, 9, 4, ZSTD_btlazy2 }, /* level 14 */
{ 0, 14, 15, 15, 10, 4, ZSTD_btlazy2 }, /* level 15 */
{ 0, 14, 15, 15, 11, 4, ZSTD_btlazy2 }, /* level 16 */
{ 0, 14, 15, 15, 12, 4, ZSTD_btlazy2 }, /* level 17 */
{ 0, 14, 15, 15, 13, 4, ZSTD_btlazy2 }, /* level 18 */
{ 0, 14, 15, 15, 14, 4, ZSTD_btlazy2 }, /* level 19 */
{ 0, 14, 15, 15, 15, 4, ZSTD_btlazy2 }, /* level 20 */
},
};
size_t ZSTD_compressBlock (ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
/* *************************************
* Error management
***************************************/
#include "error_public.h"
/*! ZSTD_getErrorCode() :
convert a `size_t` function result into a `ZSTD_error_code` enum type,
which can be used to compare directly with enum list within "error_public.h" */
ZSTD_ErrorCode ZSTD_getError(size_t code);
#if defined (__cplusplus)

View File

@ -53,7 +53,7 @@ MANDIR = $(PREFIX)/share/man/man1
ZSTDDIR = ../lib
ZSTD_FILES := $(ZSTDDIR)/zstd_compress.c $(ZSTDDIR)/zstd_decompress.c $(ZSTDDIR)/fse.c $(ZSTDDIR)/huff0.c
ZSTD_LEGACY:= $(ZSTDDIR)/legacy/zstd_v01.c $(ZSTDDIR)/legacy/zstd_v02.c $(ZSTDDIR)/legacy/zstd_v03.c
ZSTD_LEGACY:= $(ZSTDDIR)/legacy/zstd_v01.c $(ZSTDDIR)/legacy/zstd_v02.c $(ZSTDDIR)/legacy/zstd_v03.c $(ZSTDDIR)/legacy/zstd_v04.c
ifeq ($(ZSTD_LEGACY_SUPPORT), 0)
CPPFLAGS += -DZSTD_LEGACY_SUPPORT=0

View File

@ -212,6 +212,7 @@ typedef struct
} blockParam_t;
#define MIN(a,b) ((a)<(b) ? (a) : (b))
#define MAX(a,b) ((a)>(b) ? (a) : (b))
static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
const char* displayName, int cLevel,
@ -227,6 +228,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
void* const resultBuffer = malloc(srcSize);
ZSTD_CCtx* refCtx = ZSTD_createCCtx();
ZSTD_CCtx* ctx = ZSTD_createCCtx();
ZSTD_DCtx* refDCtx = ZSTD_createDCtx();
ZSTD_DCtx* dctx = ZSTD_createDCtx();
U64 crcOrig = XXH64(srcBuffer, srcSize, 0);
U32 nbBlocks = 0;
@ -235,7 +237,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
if (strlen(displayName)>17) displayName += strlen(displayName)-17; /* can only display 17 characters */
/* Memory allocation & restrictions */
if (!compressedBuffer || !resultBuffer || !blockTable || !refCtx || !ctx || !dctx)
if (!compressedBuffer || !resultBuffer || !blockTable || !refCtx || !ctx || !refDCtx || !dctx)
EXM_THROW(31, "not enough memory");
/* Init blockTable data */
@ -244,13 +246,11 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
const char* srcPtr = (const char*)srcBuffer;
char* cPtr = (char*)compressedBuffer;
char* resPtr = (char*)resultBuffer;
for (fileNb=0; fileNb<nbFiles; fileNb++)
{
for (fileNb=0; fileNb<nbFiles; fileNb++) {
size_t remaining = fileSizes[fileNb];
U32 nbBlocksforThisFile = (U32)((remaining + (blockSize-1)) / blockSize);
U32 blockEnd = nbBlocks + nbBlocksforThisFile;
for ( ; nbBlocks<blockEnd; nbBlocks++)
{
for ( ; nbBlocks<blockEnd; nbBlocks++) {
size_t thisBlockSize = MIN(remaining, blockSize);
blockTable[nbBlocks].srcPtr = srcPtr;
blockTable[nbBlocks].cPtr = cPtr;
@ -262,9 +262,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
resPtr += thisBlockSize;
remaining -= thisBlockSize;
if (thisBlockSize > largestBlockSize) largestBlockSize = thisBlockSize;
}
}
}
} } }
/* warmimg up memory */
RDG_genBuffer(compressedBuffer, maxCompressedSize, 0.10, 0.50, 1);
@ -278,8 +276,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
U64 crcCheck = 0;
DISPLAY("\r%79s\r", "");
for (loopNb = 1; loopNb <= nbIterations; loopNb++)
{
for (loopNb = 1; loopNb <= nbIterations; loopNb++) {
int nbLoops;
int milliTime;
U32 blockNb;
@ -292,29 +289,15 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
milliTime = BMK_GetMilliStart();
while (BMK_GetMilliStart() == milliTime);
milliTime = BMK_GetMilliStart();
while (BMK_GetMilliSpan(milliTime) < TIMELOOP)
{
ZSTD_compressBegin_advanced(refCtx, ZSTD_getParams(cLevel, dictBufferSize+largestBlockSize));
ZSTD_compress_insertDictionary(refCtx, dictBuffer, dictBufferSize);
for (blockNb=0; blockNb<nbBlocks; blockNb++)
{
ZSTD_duplicateCCtx(ctx, refCtx);
size_t rSize = ZSTD_compressContinue(ctx,
blockTable[blockNb].cPtr, blockTable[blockNb].cRoom,
blockTable[blockNb].srcPtr,blockTable[blockNb].srcSize);
if (ZSTD_isError(rSize)) EXM_THROW(1, "ZSTD_compressContinue() failed : %s", ZSTD_getErrorName(rSize));
while (BMK_GetMilliSpan(milliTime) < TIMELOOP) {
ZSTD_compressBegin_advanced(refCtx, dictBuffer, dictBufferSize, ZSTD_getParams(cLevel, MAX(dictBufferSize, largestBlockSize)));
for (blockNb=0; blockNb<nbBlocks; blockNb++) {
size_t rSize = ZSTD_compress_usingPreparedCCtx(ctx, refCtx,
blockTable[blockNb].cPtr, blockTable[blockNb].cRoom,
blockTable[blockNb].srcPtr,blockTable[blockNb].srcSize);
if (ZSTD_isError(rSize)) EXM_THROW(1, "ZSTD_compress_usingPreparedCCtx() failed : %s", ZSTD_getErrorName(rSize));
blockTable[blockNb].cSize = rSize;
rSize = ZSTD_compressEnd(ctx,
blockTable[blockNb].cPtr + rSize,
blockTable[blockNb].cRoom - rSize);
if (ZSTD_isError(rSize)) EXM_THROW(2, "ZSTD_compressEnd() failed : %s", ZSTD_getErrorName(rSize));
blockTable[blockNb].cSize += rSize;
}
/*blockTable[blockNb].cSize = ZSTD_compress_usingDict(ctx,
blockTable[blockNb].cPtr, blockTable[blockNb].cRoom,
blockTable[blockNb].srcPtr,blockTable[blockNb].srcSize,
dictBuffer, dictBufferSize,
cLevel);*/
nbLoops++;
}
milliTime = BMK_GetMilliSpan(milliTime);
@ -328,47 +311,54 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
#if 1
/* Decompression */
memset(resultBuffer, 0xD6, srcSize);
memset(resultBuffer, 0xD6, srcSize); /* warm result buffer */
nbLoops = 0;
milliTime = BMK_GetMilliStart();
while (BMK_GetMilliStart() == milliTime);
milliTime = BMK_GetMilliStart();
for ( ; BMK_GetMilliSpan(milliTime) < TIMELOOP; nbLoops++)
{
for (blockNb=0; blockNb<nbBlocks; blockNb++)
blockTable[blockNb].resSize = ZSTD_decompress_usingDict(dctx,
blockTable[blockNb].resPtr, blockTable[blockNb].srcSize,
blockTable[blockNb].cPtr, blockTable[blockNb].cSize,
dictBuffer, dictBufferSize);
}
milliTime = BMK_GetMilliSpan(milliTime);
for ( ; BMK_GetMilliSpan(milliTime) < TIMELOOP; nbLoops++) {
ZSTD_decompressBegin_usingDict(refDCtx, dictBuffer, dictBufferSize);
for (blockNb=0; blockNb<nbBlocks; blockNb++) {
size_t regenSize = ZSTD_decompress_usingPreparedDCtx(dctx, refDCtx,
blockTable[blockNb].resPtr, blockTable[blockNb].srcSize,
blockTable[blockNb].cPtr, blockTable[blockNb].cSize);
if (ZSTD_isError(regenSize)) {
DISPLAY("ZSTD_decompress_usingPreparedDCtx() failed on block %u : %s",
blockNb, ZSTD_getErrorName(regenSize));
goto _findError;
}
blockTable[blockNb].resSize = regenSize;
} }
milliTime = BMK_GetMilliSpan(milliTime);
if ((double)milliTime < fastestD*nbLoops) fastestD = (double)milliTime / nbLoops;
DISPLAY("%2i-%-17.17s :%10i ->%10i (%5.3f),%6.1f MB/s ,%6.1f MB/s\r", loopNb, displayName, (int)srcSize, (int)cSize, ratio, (double)srcSize / fastestC / 1000., (double)srcSize / fastestD / 1000.);
/* CRC Checking */
_findError:
crcCheck = XXH64(resultBuffer, srcSize, 0);
if (crcOrig!=crcCheck)
{
if (crcOrig!=crcCheck) {
size_t u;
DISPLAY("\n!!! WARNING !!! %14s : Invalid Checksum : %x != %x\n", displayName, (unsigned)crcOrig, (unsigned)crcCheck);
for (u=0; u<srcSize; u++)
{
if (((const BYTE*)srcBuffer)[u] != ((const BYTE*)resultBuffer)[u])
{
U32 bn;
for (u=0; u<srcSize; u++) {
if (((const BYTE*)srcBuffer)[u] != ((const BYTE*)resultBuffer)[u]) {
U32 segNb, bNb, pos;
size_t bacc = 0;
printf("Decoding error at pos %u ", (U32)u);
for (bn = 0; bn < nbBlocks; bn++)
{
if (bacc + blockTable[bn].srcSize > u) break;
bacc += blockTable[bn].srcSize;
for (segNb = 0; segNb < nbBlocks; segNb++) {
if (bacc + blockTable[segNb].srcSize > u) break;
bacc += blockTable[segNb].srcSize;
}
printf("(block %u, pos %u) \n", bn, (U32)(u - bacc));
pos = (U32)(u - bacc);
bNb = pos / (128 KB);
printf("(block %u, sub %u, pos %u) \n", segNb, bNb, pos);
break;
}
}
if (u==srcSize-1) { /* should never happen */
printf("no difference detected\n");
} }
break;
}
#endif
@ -377,7 +367,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
if (crcOrig == crcCheck)
DISPLAY("%2i-%-17.17s :%10i ->%10i (%5.3f),%6.1f MB/s ,%6.1f MB/s \n", cLevel, displayName, (int)srcSize, (int)cSize, ratio, (double)srcSize / fastestC / 1000., (double)srcSize / fastestD / 1000.);
else
DISPLAY("X \n");
DISPLAY("%2i-\n", cLevel);
}
/* clean up */
@ -385,6 +375,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
free(resultBuffer);
ZSTD_freeCCtx(refCtx);
ZSTD_freeCCtx(ctx);
ZSTD_freeDCtx(refDCtx);
ZSTD_freeDCtx(dctx);
return 0;
}
@ -399,12 +390,10 @@ static size_t BMK_findMaxMem(U64 requiredMem)
requiredMem += 2 * step;
if (requiredMem > maxMemory) requiredMem = maxMemory;
while (!testmem)
{
while (!testmem) {
requiredMem -= step;
testmem = (BYTE*)malloc((size_t)requiredMem);
}
free(testmem);
return (size_t)(requiredMem - step);
}
@ -414,8 +403,7 @@ static void BMK_benchCLevel(void* srcBuffer, size_t benchedSize,
const size_t* fileSizes, unsigned nbFiles,
const void* dictBuffer, size_t dictBufferSize)
{
if (cLevel < 0)
{
if (cLevel < 0) {
int l;
for (l=1; l <= -cLevel; l++)
BMK_benchMem(srcBuffer, benchedSize,
@ -447,8 +435,7 @@ static void BMK_loadFiles(void* buffer, size_t bufferSize,
size_t pos = 0;
unsigned n;
for (n=0; n<nbFiles; n++)
{
for (n=0; n<nbFiles; n++) {
size_t readSize;
U64 fileSize = BMK_getFileSize(fileNamesTable[n]);
FILE* f = fopen(fileNamesTable[n], "rb");
@ -478,8 +465,7 @@ static void BMK_benchFileTable(const char** fileNamesTable, unsigned nbFiles,
if (!fileSizes) EXM_THROW(12, "not enough memory for fileSizes");
/* Load dictionary */
if (dictFileName != NULL)
{
if (dictFileName != NULL) {
U64 dictFileSize = BMK_getFileSize(dictFileName);
if (dictFileSize > 64 MB) EXM_THROW(10, "dictionary file %s too large", dictFileName);
dictBufferSize = (size_t)dictFileSize;

View File

@ -23,7 +23,7 @@
- Public forum : https://groups.google.com/forum/#!forum/lz4c
*/
/**************************************
/*-************************************
* Includes
**************************************/
#include <stdlib.h> /* malloc */
@ -31,7 +31,7 @@
#include <string.h> /* memcpy */
/**************************************
/*-************************************
* Basic Types
**************************************/
#if defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */
@ -50,7 +50,7 @@
#endif
/**************************************
/*-************************************
* OS-specific Includes
**************************************/
#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(_WIN32) || defined(__CYGWIN__)
@ -62,7 +62,7 @@
#endif
/**************************************
/*-************************************
* Constants
**************************************/
#define KB *(1 <<10)
@ -71,7 +71,7 @@
#define PRIME2 2246822519U
/**************************************
/*-************************************
* Local types
**************************************/
#define LTLOG 13
@ -81,7 +81,7 @@ typedef BYTE litDistribTable[LTSIZE];
/*********************************************************
/*-*******************************************************
* Local Functions
*********************************************************/
#define RDG_rotl32(x,r) ((x << r) | (x >> (32 - r)))
@ -103,14 +103,12 @@ static void RDG_fillLiteralDistrib(litDistribTable lt, double ld)
BYTE firstChar = '(';
BYTE lastChar = '}';
if (ld==0.0)
{
if (ld==0.0) {
character = 0;
firstChar = 0;
lastChar =255;
}
while (i<LTSIZE)
{
while (i<LTSIZE) {
U32 weight = (U32)((double)(LTSIZE - i) * ld) + 1;
U32 end;
if (weight + i > LTSIZE) weight = LTSIZE-i;
@ -140,13 +138,11 @@ void RDG_genBlock(void* buffer, size_t buffSize, size_t prefixSize, double match
U32 prevOffset = 1;
/* special case : sparse content */
while (matchProba >= 1.0)
{
while (matchProba >= 1.0) {
size_t size0 = RDG_rand(seed) & 3;
size0 = (size_t)1 << (16 + size0 * 2);
size0 += RDG_rand(seed) & (size0-1); /* because size0 is power of 2*/
if (buffSize < pos + size0)
{
if (buffSize < pos + size0) {
memset(buffPtr+pos, 0, buffSize-pos);
return;
}
@ -160,11 +156,9 @@ void RDG_genBlock(void* buffer, size_t buffSize, size_t prefixSize, double match
if (pos==0) buffPtr[0] = RDG_genChar(seed, lt), pos=1;
/* Generate compressible data */
while (pos < buffSize)
{
while (pos < buffSize) {
/* Select : Literal (char) or Match (within 32K) */
if (RDG_RAND15BITS < matchProba32)
{
if (RDG_RAND15BITS < matchProba32) {
/* Copy (within 32K) */
size_t match;
size_t d;
@ -178,17 +172,14 @@ void RDG_genBlock(void* buffer, size_t buffSize, size_t prefixSize, double match
d = pos + length;
if (d > buffSize) d = buffSize;
while (pos < d) buffPtr[pos++] = buffPtr[match++]; /* correctly manages overlaps */
}
else
{
} else {
/* Literal (noise) */
size_t d;
size_t length = RDG_RANDLENGTH;
d = pos + length;
if (d > buffSize) d = buffSize;
while (pos < d) buffPtr[pos++] = RDG_genChar(seed, lt);
}
}
} }
}
@ -220,8 +211,7 @@ void RDG_genStdout(unsigned long long size, double matchProba, double litProba,
RDG_genBlock(buff, RDG_DICTSIZE, 0, matchProba, lt, &seed);
/* Generate compressible data */
while (total < size)
{
while (total < size) {
RDG_genBlock(buff, RDG_DICTSIZE+RDG_BLOCKSIZE, RDG_DICTSIZE, matchProba, lt, &seed);
if (size-total < RDG_BLOCKSIZE) genBlockSize = (size_t)(size-total);
total += genBlockSize;
@ -230,6 +220,6 @@ void RDG_genStdout(unsigned long long size, double matchProba, double litProba,
memcpy(buff, buff + RDG_BLOCKSIZE, RDG_DICTSIZE);
}
// cleanup
/* cleanup */
free(buff);
}

View File

@ -193,42 +193,30 @@ static U64 FIO_getFileSize(const char* infilename)
static int FIO_getFiles(FILE** fileOutPtr, FILE** fileInPtr,
const char* dstFileName, const char* srcFileName)
{
if (!strcmp (srcFileName, stdinmark))
{
if (!strcmp (srcFileName, stdinmark)) {
DISPLAYLEVEL(4,"Using stdin for input\n");
*fileInPtr = stdin;
SET_BINARY_MODE(stdin);
}
else
{
} else {
*fileInPtr = fopen(srcFileName, "rb");
}
if ( *fileInPtr==0 )
{
if ( *fileInPtr==0 ) {
DISPLAYLEVEL(1, "Unable to access file for processing: %s\n", srcFileName);
return 1;
}
if (!strcmp (dstFileName, stdoutmark))
{
if (!strcmp (dstFileName, stdoutmark)) {
DISPLAYLEVEL(4,"Using stdout for output\n");
*fileOutPtr = stdout;
SET_BINARY_MODE(stdout);
}
else
{
/* Check if destination file already exists */
if (!g_overwrite)
{
} else {
if (!g_overwrite) { /* Check if destination file already exists */
*fileOutPtr = fopen( dstFileName, "rb" );
if (*fileOutPtr != 0)
{
/* prompt for overwrite authorization */
if (*fileOutPtr != 0) { /* dest file exists, prompt for overwrite authorization */
fclose(*fileOutPtr);
DISPLAY("Warning : %s already exists \n", dstFileName);
if ((g_displayLevel <= 1) || (*fileInPtr == stdin))
{
if ((g_displayLevel <= 1) || (*fileInPtr == stdin)) {
/* No interaction possible */
DISPLAY("Operation aborted : %s already exists \n", dstFileName);
return 1;
@ -236,15 +224,12 @@ static int FIO_getFiles(FILE** fileOutPtr, FILE** fileInPtr,
DISPLAY("Overwrite ? (y/N) : ");
{
int ch = getchar();
if ((ch!='Y') && (ch!='y'))
{
if ((ch!='Y') && (ch!='y')) {
DISPLAY("No. Operation aborted : %s already exists \n", dstFileName);
return 1;
}
while ((ch!=EOF) && (ch!='\n')) ch = getchar(); /* flush rest of input line */
}
}
}
} } }
*fileOutPtr = fopen( dstFileName, "wb" );
}
@ -265,15 +250,13 @@ static size_t FIO_loadFile(void** bufferPtr, const char* fileName)
U64 fileSize;
*bufferPtr = NULL;
if (fileName == NULL)
return 0;
if (fileName == NULL) return 0;
DISPLAYLEVEL(4,"Loading %s as dictionary \n", fileName);
fileHandle = fopen(fileName, "rb");
if (fileHandle==0) EXM_THROW(31, "Error opening file %s", fileName);
fileSize = FIO_getFileSize(fileName);
if (fileSize > MAX_DICT_SIZE)
{
if (fileSize > MAX_DICT_SIZE) {
int seekResult;
if (fileSize > 1 GB) EXM_THROW(32, "Dictionary file %s is too large", fileName); /* avoid extreme cases */
DISPLAYLEVEL(2,"Dictionary %s is too large : using last %u bytes only \n", fileName, MAX_DICT_SIZE);
@ -355,19 +338,14 @@ static int FIO_compressFilename_extRess(cRess_t ress,
/* init */
filesize = FIO_getFileSize(srcFileName) + dictSize;
errorCode = ZBUFF_compressInit_advanced(ress.ctx, ZSTD_getParams(cLevel, filesize));
if (ZBUFF_isError(errorCode)) EXM_THROW(21, "Error initializing compression");
errorCode = ZBUFF_compressWithDictionary(ress.ctx, ress.dictBuffer, ress.dictBufferSize);
if (ZBUFF_isError(errorCode)) EXM_THROW(22, "Error initializing dictionary");
errorCode = ZBUFF_compressInit_advanced(ress.ctx, ress.dictBuffer, ress.dictBufferSize, ZSTD_getParams(cLevel, filesize));
if (ZBUFF_isError(errorCode)) EXM_THROW(21, "Error initializing compression : %s", ZBUFF_getErrorName(errorCode));
/* Main compression loop */
filesize = 0;
while (1)
{
size_t inSize;
while (1) {
/* Fill input Buffer */
inSize = fread(ress.srcBuffer, (size_t)1, ress.srcBufferSize, srcFile);
size_t inSize = fread(ress.srcBuffer, (size_t)1, ress.srcBufferSize, srcFile);
if (inSize==0) break;
filesize += inSize;
DISPLAYUPDATE(2, "\rRead : %u MB ", (U32)(filesize>>20));
@ -460,8 +438,7 @@ int FIO_compressMultipleFilenames(const char** inFileNamesTable, unsigned nbFile
ress = FIO_createCResources(dictFileName);
/* loop on each file */
for (u=0; u<nbFiles; u++)
{
for (u=0; u<nbFiles; u++) {
size_t ifnSize = strlen(inFileNamesTable[u]);
if (dfnSize <= ifnSize+suffixSize+1) { free(dstFileName); dfnSize = ifnSize + 20; dstFileName = (char*)malloc(dfnSize); }
strcpy(dstFileName, inFileNamesTable[u]);
@ -529,10 +506,8 @@ unsigned long long FIO_decompressFrame(dRess_t ress,
size_t readSize=alreadyLoaded;
/* Main decompression Loop */
ZBUFF_decompressInit(ress.dctx);
ZBUFF_decompressWithDictionary(ress.dctx, ress.dictBuffer, ress.dictBufferSize);
while (1)
{
ZBUFF_decompressInitDictionary(ress.dctx, ress.dictBuffer, ress.dictBufferSize);
while (1) {
/* Decode */
size_t sizeCheck;
size_t inSize=readSize, decodedSize=ress.dstBufferSize;
@ -570,8 +545,7 @@ static int FIO_decompressFile_extRess(dRess_t ress,
if (FIO_getFiles(&dstFile, &srcFile, dstFileName, srcFileName)) return 1;
/* for each frame */
for ( ; ; )
{
for ( ; ; ) {
size_t sizeCheck;
/* check magic number -> version */
size_t toRead = 4;
@ -579,8 +553,7 @@ static int FIO_decompressFile_extRess(dRess_t ress,
if (sizeCheck==0) break; /* no more input */
if (sizeCheck != toRead) EXM_THROW(31, "Read error : cannot read header");
#if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
if (ZSTD_isLegacy(MEM_readLE32(ress.srcBuffer)))
{
if (ZSTD_isLegacy(MEM_readLE32(ress.srcBuffer))) {
filesize += FIO_decompressLegacyFrame(dstFile, srcFile, MEM_readLE32(ress.srcBuffer));
continue;
}
@ -630,14 +603,12 @@ int FIO_decompressMultipleFilenames(const char** srcNamesTable, unsigned nbFiles
if (dstFileName==NULL) EXM_THROW(70, "not enough memory for dstFileName");
ress = FIO_createDResources(dictFileName);
for (u=0; u<nbFiles; u++)
{
for (u=0; u<nbFiles; u++) {
const char* srcFileName = srcNamesTable[u];
size_t sfnSize = strlen(srcFileName);
const char* suffixPtr = srcFileName + sfnSize - suffixSize;
if (dfnSize <= sfnSize-suffixSize+1) { free(dstFileName); dfnSize = sfnSize + 20; dstFileName = (char*)malloc(dfnSize); if (dstFileName==NULL) EXM_THROW(71, "not enough memory for dstFileName"); }
if (sfnSize <= suffixSize || strcmp(suffixPtr, suffix) != 0)
{
if (sfnSize <= suffixSize || strcmp(suffixPtr, suffix) != 0) {
DISPLAYLEVEL(1, "File extension doesn't match expected extension (%4s); will not process file: %s\n", suffix, srcFileName);
skippedFiles++;
continue;

View File

@ -203,11 +203,9 @@ static int basicUnitTests(U32 seed, double compressibility)
size_t cSizeOrig;
DISPLAYLEVEL(4, "test%3i : load dictionary into context : ", testNb++);
result = ZSTD_compressBegin(ctxOrig, 2);
result = ZSTD_compressBegin_usingDict(ctxOrig, CNBuffer, dictSize, 2);
if (ZSTD_isError(result)) goto _output_error;
result = ZSTD_compress_insertDictionary(ctxOrig, CNBuffer, dictSize);
if (ZSTD_isError(result)) goto _output_error;
result = ZSTD_duplicateCCtx(ctxDuplicated, ctxOrig);
result = ZSTD_copyCCtx(ctxDuplicated, ctxOrig);
if (ZSTD_isError(result)) goto _output_error;
DISPLAYLEVEL(4, "OK \n");
@ -284,7 +282,7 @@ static int basicUnitTests(U32 seed, double compressibility)
DISPLAYLEVEL(4, "OK \n");
DISPLAYLEVEL(4, "test%3i : Block decompression test : ", testNb++);
result = ZSTD_resetDCtx(dctx);
result = ZSTD_decompressBegin(dctx);
if (ZSTD_isError(result)) goto _output_error;
result = ZSTD_decompressBlock(dctx, decodedBuffer, COMPRESSIBLE_NOISE_LENGTH, compressedBuffer, cSize);
if (ZSTD_isError(result)) goto _output_error;
@ -293,18 +291,15 @@ static int basicUnitTests(U32 seed, double compressibility)
/* dictionary block compression */
DISPLAYLEVEL(4, "test%3i : Dictionary Block compression test : ", testNb++);
result = ZSTD_compressBegin(cctx, 5);
if (ZSTD_isError(result)) goto _output_error;
result = ZSTD_compress_insertDictionary(cctx, CNBuffer, dictSize);
result = ZSTD_compressBegin_usingDict(cctx, CNBuffer, dictSize, 5);
if (ZSTD_isError(result)) goto _output_error;
cSize = ZSTD_compressBlock(cctx, compressedBuffer, ZSTD_compressBound(blockSize), (char*)CNBuffer+dictSize, blockSize);
if (ZSTD_isError(cSize)) goto _output_error;
DISPLAYLEVEL(4, "OK \n");
DISPLAYLEVEL(4, "test%3i : Dictionary Block decompression test : ", testNb++);
result = ZSTD_resetDCtx(dctx);
result = ZSTD_decompressBegin_usingDict(dctx, CNBuffer, dictSize);
if (ZSTD_isError(result)) goto _output_error;
ZSTD_decompress_insertDictionary(dctx, CNBuffer, dictSize);
result = ZSTD_decompressBlock(dctx, decodedBuffer, COMPRESSIBLE_NOISE_LENGTH, compressedBuffer, cSize);
if (ZSTD_isError(result)) goto _output_error;
if (result != blockSize) goto _output_error;
@ -570,12 +565,10 @@ int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compressibilit
dict = srcBuffer + sampleStart;
dictSize = sampleSize;
errorCode = ZSTD_compressBegin(refCtx, (FUZ_rand(&lseed) % (20 - (sampleSizeLog/3))) + 1);
CHECK (ZSTD_isError(errorCode), "start streaming error : %s", ZSTD_getErrorName(errorCode));
errorCode = ZSTD_compress_insertDictionary(refCtx, dict, dictSize);
CHECK (ZSTD_isError(errorCode), "dictionary insertion error : %s", ZSTD_getErrorName(errorCode));
errorCode = ZSTD_duplicateCCtx(ctx, refCtx);
CHECK (ZSTD_isError(errorCode), "context duplication error : %s", ZSTD_getErrorName(errorCode));
errorCode = ZSTD_compressBegin_usingDict(refCtx, dict, dictSize, (FUZ_rand(&lseed) % (20 - (sampleSizeLog/3))) + 1);
CHECK (ZSTD_isError(errorCode), "ZSTD_compressBegin_usingDict error : %s", ZSTD_getErrorName(errorCode));
errorCode = ZSTD_copyCCtx(ctx, refCtx);
CHECK (ZSTD_isError(errorCode), "ZSTD_copyCCtx error : %s", ZSTD_getErrorName(errorCode));
totalTestSize = 0; cSize = 0;
for (n=0; n<nbChunks; n++)
{
@ -603,9 +596,8 @@ int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compressibilit
crcOrig = XXH64_digest(xxh64);
/* streaming decompression test */
errorCode = ZSTD_resetDCtx(dctx);
errorCode = ZSTD_decompressBegin_usingDict(dctx, dict, dictSize);
CHECK (ZSTD_isError(errorCode), "cannot init DCtx : %s", ZSTD_getErrorName(errorCode));
ZSTD_decompress_insertDictionary(dctx, dict, dictSize);
totalCSize = 0;
totalGenSize = 0;
while (totalCSize < cSize)

View File

@ -329,6 +329,86 @@ unsigned long long FIOv03_decompressFrame(FILE* foutput, FILE* finput)
}
/*- v0.4.x -*/
typedef struct {
void* srcBuffer;
size_t srcBufferSize;
void* dstBuffer;
size_t dstBufferSize;
void* dictBuffer;
size_t dictBufferSize;
ZBUFFv04_DCtx* dctx;
} dRessv04_t;
static dRessv04_t FIOv04_createDResources(void)
{
dRessv04_t ress;
/* init */
ress.dctx = ZBUFFv04_createDCtx();
if (ress.dctx==NULL) EXM_THROW(60, "Can't create ZBUFF decompression context");
ress.dictBuffer = NULL; ress.dictBufferSize=0;
/* Allocate Memory */
ress.srcBufferSize = ZBUFFv04_recommendedDInSize();
ress.srcBuffer = malloc(ress.srcBufferSize);
ress.dstBufferSize = ZBUFFv04_recommendedDOutSize();
ress.dstBuffer = malloc(ress.dstBufferSize);
if (!ress.srcBuffer || !ress.dstBuffer) EXM_THROW(61, "Allocation error : not enough memory");
return ress;
}
static void FIOv04_freeDResources(dRessv04_t ress)
{
size_t errorCode = ZBUFFv04_freeDCtx(ress.dctx);
if (ZBUFFv04_isError(errorCode)) EXM_THROW(69, "Error : can't free ZBUFF context resource : %s", ZBUFFv04_getErrorName(errorCode));
free(ress.srcBuffer);
free(ress.dstBuffer);
free(ress.dictBuffer);
}
unsigned long long FIOv04_decompressFrame(dRessv04_t ress,
FILE* foutput, FILE* finput)
{
U64 frameSize = 0;
size_t readSize = 4;
MEM_writeLE32(ress.srcBuffer, ZSTDv04_magicNumber);
ZBUFFv04_decompressInit(ress.dctx);
ZBUFFv04_decompressWithDictionary(ress.dctx, ress.dictBuffer, ress.dictBufferSize);
while (1)
{
/* Decode */
size_t sizeCheck;
size_t inSize=readSize, decodedSize=ress.dstBufferSize;
size_t toRead = ZBUFFv04_decompressContinue(ress.dctx, ress.dstBuffer, &decodedSize, ress.srcBuffer, &inSize);
if (ZBUFFv04_isError(toRead)) EXM_THROW(36, "Decoding error : %s", ZBUFFv04_getErrorName(toRead));
readSize -= inSize;
/* Write block */
sizeCheck = fwrite(ress.dstBuffer, 1, decodedSize, foutput);
if (sizeCheck != decodedSize) EXM_THROW(37, "Write error : unable to write data block to destination file");
frameSize += decodedSize;
DISPLAYUPDATE(2, "\rDecoded : %u MB... ", (U32)(frameSize>>20) );
if (toRead == 0) break;
if (readSize) EXM_THROW(38, "Decoding error : should consume entire input");
/* Fill input buffer */
if (toRead > ress.srcBufferSize) EXM_THROW(34, "too large block");
readSize = fread(ress.srcBuffer, 1, toRead, finput);
if (readSize != toRead) EXM_THROW(35, "Read error");
}
FIOv04_freeDResources(ress);
return frameSize;
}
unsigned long long FIO_decompressLegacyFrame(FILE* foutput, FILE* finput, U32 magicNumberLE)
{
switch(magicNumberLE)
@ -339,6 +419,8 @@ unsigned long long FIO_decompressLegacyFrame(FILE* foutput, FILE* finput, U32 ma
return FIOv02_decompressFrame(foutput, finput);
case ZSTDv03_magicNumber :
return FIOv03_decompressFrame(foutput, finput);
case ZSTDv04_magicNumber :
return FIOv04_decompressFrame(FIOv04_createDResources(), foutput, finput);
default :
return ERROR(prefix_unknown);
}

View File

@ -158,10 +158,9 @@ static int basicUnitTests(U32 seed, double compressibility)
/* Basic compression test */
DISPLAYLEVEL(4, "test%3i : compress %u bytes : ", testNb++, COMPRESSIBLE_NOISE_LENGTH);
ZBUFF_compressInit(zc, 1);
ZBUFF_compressInitDictionary(zc, CNBuffer, 128 KB, 1);
readSize = CNBufferSize;
genSize = compressedBufferSize;
ZBUFF_compressWithDictionary(zc, CNBuffer, 128 KB);
result = ZBUFF_compressContinue(zc, compressedBuffer, &genSize, CNBuffer, &readSize);
if (ZBUFF_isError(result)) goto _output_error;
if (readSize != CNBufferSize) goto _output_error; /* entire input should be consumed */
@ -174,8 +173,7 @@ static int basicUnitTests(U32 seed, double compressibility)
/* Basic decompression test */
DISPLAYLEVEL(4, "test%3i : decompress %u bytes : ", testNb++, COMPRESSIBLE_NOISE_LENGTH);
ZBUFF_decompressInit(zd);
ZBUFF_decompressWithDictionary(zd, CNBuffer, 128 KB);
ZBUFF_decompressInitDictionary(zd, CNBuffer, 128 KB);
readSize = cSize;
genSize = CNBufferSize;
result = ZBUFF_decompressContinue(zd, decodedBuffer, &genSize, compressedBuffer, &readSize);
@ -318,7 +316,6 @@ int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compressibilit
sampleSizeLog = FUZ_rand(&lseed) % maxSrcLog;
maxTestSize = (size_t)1 << sampleSizeLog;
maxTestSize += FUZ_rand(&lseed) & (maxTestSize-1);
ZBUFF_compressInit(zc, (FUZ_rand(&lseed) % (20 - (sampleSizeLog/3))) + 1);
sampleSizeLog = FUZ_rand(&lseed) % maxSampleLog;
sampleSize = (size_t)1 << sampleSizeLog;
@ -326,7 +323,7 @@ int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compressibilit
sampleStart = FUZ_rand(&lseed) % (srcBufferSize - sampleSize);
dict = srcBuffer + sampleStart;
dictSize = sampleSize;
ZBUFF_compressWithDictionary(zc, dict, dictSize);
ZBUFF_compressInitDictionary(zc, dict, dictSize, (FUZ_rand(&lseed) % (20 - (sampleSizeLog/3))) + 1);
totalTestSize = 0;
cSize = 0;
@ -374,8 +371,7 @@ int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compressibilit
crcOrig = XXH64_digest(xxh64);
/* multi - fragments decompression test */
ZBUFF_decompressInit(zd);
ZBUFF_decompressWithDictionary(zd, dict, dictSize);
ZBUFF_decompressInitDictionary(zd, dict, dictSize);
totalCSize = 0;
totalGenSize = 0;
while (totalCSize < cSize)

View File

@ -0,0 +1,183 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="Debug|Win32">
<Configuration>Debug</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Debug|x64">
<Configuration>Debug</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|Win32">
<Configuration>Release</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|x64">
<Configuration>Release</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<ProjectGuid>{D4C01A3D-F609-4DA6-B53F-88D063CCE993}</ProjectGuid>
<Keyword>Win32Proj</Keyword>
<RootNamespace>fuzzer</RootNamespace>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v120</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v120</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v120</PlatformToolset>
<WholeProgramOptimization>true</WholeProgramOptimization>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v120</PlatformToolset>
<WholeProgramOptimization>true</WholeProgramOptimization>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="PropertySheets">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="PropertySheets">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<LinkIncremental>true</LinkIncremental>
<IncludePath>$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\legacy;$(VCInstallDir)include;$(VCInstallDir)atlmfc\include;$(WindowsSDK_IncludePath);</IncludePath>
<RunCodeAnalysis>true</RunCodeAnalysis>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<LinkIncremental>true</LinkIncremental>
<IncludePath>$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib;$(VCInstallDir)include;$(VCInstallDir)atlmfc\include;$(WindowsSDK_IncludePath);</IncludePath>
<RunCodeAnalysis>true</RunCodeAnalysis>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<LinkIncremental>false</LinkIncremental>
<IncludePath>$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib;$(VCInstallDir)include;$(VCInstallDir)atlmfc\include;$(WindowsSDK_IncludePath);</IncludePath>
<RunCodeAnalysis>true</RunCodeAnalysis>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<LinkIncremental>false</LinkIncremental>
<IncludePath>$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib;$(VCInstallDir)include;$(VCInstallDir)atlmfc\include;$(WindowsSDK_IncludePath);</IncludePath>
<RunCodeAnalysis>true</RunCodeAnalysis>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
<PrecompiledHeader>
</PrecompiledHeader>
<WarningLevel>Level4</WarningLevel>
<Optimization>Disabled</Optimization>
<PreprocessorDefinitions>WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<EnablePREfast>true</EnablePREfast>
<AdditionalOptions>/analyze:stacksize25000 %(AdditionalOptions)</AdditionalOptions>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>true</GenerateDebugInformation>
<AdditionalDependencies>setargv.obj;kernel32.lib;user32.lib;gdi32.lib;winspool.lib;comdlg32.lib;advapi32.lib;shell32.lib;ole32.lib;oleaut32.lib;uuid.lib;odbc32.lib;odbccp32.lib;%(AdditionalDependencies)</AdditionalDependencies>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<ClCompile>
<PrecompiledHeader>
</PrecompiledHeader>
<WarningLevel>Level4</WarningLevel>
<Optimization>Disabled</Optimization>
<PreprocessorDefinitions>WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<EnablePREfast>true</EnablePREfast>
<AdditionalOptions>/analyze:stacksize25000 %(AdditionalOptions)</AdditionalOptions>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>true</GenerateDebugInformation>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<ClCompile>
<WarningLevel>Level4</WarningLevel>
<PrecompiledHeader>
</PrecompiledHeader>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<PreprocessorDefinitions>WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<EnablePREfast>true</EnablePREfast>
<AdditionalOptions>/analyze:stacksize25000 %(AdditionalOptions)</AdditionalOptions>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>true</GenerateDebugInformation>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<ClCompile>
<WarningLevel>Level4</WarningLevel>
<PrecompiledHeader>
</PrecompiledHeader>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<PreprocessorDefinitions>WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<EnablePREfast>true</EnablePREfast>
<AdditionalOptions>/analyze:stacksize25000 %(AdditionalOptions)</AdditionalOptions>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>true</GenerateDebugInformation>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
</Link>
</ItemDefinitionGroup>
<ItemGroup>
<ClCompile Include="..\..\..\dictBuilder\dibcli.c" />
<ClCompile Include="..\..\..\dictBuilder\dictBuilder.c" />
<ClCompile Include="..\..\..\dictBuilder\divsufsort.c" />
<ClCompile Include="..\..\..\dictBuilder\sssort.c" />
<ClCompile Include="..\..\..\dictBuilder\trsort.c" />
<ClCompile Include="..\..\..\dictBuilder\utils.c" />
<ClCompile Include="..\..\..\lib\fse.c" />
<ClCompile Include="..\..\..\lib\huff0.c" />
<ClCompile Include="..\..\..\lib\zstd_decompress.c" />
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\..\..\dictBuilder\config.h" />
<ClInclude Include="..\..\..\dictBuilder\dictBuilder.h" />
<ClInclude Include="..\..\..\dictBuilder\divsufsort.h" />
<ClInclude Include="..\..\..\dictBuilder\divsufsort_private.h" />
<ClInclude Include="..\..\..\dictBuilder\lfs.h" />
<ClInclude Include="..\..\..\lib\fse.h" />
<ClInclude Include="..\..\..\lib\huff0.h" />
<ClInclude Include="..\..\..\lib\huff0_static.h" />
<ClInclude Include="..\..\..\lib\zstd.h" />
</ItemGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
</Project>

View File

@ -0,0 +1,75 @@
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup>
<Filter Include="Fichiers sources">
<UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
<Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
</Filter>
<Filter Include="Fichiers d%27en-tête">
<UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
<Extensions>h;hpp;hxx;hm;inl;inc;xsd</Extensions>
</Filter>
<Filter Include="Fichiers de ressources">
<UniqueIdentifier>{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}</UniqueIdentifier>
<Extensions>rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms</Extensions>
</Filter>
</ItemGroup>
<ItemGroup>
<ClCompile Include="..\..\..\dictBuilder\dibcli.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\dictBuilder\dictBuilder.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\dictBuilder\divsufsort.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\dictBuilder\sssort.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\dictBuilder\trsort.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\dictBuilder\utils.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\lib\fse.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\lib\huff0.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\lib\zstd_decompress.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\..\..\dictBuilder\config.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\dictBuilder\dictBuilder.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\dictBuilder\divsufsort.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\dictBuilder\divsufsort_private.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\dictBuilder\lfs.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\fse.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\huff0.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\huff0_static.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\zstd.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
</ItemGroup>
</Project>

View File

@ -11,6 +11,8 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "fullbench", "fullbench\full
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "zstdlib", "zstdlib\zstdlib.vcxproj", "{8BFD8150-94D5-4BF9-8A50-7BD9929A0850}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "dictBuilder", "dictBuilder\dictBuilder.vcxproj", "{D4C01A3D-F609-4DA6-B53F-88D063CCE993}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Win32 = Debug|Win32
@ -51,6 +53,14 @@ Global
{8BFD8150-94D5-4BF9-8A50-7BD9929A0850}.Release|Win32.Build.0 = Release|Win32
{8BFD8150-94D5-4BF9-8A50-7BD9929A0850}.Release|x64.ActiveCfg = Release|x64
{8BFD8150-94D5-4BF9-8A50-7BD9929A0850}.Release|x64.Build.0 = Release|x64
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Debug|Win32.ActiveCfg = Debug|Win32
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Debug|Win32.Build.0 = Debug|Win32
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Debug|x64.ActiveCfg = Debug|x64
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Debug|x64.Build.0 = Debug|x64
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Release|Win32.ActiveCfg = Release|Win32
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Release|Win32.Build.0 = Release|Win32
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Release|x64.ActiveCfg = Release|x64
{D4C01A3D-F609-4DA6-B53F-88D063CCE993}.Release|x64.Build.0 = Release|x64
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE

View File

@ -24,6 +24,7 @@
<ClCompile Include="..\..\..\lib\legacy\zstd_v01.c" />
<ClCompile Include="..\..\..\lib\legacy\zstd_v02.c" />
<ClCompile Include="..\..\..\lib\legacy\zstd_v03.c" />
<ClCompile Include="..\..\..\lib\legacy\zstd_v04.c" />
<ClCompile Include="..\..\..\lib\zstd_buffered.c" />
<ClCompile Include="..\..\..\lib\zstd_compress.c" />
<ClCompile Include="..\..\..\lib\zstd_decompress.c" />
@ -43,6 +44,7 @@
<ClInclude Include="..\..\..\lib\legacy\zstd_v01.h" />
<ClInclude Include="..\..\..\lib\legacy\zstd_v02.h" />
<ClInclude Include="..\..\..\lib\legacy\zstd_v03.h" />
<ClInclude Include="..\..\..\lib\legacy\zstd_v04.h" />
<ClInclude Include="..\..\..\lib\zstd.h" />
<ClInclude Include="..\..\..\lib\zstd_buffered.h" />
<ClInclude Include="..\..\..\lib\zstd_buffered_static.h" />

View File

@ -57,6 +57,9 @@
<ClCompile Include="..\..\..\programs\datagen.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
<ClCompile Include="..\..\..\lib\legacy\zstd_v04.c">
<Filter>Fichiers sources</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\..\..\lib\fse.h">
@ -113,5 +116,8 @@
<ClInclude Include="..\..\..\programs\datagen.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\legacy\zstd_v04.h">
<Filter>Fichiers d%27en-tête</Filter>
</ClInclude>
</ItemGroup>
</Project>

View File

@ -27,6 +27,8 @@
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\..\..\lib\bitstream.h" />
<ClInclude Include="..\..\..\lib\error_private.h" />
<ClInclude Include="..\..\..\lib\error_public.h" />
<ClInclude Include="..\..\..\lib\fse.h" />
<ClInclude Include="..\..\..\lib\fse_static.h" />
<ClInclude Include="..\..\..\lib\huff0.h" />

View File

@ -68,6 +68,12 @@
<ClInclude Include="..\..\..\lib\mem.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\error_private.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="..\..\..\lib\error_public.h">
<Filter>Header Files</Filter>
</ClInclude>
</ItemGroup>
<ItemGroup>
<ResourceCompile Include="zstdlib.rc">