Commit Graph

403 Commits

Author SHA1 Message Date
Yann Collet
c0393a538f fixed counting long distance weights 2018-03-05 15:12:10 -08:00
Yann Collet
cb789d2df8 re-inserted offset evaluation 2018-03-05 13:08:59 -08:00
Yann Collet
b91ddf0ae6 Merge branch 'dev' into longOffsetMode 2018-03-05 11:59:54 -08:00
Yann Collet
b01552a07a force inlining of HUF_decodeSymbol*() functions
which was not done properly by gcc 4.8
resulting in major performance difference.

ex :
zstd -b1 silesia.tar
before : dec 680 MB/s
after  : dec 710 MB/s  (without bmi2)
after  : dec 770 MB/s  (with DYNAMIC_BMI2)
2018-03-01 11:31:45 -08:00
Yann Collet
6cdf690441 minor cleaning of huff0
Update code documentation, and properly names a few "magic constants".
Also, HUF_compress_internal() gets a cleaner way
to determine size of tables inside workspace.
2018-02-26 14:52:23 -08:00
Yann Collet
653383f74a minor nit from Mac XCode 2018-02-22 15:44:26 -08:00
Yann Collet
0fd4df6ed3 Implemented BMI2 functions directly within huf_decompress.c
This makes it easier to edit for maintenance and evolutions
(I plan to experiment modifications in huffman decompression functions).

The methology followed seems broadly applicable to other BMI2 modules.

Performance was tracked rigorously at each step,
there is no noticeable loss (nor win) of performance compared to `#include` version.

Note however that 4X decoder variants tend to be extremely sensitive to code alignment.
This source code resulted in pretty good performance for gcc 7.2 and 7.3,
but future changes (even in other parts of the code) might trigger the issue again.
2018-02-22 10:51:47 -08:00
Nick Terrell
6e128d3534 [BMI2] Add comments to the bmi2 variable in the contexts 2018-02-20 14:12:11 -08:00
Nick Terrell
4319132312 [decompress] Support BMI2 2018-02-13 17:00:15 -08:00
Yann Collet
2524cbd847 added code comment on how to generate default tables
as suggested by @terrelln
2018-02-13 10:02:25 -08:00
Yann Collet
71c07966bb added SEQSYMBOL_TABLE_SIZE()
as suggested by @terrelln's comment
2018-02-12 16:52:15 -08:00
Yann Collet
04a3f85ce7 fixed gcc warning on a switch code path 2018-02-09 16:16:27 -08:00
Yann Collet
af48f0b62b fix : offset table pointer when using default table 2018-02-09 15:15:46 -08:00
Yann Collet
426944c3e3 fixed strict aliasing issue
tuned threshold
2018-02-09 13:24:11 -08:00
Yann Collet
64ee732694 decide long-offset mode based on offcode statistics
threshold vaguely estimated
2018-02-09 12:33:28 -08:00
Yann Collet
6bfe50ad48 re-enabled ZSTD_decompressSequencesLong() 2018-02-09 09:14:25 -08:00
Yann Collet
1850597eaa pre-calculated default decoding tables 2018-02-09 06:01:02 -08:00
Yann Collet
ab75df21ed fixed mono-symbol distribution 2018-02-09 05:12:13 -08:00
Yann Collet
421a2716d8 fixed default fse distributions
but would be better to pre-calculate tables, for speed
2018-02-09 04:50:58 -08:00
Yann Collet
95424409ea addBits and baseline into FSE decoding table
note : unfinished
- need new default tables
- need modify long mode
2018-02-09 04:25:15 -08:00
Yann Collet
0170cf9a7a minor : modified ZSTD_preserveUnsortedMark() to be more vectorization friendly 2018-02-05 11:46:02 -08:00
Yann Collet
94efb1749d faster decoding in 32-bits mode for long offsets (tentative)
On my laptop:
Before:
./zstd32 -b --zstd=wlog=27 silesia.tar enwik8 -S
 3#silesia.tar       : 211984896 ->  66683478 (3.179),  97.6 MB/s , 400.7 MB/s
 3#enwik8            : 100000000 ->  35643153 (2.806),  76.5 MB/s , 303.2 MB/s

After:
./zstd32 -b --zstd=wlog=27 silesia.tar enwik8 -S
 3#silesia.tar       : 211984896 ->  66683478 (3.179),  97.4 MB/s , 435.0 MB/s
 3#enwik8            : 100000000 ->  35643153 (2.806),  76.2 MB/s , 338.1 MB/s

Mileage vary, depending on file, and cpu type.
But a generic rule is : x86 benefits less from "long-offset mode" than x64,
maybe due to register pressure.
On "entropy", long-mode is _never_ a win for x86.
On my laptop though, it may, depending on file and compression level
(enwik8 benefits more from "long-mode" than silesia).
2018-02-04 01:49:31 -08:00
Yann Collet
4f43ef731d Merge branch 'dev' into constCDict 2018-01-18 13:36:43 -08:00
Yann Collet
f3b8f90b6d changed initStatic?Dict() return type to const ZSTD_?Dict*
ZSTD_create?Dict() is required to produce a ?Dict* return type
because `free()` does not accept a `const type*` argument.
If it wasn't for this restriction, I would have preferred to create a `const ?Dict*` object
to emphasize the fact that, once created, a dictionary never changes
(hence can be shared concurrently until the end of its lifetime).

There is no such limitation with initStatic?Dict() :
as stated in the doc, there is no corresponding free() function,
since `workspace` is provided, hence allocated, externally,
it can only be free() externally.

Which means, ZSTD_initStatic?Dict() can return a `const ZSTD_?Dict*` pointer.

Tested with `make all`, to catch initStatic's users,
which, incidentally, also updated zstd.h documentation.
2018-01-17 14:08:48 -08:00
Yann Collet
2e23333094 ZSTDMT can now work in non-blocking mode with 1 thread
it still fallbacks to single-thread blocking invocation
when input is small (<1job)
or when invoking ZSTDMT_compress(), which is blocking.

Also : fixed a bug in new block-granular compression routine.
2018-01-16 15:28:43 -08:00
Yann Collet
f597f55675 improved btlazy2 : list of unsorted candidates can reach extDict
It used to stop on reaching extDict, for simplification.
As a consequence, there was a small loss of performance each time the round buffer would restart from beginning.
It's not a large difference though, just several hundreds of bytes on silesia.
This patch fixes it.
2017-12-30 15:12:59 +01:00
Yann Collet
03832b7aa5 re-added test case
messing with revert ... :(
2017-12-12 14:01:54 -08:00
Yann Collet
8a104fda05 Revert "Created a test case which reliably reproduces bug #944"
This reverts commit 5098d1fbe2.
2017-12-12 12:51:49 -08:00
Yann Collet
5098d1fbe2 Created a test case which reliably reproduces bug #944
in zstreamtest.
2017-12-12 12:48:31 -08:00
Yann Collet
23767e950a fix one UB pointer arithmetic in encoder
Instead of calculating distance between 2 memory objects, which is UB,
we extract the offset from object 1, and transfer it into object 2.
2017-11-17 13:24:51 -08:00
Yann Collet
cdade555ee fixed one UB pointer arithmetic 2017-11-17 11:40:08 -08:00
Nick Terrell
1fc4f593da Allow skippable frames of any size 2017-11-01 13:07:26 -07:00
Yann Collet
7f6a783862 fixed a small error in decodeCorpus
a compressed block must be strictly smaller than its decompressed size.
2017-10-07 15:19:52 -07:00
Yann Collet
54a827fff0 Merge branch 'dev' into newFormats
Fixed conflicts in zstdmt_compress.c
2017-09-27 16:39:40 -07:00
Yann Collet
ea1f50bf73 removed ZSTD_decompressBegin() from ZSTD_initDCtx_internal()
It does not feel "right" from a dependency perspective.
ZSTD_initDCtx_internal() is triggered once, on DCtx creation,
while ZSTD_decompressBegin() is invoked at the beginning of each new frame,
and is also a user-facing prototype.

Downside : a DCtx must be init before first usage !
This was always the intention by the way, and is documented as such.
This stage is automatically done within ZSTD_decompress() and variants,
and also within ZSTD_decompressStream().
Only ZSTD_decompressContinue() is impacted,
it must be preceded by a ZSTD_decompressBegin(), as detailed in doc.

A test has been fixed, to no longer rely on undocumented assumption that ZSTD_decompressBegin() is invoked during init.
2017-09-27 13:51:05 -07:00
Yann Collet
9416195221 changed error code when pos<=size condition is not respected
Now pointing towards src_size or dst_size,
instead of error_GENERIC.
2017-09-27 10:35:56 -07:00
Yann Collet
ca306c1c84 fixed a bug in zstreamtest
decoder output buffer would receive a wrong size.

In previous version, ZSTD_decompressStream() would blindly trust the caller that pos <= size.
In this version, this condition is actively checked,
and the function returns an error code if this condition is not respected.

This check could also be done with an assert(),
but since this is a user-facing interface, it seems better to keep this check at runtime.
2017-09-27 00:39:41 -07:00
Yann Collet
cd53ac831b fixed DCtx initialization error
now relying on initialization of dctx->format first
2017-09-26 18:26:09 -07:00
Yann Collet
c0dd960363 switch assert() position 2017-09-26 15:36:57 -07:00
Yann Collet
319c699991 created ZSTD_startingInputLength()
as suggested by @terrelln
2017-09-26 15:36:14 -07:00
Yann Collet
8d1e97ea9c minor fixes following @terrelln comments 2017-09-26 15:06:30 -07:00
Yann Collet
df4e9bba25 fixed constant errors for gcc in c99 mode
C standard does not consider a `static const int` as a constant.
This is a problem for initializer, and ZSTD_STATIC_ASSERT().
Replaced by macro values
2017-09-26 14:31:06 -07:00
Yann Collet
9f0b8dfbe9 Merge branch 'dev' into newFormats 2017-09-26 14:22:39 -07:00
Nick Terrell
c233bdbaee Increase maximum window size
* Maximum window size in 32-bit mode is 1GB, since allocations for 2GB fail
  on my Mac.
* Maximum window size in 64-bit mode is 2GB, since that is the largest
  power of 2 that works with the overflow prevention.
* Allow `--long=windowLog` to set the window log, along with
  `--zstd=wlog=#`. These options also set the window size during
  decompression, but don't override `--memory=#` if it is set.
* Present a helpful error message when the window size is too large during
  decompression.
* The long range matcher defaults to a hash log 7 less than the window log,
  which keeps it at 20 for window log 27.
* Keep the default long range matcher window size and the default maximum
  window size at 27 for the API and CLI.
* Add tests that use the maximum window size and hash size for compression
  and decompression.
2017-09-26 14:00:01 -07:00
Yann Collet
52a1d1c6dc added ZSTD_DCtx_reset() 2017-09-25 16:56:48 -07:00
Yann Collet
5d8fdd1641 Merge pull request #855 from terrelln/maxoff
[libzstd] Increase MaxOff
2017-09-25 16:34:29 -07:00
Yann Collet
f2a913862c added ZSTD_decompress_generic_simpleArgs() 2017-09-25 15:46:34 -07:00
Yann Collet
6ee05a02b8 added ZSTD_decompress_generic()
same as ZSTD_decompressStream(),
just for a similar feeling as the compression side, which uses ZSTD_compress_generic()
2017-09-25 15:41:48 -07:00
Yann Collet
b8d4a3887f introduced constant ZSTD_frameIdSize
within zstd_internal.h
This is the size of magic number.

Avoids using `4` directly in source code, which is a bit less meaningful.
2017-09-25 15:26:18 -07:00
Yann Collet
044fb4c057 implemented magic-less frame decoder 2017-09-25 15:12:09 -07:00
Yann Collet
62568c9a42 added capability to generate magic-less frames
decoder not implemented yet
2017-09-25 14:26:26 -07:00
Nick Terrell
bbe77212ef [libzstd] Increase MaxOff 2017-09-25 13:36:18 -07:00
Yann Collet
8977224b9b Merge pull request #859 from terrelln/31
Prepare for ZSTD_WINDOWLOG_MAX == 31
2017-09-22 09:01:39 -07:00
Nick Terrell
d6abb28951 Prepare for ZSTD_WINDOWLOG_MAX == 31 2017-09-21 17:18:41 -07:00
Yann Collet
cd3115b284 added control from frame content size at end of decompression
adding check at end of single-pass ZSTD_decompressFrame().
Check within ZSTD_decompressContinue() was already added in a previous patch : b3f33ccfb3
2017-09-21 16:21:10 -07:00
Nick Terrell
18442a31ff [libzstd] Fix bad window size assert
The window size is not validated or used in the one-pass API, so there
shouldn't be an assert based on it.

fix-fuzz-failure
2017-09-19 13:47:59 -07:00
Nick Terrell
5f22479517 [block] Don't use fParams in ZSTD_decompressBlock() 2017-09-15 17:37:20 -07:00
Yann Collet
ce31004f20 fix following suggestions by @terrelln 2017-09-11 13:12:52 -07:00
Yann Collet
b3f33ccfb3 use ZSTD_decodingBufferSize_min() inside ZSTD_decompressStream()
Use same definition as public one
minor : reduce allocated buffer size in some cases
(when frameContentSize is known and == windowSize)
2017-09-09 14:37:28 -07:00
Yann Collet
058ed2ad33 ZSTD_decodingBufferSize_min()
supporting function for bufferless streaming API (ZSTD_decompressContinue())
makes it possible to correctly size a round buffer for decoding using this API.

also : added field blockSizeMax within ZSTD_frameHeader,
as it's a necessary information to know when to restart at beginning of decoding buffer.
2017-09-09 01:03:29 -07:00
Yann Collet
3128e03be6 updated license header
to clarify dual-license meaning as "or"
2017-09-08 00:09:23 -07:00
Yann Collet
8a5c0c98ae restored 32-bits decoder ability to decode long offsets (>32 MB, levels 21+) 2017-09-01 11:56:57 -07:00
Yann Collet
36aa8b5999 improved decoding speed 2017-09-01 11:40:59 -07:00
Yann Collet
3704507774 fixed decompression bug reported by @Etsukata (#828) 2017-09-01 00:05:37 -07:00
Yann Collet
d7ad99b2ab Merge branch 'longRangeMatcher' into dev 2017-08-31 18:08:37 -07:00
Stella Lau
c88fb9267f Replace 'byReference' with enum 2017-08-29 11:55:02 -07:00
Yann Collet
32fb407c9d updated a bunch of headers
for the new license
2017-08-18 16:52:05 -07:00
Yann Collet
f9e6590715 Merge pull request #796 from terrelln/is-error
[FSE][HUF] Inline error checks
2017-08-15 12:37:28 -07:00
Yann Collet
2dbcfc6994 Merge pull request #794 from terrelln/force-inline
[libzstd] Fix FORCE_INLINE macro
2017-08-15 12:03:44 -07:00
Nick Terrell
07c6ff588e [FSE][HUF] Inline error checks
Caught by Clang's optimization remarks.
2017-08-15 11:23:28 -07:00
Nick Terrell
565e925eb7 [libzstd] Fix FORCE_INLINE macro 2017-08-14 21:12:05 -07:00
Roman Gershman
b9d4f4fb74 Fix ZSTD_estimateDStreamSize function after ZSTD_DStream and ZSTD_DCtx were merged 2017-08-13 13:29:42 +03:00
Nick Terrell
9ba97182d1 [CI] Add gcc7build test 2017-08-08 13:28:56 -07:00
Nick Terrell
abe12b3399 [libzstd] Fix bug in Huffman decompresser
The zstd format specification doesn't enforce that Huffman compressed
literals (including the table) have to be smaller than the uncompressed
literals. The compressor will never Huffman compress literals if the
compressed size is larger than the uncompressed size. The decompresser
doesn't accept Huffman compressed literals with 4 streams whose compressed
size is at least as large as the uncompressed size.

* Make the decompresser accept Huffman compressed literals whose size
  increases.
* Add a test case that exposes the bug. The compressed file has to be
  statically generated, since the compressor won't normally produce files
  that expose the bug.
2017-08-07 12:37:48 -07:00
Yann Collet
2bd6440be0 pinned down error code enum values
Note : all error codes are changed by this new version,
but it's expected to be the last change for existing codes.

Codes are now grouped by category, and receive a manually attributed value.
The objective is to guarantee that
error code values will not change in the future
when introducing new codes.
Intentionnal empty spaces and ranges are defined
in order to keep room for potential new codes.
2017-07-13 17:12:16 -07:00
Nick Terrell
de0414b736 [libzstd] Pull CTables into sub-structure 2017-07-12 19:49:19 -07:00
Yann Collet
0f4fc6c20a fixed several conversion warnings 2017-07-07 17:13:12 -07:00
Yann Collet
9bde061a0b fixed minor Visual compilation limitation 2017-07-07 16:14:17 -07:00
Yann Collet
593d517ebf fixed minor cast warning 2017-07-07 16:09:47 -07:00
Yann Collet
ead4dd48f6 new field frameHeader.headerSize 2017-07-07 15:51:24 -07:00
Yann Collet
46396523c0 ZSTD_getFrameHeader : control of windowSize limits is delegated to caller
Extracting frame header is a separate operation.
It's now possible to get frame header, whatever the window size set in it.
2017-07-07 15:32:12 -07:00
Yann Collet
990449b89d new field : ZSTD_frameHeader.frameType
Makes frame type (zstd,skippable) detection more straighforward.
ZSTD_getFrameHeader set frameContentSize=ZSTD_CONTENTSIZE_UNKNOWN to mean "field not present"
2017-07-07 15:21:35 -07:00
Yann Collet
e622330a3b extended frameHeader.windowSize to unsigned long long 2017-07-07 14:19:01 -07:00
Yann Collet
f04deff4fc fixed #718, reported by @GregSlazinski, solution suggested by @mcmilk 2017-07-06 01:42:46 -07:00
Yann Collet
d75c0e71c4 minor code refactoring 2017-07-05 18:10:07 -07:00
Yann Collet
5051dd39ca Merge pull request #743 from facebook/fullbench
compress_generic() automatic optimization opportunities
2017-07-03 21:26:38 -07:00
Nick Terrell
c80fc50a8d [libzstd] Fix memcpy() on potential NULL source
* `ZSTD_decompressStream_generic()` `ip` may be `NULL` for one of the calls
  to `memcpy()`
* Assert the source is not `NULL` for calls to `memcpy()` where I believe
  the source should not be `NULL`.
2017-07-03 12:31:55 -07:00
Yann Collet
2485f88bf8 fixed legacy version init bug 2017-07-01 09:09:34 -07:00
Yann Collet
7f40bb1c39 Merge pull request #742 from stellamplau/stack-space
Reduce stack usage of HUF_readDTableX4 and HUF_readDTableX2
2017-06-30 14:50:23 -07:00
Stella Lau
4c71f59c77 Clarify typedef of rankVal_t and rankValCol_t 2017-06-30 09:52:20 -07:00
Stella Lau
28f711ef95 Rename ALIGN and ALIGN_MASK to HUF_ALIGN and HUF_ALIGN_MASK 2017-06-30 09:38:11 -07:00
Stella Lau
70ad6829e7 Delegate HUF_decompress4X_hufOnly to workspace version 2017-06-29 16:22:32 -07:00
Stella Lau
104c4d57c1 Fix bitshift error 2017-06-29 15:40:49 -07:00
Stella Lau
fedc94de8c Fix pointer casting warning 2017-06-29 13:04:15 -07:00
Stella Lau
c6a5275a28 Fix alignment warnings with pointer casting 2017-06-29 12:39:34 -07:00
Stella Lau
99e315999c Reduce stack usage of HUF_readDTableX4 and HUF_readDTableX2 2017-06-29 11:49:59 -07:00
Yann Collet
33a6639039 fixed ZSTD_refPrefix with Multithread-enabled CCtx 2017-06-28 11:09:43 -07:00
Yann Collet
7d3816183f exposed ZSTD_MAGIC_DICTIONARY in zstd.h
makes it easier to explain ZSTD_dictMode
2017-06-27 13:50:34 -07:00
Yann Collet
c7fb884eea fixed minor conversion warning 2017-06-26 18:02:23 -07:00
Yann Collet
dde10b23fe refactored ZSTD_estimateDStreamSize()
now uses windowSize as argument.
Also : created ZSTD_estimateDStreamSize_fromFrame()
2017-06-26 17:44:26 -07:00