Commit Graph

191 Commits

Author SHA1 Message Date
Sean Purcell
ba2ad9f25c ZSTD_decompress now handles multiple frames 2017-02-08 14:50:10 -08:00
Sean Purcell
4e709712e1 Decompressed size functions now handle multiframes and distinguish cases
- Add ZSTD_findDecompressedSize
    - Traverses multiple frames to find total output size
- Add ZSTD_getFrameContentSize
    - Gets the decompressed size of a single frame by reading header
- Deprecate ZSTD_getDecompressedSize
2017-02-08 14:50:10 -08:00
Yann Collet
bb0027405a fixed zstdmt corruption issue when enabling overlapped sections
see Asana board for detailed explanation on why and how to fix it
2017-01-25 16:25:38 -08:00
Sean Purcell
57d423c5df Don't create dict in streaming apis if dictSize == 0 2017-01-17 14:31:35 -08:00
Sean Purcell
834ab50fa3 Fixed decompress_usingDict not propagating corrupted dictionary error 2017-01-11 17:31:34 -08:00
Yann Collet
aca113f4f5 fixed ZSTD_sizeof_?Dict() 2016-12-23 22:25:03 +01:00
Yann Collet
0819abe3c1 added ZSTD_createDDict_byReference() body 2016-12-21 19:25:15 +01:00
Yann Collet
4e5eea61a8 added ZSTD_createDDict_byReference() 2016-12-21 16:44:35 +01:00
Nick Terrell
8157a4c3cc Fix dictionary loading bug causing an MSAN failure
Offset rep codes must be in the range `[1, dictSize)`.
Fix dictionary loading to reject `0` as a offset rep code.
2016-12-20 10:47:52 -08:00
Yann Collet
35168679bd Merge pull request #478 from terrelln/wildcopy-ub
Fix execSequence wildcopy undefined behavior
2016-12-13 11:33:00 +01:00
Nick Terrell
064a143520 Fix execSequence wildcopy undefined behavior
execSequence relied on pointer overflow to handle cases where
`sequence.matchLength < 8`.  Instead of passing an `size_t` to
wildcopy, pass a `ptrdiff_t`.
2016-12-12 19:01:23 -08:00
Nick Terrell
e474aa55b4 Fix decompression buffer overrun
Allows an adversary to write up to 3 bytes beyond the end of the buffer.
Occurs if the match overlaps the `extDict` and `currentPrefix`, and the
match length in the `currentPrefix` is less than `MINMATCH`, and
`op-(16-MINMATCH) >= oMatchEnd > op-16`.
2016-12-12 18:05:30 -08:00
Yann Collet
825dffbc43 moved zbuff source files into lib/deprecated 2016-12-05 19:28:19 -08:00
Yann Collet
8f8e2b0b4a fixed initialization warning 2016-12-05 18:00:50 -08:00
Yann Collet
e7a41a5955 added : dictID retrieval functions.
added : unit tests for dictID retrieval functions
2016-12-05 16:21:06 -08:00
Yann Collet
9ffbeea875 API : changed : streaming decompression : implicit reset on starting new frames 2016-12-02 18:37:38 -08:00
Yann Collet
2238312c2f fix dict loading 2016-12-02 11:36:11 -08:00
Yann Collet
b89af20353 reduced table sizes for HUF_readDTableX4 2016-12-01 18:24:59 -08:00
Yann Collet
ff504de391 minor decompression speed improvement 2016-11-29 17:42:46 -08:00
Yann Collet
a56ac2815c restored normal decoder speed 2016-11-29 15:30:23 -08:00
Yann Collet
37870d7a66 fixed minor visual warning 2016-11-29 14:31:57 -08:00
Yann Collet
4f5350f610 long matches support overflow 2016-11-29 13:12:24 -08:00
Yann Collet
52e136ed3d long decoder compatible with round and separate buffers 2016-11-28 19:59:11 -08:00
Yann Collet
ce3527ca0c combined normal and long decoder 2016-11-28 18:38:52 -08:00
Yann Collet
8993bee997 restored normal mode 2016-11-28 16:11:30 -08:00
Yann Collet
764e70a4f3 added decodeSequencesLong 2016-11-28 15:50:16 -08:00
Yann Collet
73f88a66f1 added prefetch 2016-11-23 15:43:30 -08:00
Yann Collet
50524bf0da delayed decompression 2016-11-23 15:11:07 -08:00
Nick Terrell
4359d21ad7 Merge two memset() calls into one 2016-11-14 17:52:51 -08:00
Nick Terrell
24701de877 Fix uninitialized memory read 2016-11-14 13:57:05 -08:00
Yann Collet
179b19776f fileio.c does no longer need ZSTD_LEGACY_SUPPORT, and does no longer depend on zstd_legacy.h
Added : ZSTD_isFrame() in experimental section
2016-11-02 17:30:49 -07:00
Yann Collet
31e660e7aa more accurate default maximum window size 2016-10-29 03:56:45 -07:00
Yann Collet
2115724c22 Merge pull request #430 from terrelln/exec-sequences
ZSTD_execSequence() accepts match in last 7 bytes
2016-10-28 10:45:05 -07:00
Nick Terrell
10bfd0c0d5 Fix ZSTD_execSequence() performance regression
Commit ae1cb3b3d0 caused the regression.
It is an instruction alignment issue, because if it is `U64 i` instead
of `U32 i`, the regression returns.  This patch fixes the regression
in gcc, but only gets some of the clang performance back.

Benchmarks:
Run on `silesia.tar`.  I only show levels 1-5 because the performance
regression was uniform across all levels.  I did one run on levels
1-19 and it looked good.

| Build | Level | Before | While | After |
|-------|-------|-------:|------:|------:|
| gcc   |     1 |  931.4 | 904.4 | 932.8 |
| gcc   |     2 |  849.1 | 822.6 | 851.2 |
| gcc   |     3 |  815.6 | 790.6 | 818.9 |
| gcc   |     4 |  794.1 | 770.7 | 798.0 |
| gcc   |     5 |  785.7 | 760.7 | 788.8 |
| clang |     1 |  705.5 | 683.2 | 693.8 |
| clang |     2 |  670.0 | 649.2 | 660.7 |
| clang |     3 |  659.6 | 639.8 | 651.4 |
| clang |     4 |  652.5 | 634.7 | 645.9 |
| clang |     5 |  646.9 | 625.5 | 637.7 |
2016-10-27 16:19:57 -07:00
Nick Terrell
eb7873a048 ZSTD_execSequence() accepts match in last 7 bytes
The zstd reference compressor will not emit a match in the last 7
bytes of a block.  The decompressor will also not accept a match
in the last 7 bytes.  This patch makes the decompressor accept a
match in the last 7 bytes.
2016-10-25 21:24:15 -07:00
Yann Collet
335ad5d4d4 added ZSTD_initDStream_usingDDict() .
slightly optimized ZSTD_initDStream() when no dictionary .
fixed ZSTD_sizeof_CStream() .
2016-10-25 17:47:02 -07:00
Nick Terrell
f698ad6deb Merge remote-tracking branch 'upstream/dev' into fixes
* upstream/dev:
  added doc\zstd_manual.html
  added contrib\gen_html
  zstd_compression_format.md moved to doc/
  Fix small bug in ZSTD_execSequence()
  improved ZSTD_compressBlock_opt_extDict_generic
  protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob
  zstd_opt.h: small improvement in compression ratio
  improved dicitonary segment merge
  use implicit rules to compile zstd_decompress.c
  detect early impossible decompression scenario in legacy decoder v0.5
  no repeat mode in legacy v0.5
  fixed invalid invocation of dictionary in legacy decoder v0.5
  fix edge case
  fix command line interpretation
  fixed minor corner case
  zstd.h: added the Introduction section
  fixed clang 3.5 warnings
  zstd.h: updated comments
2016-10-24 13:10:13 -07:00
Yann Collet
4239a207dd Merge pull request #425 from inikep/dev11
Doc
2016-10-24 11:11:40 -07:00
Przemyslaw Skibinski
3ee94a7600 zstd_compression_format.md moved to doc/ 2016-10-24 15:58:07 +02:00
Yann Collet
97611611a3 Merge pull request #423 from terrelln/exec-seq-patch
Fix small bug in ZSTD_execSequence()
2016-10-21 17:02:06 -07:00
Nick Terrell
ae1cb3b3d0 Fix small bug in ZSTD_execSequence()
`memmove(op, match, sequence.matchLength)` is not the desired behavior.
Overlap is allowed, and handled as if we did `*op++ = *match++`, which
is not how `memmove()` handles overlap.

Only triggered if both of the following conditions are met:
* The match spans extDict & currentPrefixSegment
* `oLitEnd <= oend_w < oLitEnd + length1 < oMatchEnd <= oend`.

These two conditions imply that the block is less than 15 bytes long.
This bug isn't triggered by the streaming API, because it allocates
enough space for the window size + the block size, so there cannot be
a match that is within 8 bytes of the end and overlaps with itself.
It cannot be triggered by the block decompression API because all of
the decompressed data is in the currentPrefixSegment.

Introduced by commit 7158584399
2016-10-21 12:13:44 -07:00
Yann Collet
da3bd8b6de protect ZSTD_decodeFrameHeader() from invalid usage, as suggested by @spaskob 2016-10-20 20:11:00 -07:00
Nick Terrell
bb68062c59 Unitialized memory read in ZSTD_decodeSeqHeaders()
Caused by two things:
1. Not checking that `ip` is in range except for the first byte.
2. `ZSTDv0{5,6}_decodeLiteralsBlock()` could return a value larger than `srcSize`.
2016-10-18 16:41:33 -07:00
Yann Collet
06573e17be fixed minor corner case 2016-10-17 17:28:28 -07:00
Nick Terrell
4db751668f Fix buffer overrun in ZSTD_loadEntropy()
The table log set by `FSE_readNCount()` was not checked in
`ZSTD_loadEntropy()`.  This caused `FSE_buildDTable(dctx->MLTable, ...)`
to overwrite the beginning of `dctx->hufTable`.

The benchmarks look good, there is no obvious performance regression:

  > ./zstds/zstd.opt.0 -i10 -b1 -e5 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 268.2 MB/s , 701.0 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 199.5 MB/s , 666.9 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 154.9 MB/s , 655.6 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 128.9 MB/s , 648.4 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  98.4 MB/s , 633.4 MB/s

  > ./zstds/zstd.opt.2 -i10 -b1 -e5 ~/bench/silesia.tar
   1#silesia.tar       : 211988480 ->  73656930 (2.878), 266.1 MB/s , 703.7 MB/s
   2#silesia.tar       : 211988480 ->  70162842 (3.021), 199.0 MB/s , 666.6 MB/s
   3#silesia.tar       : 211988480 ->  66997986 (3.164), 156.2 MB/s , 656.2 MB/s
   4#silesia.tar       : 211988480 ->  66002591 (3.212), 133.2 MB/s , 647.4 MB/s
   5#silesia.tar       : 211988480 ->  65008480 (3.261),  96.3 MB/s , 633.3 MB/s
2016-10-17 15:51:15 -07:00
Yann Collet
7933434fdf Merge branch 'dev' of github.com:facebook/zstd into dev 2016-10-14 13:32:35 -07:00
Yann Collet
d4cda27b63 new command -M#, to limit memory usage during decompression (#403) 2016-10-14 13:32:20 -07:00
Nick Terrell
3b9cdf9220 Fix ubsan failures (pass NULL to memcpy) 2016-10-12 20:54:42 -07:00
Yann Collet
5d919e7ac3 added ZSTD_error_frameParameter_windowTooLarge (#403) 2016-10-12 17:29:24 -07:00
Nick Terrell
7158584399 Fix ZSTD_execSequence() edge case 2016-10-12 10:05:26 -07:00