Commit Graph

2840 Commits

Author SHA1 Message Date
Mitchell Grenier
a424899637 Fix buck for lib 2018-11-30 13:45:16 +00:00
Yann Collet
d3a0c71259 pushed experimental parameters
into ZSTD_STATIC_LINKING_ONLY section
2018-11-21 16:18:55 -08:00
Yann Collet
d4d4e109e9 getParameter fills an int*
rather than an unsigned*
for consistency
since type of setParameter() changed to int.
2018-11-21 15:37:26 -08:00
Yann Collet
fea920615c promote ZSTD_findFrameCompressedSize() into staging area 2018-11-21 15:25:50 -08:00
Yann Collet
41c7d0b1e1 changed hashEveryLog into hashRateLog 2018-11-21 14:36:57 -08:00
Yann Collet
5d3592398d fixed fall-through 2018-11-20 16:09:33 -08:00
Yann Collet
5c6d4b18ac completed implementation of ZSTD_cParam_getBounds()
for all parameters
2018-11-20 16:06:00 -08:00
Yann Collet
2e7fd6a2cb fixed remaining searchLength invocations 2018-11-20 15:13:27 -08:00
Yann Collet
e874dacc08 changed searchLength into minMatch
refactored all relevant API and calls
for consistency.
2018-11-20 14:56:07 -08:00
Yann Collet
114bd4346e changed enum type name to ZSTD_ResetDirective
for naming consistency :
types should start with a capital letter (after prefix)
2018-11-20 12:00:20 -08:00
Yann Collet
3b838abf97 ZSTD_CCtx_setParameter : value argument is now int
for compatibility with compression level
2018-11-20 11:53:01 -08:00
Yann Collet
19e5f2a35b removed some constants and _simpleArgs() from staging
constants that *may* change in the future
will be accessed through functions instead
(to be created).

_simpleArgs() variant do not have (yet) a clear enough added value
to deserve "stable" status.
2018-11-19 17:38:15 -08:00
Ryan Schmidt
ef4df0df4a Fix i386 build failure "Junk character 13" 2018-11-16 02:16:21 -06:00
Yann Collet
5c68639186 updated ZSTD_DCtx_reset()
signature and behavior is now the same as ZSTD_CCtx_reset()
2018-11-15 16:12:39 -08:00
Yann Collet
06c8d5a4f4 Merge branch 'dev' into advancedAPI
fixed rsyncable
2018-11-15 10:51:24 -08:00
Nick Terrell
b9693d3a49 [lib] Add rsyncable mode
- Add rsyncable mode to multithreaded mode
- Factor out LDM's hash function for reuse
2018-11-14 16:59:57 -08:00
Yann Collet
21a42bf5f9 added advanced decompression api 2018-11-14 16:54:54 -08:00
Yann Collet
cf9f4b63b8 fixed fuzz test src code 2018-11-14 14:46:49 -08:00
Yann Collet
7b0391e37e finalized retrofit of ZSTD_CCtx_reset()
updated all depending sources
2018-11-14 13:05:35 -08:00
Yann Collet
ff8d371708 modified ZSTD_CCtx_reset()
which now accepts an enum,
to distinguish between resetting the session, or the parameters (or both).

removed ZSTD_CCtx_resetParameters(), which is redundant.

start replacing invocation of ZSTD_CCtx_reset*() functions

Updated advanced API documentation

trimmed down amount of API staged in RC,
in particular, all functions related to ZSTD_CCtxParams()
seem too advanced.
2018-11-14 12:33:57 -08:00
Yann Collet
d7e10a774a added constant ZSTD_WINDOWLOG_LIMIT_DEFAULT
answering #1407.

Also : removed obsolete function ZSTD_setDStreamParameter()
which could only be used with one parameter (DStream_p_maxWindowSize).
Now replaced by ZSTD_DCtx_setWindowSize() (which exists since a few revisions)
2018-11-13 18:12:34 -08:00
Yann Collet
2c8fde538f added constant ZSTD_MAGIC_SKIPPABLE_MASK
and updated several API comments
2018-11-13 17:36:35 -08:00
Yann Collet
b83d1e7714 removed some static const variables
and replaced by traditional macro constants.

Unfortunately, C doesn't consider `static const` to mean "constant"
2018-11-13 16:56:32 -08:00
Yann Collet
768a264200 Merge branch 'dev' of github.com:facebook/zstd into dev 2018-11-13 15:56:36 -08:00
Yann Collet
092c4abd4c bumped version number to v1.3.8 2018-11-13 15:53:38 -08:00
Yann Collet
f28af025d9
Merge pull request #1413 from felixhandte/attach-dict-fix-unsigned-compare
Fix #1412: Perform Signed Comparison When Setting Attach Dict Param
2018-11-12 17:53:11 -08:00
Yann Collet
626040ab53 changed PREFETCH() macro into PREFETCH_L2()
which is more accurate
2018-11-12 17:05:32 -08:00
W. Felix Handte
5faef4d378 Const 2018-11-12 14:48:42 -08:00
W. Felix Handte
2d9332eb21 Fix Types 2018-11-12 12:52:31 -08:00
W. Felix Handte
4127de5fa6 Switch Enum to Only Non-Negative Values, Update Comments 2018-11-12 12:47:47 -08:00
W. Felix Handte
596f7d1256 Fix #1412: Perform Signed Comparison When Setting Attach Dict Param 2018-11-12 12:07:57 -08:00
Yann Collet
7b0c551bff
Merge pull request #1411 from facebook/prefetch_dict
Improves decompression speed when using cold dictionary
2018-11-09 11:31:35 -08:00
Yann Collet
1b4a9c518b
Merge pull request #1410 from facebook/prefetch_dec
improve long-range decoder speed
2018-11-08 18:41:58 -08:00
Yann Collet
483759a3de Improves decompression speed when using cold dictionary
by triggering the prefetching decoder path
(which used to be dedicated to long-range offsets only).

Figures on my laptop :
no content prefetch : ~300 MB/s (for reference)
full content prefetch : ~325 MB/s (before this patch)
new prefetch path : ~375 MB/s (after this patch)

The benchmark speed is already significant,
but another side-effect is that this version
prefetch less data into memory,
since it only prefetches what's needed, instead of the full dictionary.

This is supposed to help highly active environments
such as active databases,
that can't be properly measured in benchmark environment (too clean).

Also :
fixed the largeNbDict test program
which was working improperly when setting nbBlocks > nbFiles.
2018-11-08 17:00:23 -08:00
Yann Collet
20fb9e7f36 reduced assertion strength
one limit case can apparently be generated during fuzzer tests
2018-11-08 12:57:34 -08:00
Yann Collet
9126da5b5c improve long-range decoder speed
on enwik9 at level 22 (which is almost a worst case scenario),
speed improves by +7% on my laptop (415 -> 445 MB/s)
2018-11-08 12:47:46 -08:00
Yann Collet
8bed4012bd fixed decompression-only benchmark 2018-11-08 12:36:39 -08:00
Nick Terrell
a8daa2d683 Signal before unlocking in pool.c 2018-11-08 10:45:53 -08:00
Bartosz Szreder
5c5c476338 Prevent deadlock on malloc() failure. 2018-11-08 10:29:31 +01:00
Yann Collet
e0701d3c5d
Merge pull request #1404 from facebook/T36302429
fixed T36302471
2018-11-06 11:53:20 -08:00
Yann Collet
3e5cdf1b6a fixed T36302429 2018-11-05 17:50:30 -08:00
Yann Collet
2caa995558 just add an assert() in ZSTD_insertBtAndGetAllMatches()
to express a condition on ll0 .
May help static analyzer as in #1397
2018-11-05 17:13:32 -08:00
Yann Collet
3a90229616
Merge pull request #1395 from facebook/decompressblock
created zstd_decompress_block module
2018-10-29 16:28:09 -07:00
Yann Collet
acd75a1448 fixed a second memset() on NULL
not sure why it only triggers now,
this code has been around for a while.

Introduced a new error code : dstBuffer_null,
I couldn't express anything even remotely similar with existing error codes set.
2018-10-29 15:03:57 -07:00
Yann Collet
9c58098200 fixed memcpy() on NULL warning
memcpy(NULL, src, 0) is undefined behavior.
2018-10-29 13:57:37 -07:00
Yann Collet
ea966c8fb1
Merge pull request #1396 from facebook/huf_refactor
refactor HUF_compress_internal for clarity
2018-10-29 13:06:45 -07:00
Yann Collet
fc20b3c441 added flag -Wc++-compat
for library and cli
2018-10-26 16:38:23 -07:00
Yann Collet
c3c7deb1e1
Merge pull request #1392 from coetry/dev
provide consistent spacing to enum field
2018-10-26 15:29:07 -07:00
Yann Collet
1866bd374a Merge branch 'dev' into huf_Refactor 2018-10-26 15:25:01 -07:00
Yann Collet
8d56f4baee added a few comments for clarifications 2018-10-26 15:21:52 -07:00
Yann Collet
450356b5af Merge branch 'dev' into decompressblock 2018-10-26 15:03:43 -07:00
Yann Collet
7d4960a5e8
Merge pull request #1390 from facebook/nullAsOutput
support decompressing an empty frame into NULL
2018-10-26 14:43:16 -07:00
Allen Hai
3783720f70 vertically align code comment 2018-10-26 16:16:06 -05:00
Yann Collet
7b74405150 refactor HUF_compress_internal for clarity
changed workspace parameter convention
to always provide workspaceSize,
so that size can be explicitly checked.

Also, use more enum to make the meaning of some parameters more explicit.
2018-10-26 13:21:37 -07:00
Allen Hai
26e34d8a73 provide consistent spacing to enum field 2018-10-25 18:45:20 -05:00
Yann Collet
2b4914082e created zstd_decompress_block module
isolate all logic associated with block decompression
into its own module.

zstd_decompress is still in charge
of context creation/destruction,
frames, headers, streaming, special blocks, etc.

Compressed blocks themselves are now handled within zstd_decompress_block .
2018-10-25 16:28:41 -07:00
Yann Collet
cb320a9fc0 added comment on public ddict functions 2018-10-24 16:50:03 -07:00
Yann Collet
806a5c84e4 support decompressing an empty frame into NULL
fix #1385
decompressing into NULL was an automatic error.
It is now allowed, as long as the content of the frame is empty.

Seems to simplify things for `arrow`.
Maybe some other projects rely on this behavior ?
2018-10-24 16:34:35 -07:00
Yann Collet
debff3929b fixed warnings in testpools 2018-10-24 10:36:06 -07:00
Yann Collet
cc3612e1c5 added simple guard macros
in case of accidental multi-includes
2018-10-23 17:55:23 -07:00
Yann Collet
ccd2d426fc separate DDict logic into its own module
created zstd_ddict.c within lib/decompress
2018-10-23 17:25:49 -07:00
Yann Collet
f181799082 fix decodecorpus incorrect frame generation
fix #1379
decodecorpus was generating one extraneous byte when `nbSeq==0`.
This is disallowed by the specification.

The reference decoder was just skipping the extraneous byte.
It is now stricter, and flag such situation as an error.
2018-10-20 18:56:21 -07:00
Yann Collet
1e6208e75e bumped version number to v1.3.7
updated documentation
2018-10-11 14:40:12 -07:00
Ori Livneh
f31715f5e0 Enable use of bswap intrinsics in clang
Necessary because clang disguises itself as an older (__GNUC_MINOR__ = 2) GCC.
2018-10-11 15:01:09 -04:00
Yann Collet
6ed3b526e4 restored bitMask for shift values
since corrupted bitstreams can generate too large values.

This slightly reduces the benefits from clang on my laptop.
gcc results and code generation are not affected.
2018-10-10 18:29:50 -07:00
Yann Collet
c012e9540a removed one assert()
that can be triggered by a corrupted bitstream.
2018-10-10 17:33:04 -07:00
Yann Collet
7791f192ee removed one assert()
which can be triggered when input is corrupted.
2018-10-10 16:39:15 -07:00
Yann Collet
d3ec23313d improved decompression speed
while reviewing #1364,
I found a decompression speed improvement.

On my laptop, the new code decompresses +5-6% faster on clang
and +2-3% faster on gcc.

not bad for an accidental optimization...
2018-10-10 15:48:43 -07:00
Yann Collet
942df522cc
Merge pull request #1361 from facebook/streamdoc
Clarify streaming api doc
2018-10-08 19:19:34 -07:00
W. Felix Handte
b8235be865 Avoid Searching Dictionary in ZSTD_btlazy2 When an Optimal Match is Found
Bailing here is important to avoid reading past the end of the input buffer.
2018-10-08 15:59:32 -07:00
W. Felix Handte
d121b3451c Clean Up Debug Log Statements 2018-10-08 15:59:32 -07:00
W. Felix Handte
08da9ad316 Remove Unused Variable 2018-10-08 15:59:32 -07:00
Yann Collet
8fc79fac07 clarify streaming api doc
as suggested by @indygreg in #1360
2018-10-08 15:53:29 -07:00
Yann Collet
11cd2ea43d finalized minor warnings on Haiku 2018-10-03 16:37:50 -07:00
Yann Collet
bc93b801f0
Merge pull request #1330 from korli/haiku
Enable building zstd on Haiku.
2018-10-03 13:36:00 -07:00
Jerome Duval
87c10e2f58 Enable building zstd on Haiku. 2018-10-03 09:51:56 +02:00
Yann Collet
22ddf3523a fixed msan warning
on btlazy2 strategy with dictAttach
2018-10-02 18:20:20 -07:00
Yann Collet
c9843ec232
Merge pull request #1348 from facebook/donotdelete
Fix #1082
2018-10-02 16:37:58 -07:00
Yann Collet
3ca6261223 fixed static analyzer warnings
note : for some reason,
scan-build version on my laptop found problems within fastcover.c
that scan-build on travisCI does not flag.

They are, as usual, false positive :
the analyzer does not understand that a table (`offset`) is correctly filled before usage.
2018-10-02 15:59:11 -07:00
Yann Collet
228c6e5147
Merge pull request #1317 from felixhandte/split-logs
Independent Dictionary and Working Context Table Logs
2018-10-01 17:20:12 -07:00
W. Felix Handte
5b296869df Revert Ability to Set HashLog and ChainLog on Context When Dict is Attached
This capability is not needed / used in the current unit of work. I'll
re-introduce it later, when we start allowing users to override the deduced
working context logs.
2018-10-01 13:28:13 -07:00
W. Felix Handte
c2369fedc4 Restore Passing CParams to ZSTD_insertAndFindFirstIndex_internal 2018-09-28 17:12:54 -07:00
W. Felix Handte
bad74c4781 Use Working Ctx Logs when not in DMS Mode
We pre-hash the ptr for the dict match state sometimes. When that actually
happens, a hashlog of 0 can produce undefined behavior (right shift a long
long by 64). Only applies to unoptimized compilations, since when
optimizations are applied, those hash operations are dropped when we're not
actually in dms mode.
2018-09-28 17:12:54 -07:00
W. Felix Handte
c38acff94f When Attaching Dictionary, Size Working Tables Based on Input Size Only 2018-09-28 17:12:54 -07:00
W. Felix Handte
9d87d50878 Remove Log Overriding for the Time Being 2018-09-28 17:12:54 -07:00
W. Felix Handte
77fd17d93f Remove Strategy-Dependency in Making Attachment Decision 2018-09-28 17:12:54 -07:00
W. Felix Handte
00c088b32d Support Split Logs in ZSTD_btopt..ZSTD_btultra 2018-09-28 17:12:54 -07:00
W. Felix Handte
0783492178 Bump Split Log Support to ZSTD_btultra 2018-09-28 17:12:54 -07:00
W. Felix Handte
e4ac4a0f16 Support Split Logs in ZSTD_greedy..ZSTD_btlazy2 2018-09-28 17:12:54 -07:00
W. Felix Handte
e710dc3369 Bump Split Log Support to ZSTD_btlazy2 2018-09-28 17:12:54 -07:00
W. Felix Handte
22fcb8d4c7 Support Split Logs in ZSTD_dfast 2018-09-28 17:12:54 -07:00
W. Felix Handte
a232b3bb7c Bump Split Log Support to ZSTD_dfast 2018-09-28 17:12:54 -07:00
W. Felix Handte
fe96e98f81 Support a Separate Hash Log in ZSTD_fast 2018-09-28 17:12:54 -07:00
W. Felix Handte
bc880ebe8f Stop Passing in hashLog and stepSize to ZSTD_compressBlock_fast_generic 2018-09-28 17:12:54 -07:00
W. Felix Handte
b3107c7799 Temporary Commit to Retain Requested Hash and Chain Logs During Dict Attach 2018-09-28 17:12:54 -07:00
W. Felix Handte
34e0193129 Allow Setting Hash and Chain Logs on Contexts with Attached CDict 2018-09-28 17:12:54 -07:00
W. Felix Handte
eae8232f50 For Supported Strategies, Attach Dict Even When Params Don't Match 2018-09-28 17:12:54 -07:00
W. Felix Handte
01ff945eae Split Attach and Copy Reset Strategies into Separate Implementation Functions 2018-09-28 17:12:54 -07:00
W. Felix Handte
a6d6bbeae1 Pull Attachment Decision into Separate Function 2018-09-28 17:12:54 -07:00
W. Felix Handte
b7fba599ae And Then Avoid the Unused Parameter Warning 2018-09-28 17:12:54 -07:00
W. Felix Handte
1f188ae655 Move Asserts into Function to Avoid Unused Function Warning 2018-09-28 17:12:54 -07:00
W. Felix Handte
7212b5e5c2 Move Match State CParams Setting into resetCCtx and continueCCtx 2018-09-28 17:12:54 -07:00
W. Felix Handte
01e34d365b Strengthen Assertion to Assert Equality 2018-09-28 17:12:53 -07:00
W. Felix Handte
50cc1cf4d5 Remove CParams Arg from ZSTD_ldm_blockCompress 2018-09-28 17:12:53 -07:00
W. Felix Handte
14764de49f Stop Separately Passing CParams in ZSTD_lazy Internal Functions 2018-09-28 17:12:53 -07:00
W. Felix Handte
97149f22c3 Stop Separately Passing CParams in ZSTD_opt Internal Functions 2018-09-28 17:10:42 -07:00
W. Felix Handte
dcdf437fed Also Remove CParams from Table Filling Functions' Args 2018-09-28 17:10:42 -07:00
W. Felix Handte
3483f89101 Also Assert Equivalency When Filling MatchState with Prefix 2018-09-28 17:10:42 -07:00
W. Felix Handte
6cb2454646 Remove CParams from Block Compressor Functions' Args 2018-09-28 17:10:42 -07:00
W. Felix Handte
03103269de Assert ctx and ms cparams Equivalency 2018-09-28 17:10:42 -07:00
W. Felix Handte
4e3ecee9ed Remove cParams from CDict 2018-09-28 17:10:42 -07:00
W. Felix Handte
76ef87ed9d Add ZSTD_compressionParameters to ZSTD_matchState_t 2018-09-28 17:10:42 -07:00
Nick Terrell
6391cd1030 [zstd] Fix newly added test case 2018-09-28 12:09:28 -07:00
Yann Collet
73773c6b6a fixed legacy compilation tests
for some reason, these tests started failing recently on CircleCI
2018-09-27 18:15:14 -07:00
Nick Terrell
a180ea07c4 Restore ZSTD_noCompressBlock() for clarity 2018-09-27 16:06:02 -07:00
Nick Terrell
aec1a3ec58 Change byte to value to avoid a GRUB typedef 2018-09-27 15:24:48 -07:00
Nick Terrell
109bd37474 Include stddef.h for size_t 2018-09-27 15:24:48 -07:00
Nick Terrell
f2d6db45cd [zstd] Add -Wmissing-prototypes 2018-09-27 15:24:48 -07:00
Yann Collet
2a5cd8535a
Merge pull request #1342 from facebook/fixcatyd
fix : huge (>4GB) chain of blocks
2018-09-27 10:20:14 -07:00
Yann Collet
404a7bfed0 moved again overflow correction
cannot work from within ZSTD_compressBlock()
2018-09-26 18:06:53 -07:00
Yann Collet
0e2dbac18a changed overflow correction place
keep one in compress_frameChunk(),
so that it's tested at every loop
in case some user simply some large mulit-GB input in a single invocation.

Add one in ZSTD_compressBlock(),
since compressBlock() explicitly skips frameChunk().
2018-09-26 15:35:38 -07:00
Yann Collet
e74eade251
Merge pull request #1339 from facebook/grep_colors
fixed usage of grep in Makefile
2018-09-26 14:39:20 -07:00
Yann Collet
8883af6a1e
Merge pull request #1327 from facebook/adapt
Adaptive compression
2018-09-26 14:39:08 -07:00
Yann Collet
f98c69d77c fix : huge (>4GB) stream of blocks
experimental function ZSTD_compressBlock() is designed for very small data in mind,
for situation where saving the ~12 bytes of frame header can actually make a difference.

Some systems though may have to deal with small and large data entangled.
If it's larger than a block (> 128KB), compressBlock() cannot compress them in one round.

That's why it's possible to compress in multiple rounds.
This is a chain of compressed blocks.

Some users push this capability to the limit, encoding gigantic chain of blocks.
On crossing the 4GB limit, some internal overflow occurs.

This fix moves the overflow correction mechanism higher in the call chain,
so that it's applied also to gigantic chains of blocks.

Added a test case in fuzzer.c, which crashes before the fix, and pass now.
2018-09-26 14:24:28 -07:00
Yann Collet
8ff17a6a09
Merge pull request #1329 from facebook/v04isout
Changed default legacy support to v0.5+
2018-09-26 13:39:05 -07:00
Yann Collet
08f68d83c5 fixed usage of grep in Makefile
when terminal uses colors
as suggested by @danielshir (#1294)
2018-09-25 16:56:53 -07:00
Yann Collet
04f47bbdd2 Merge branch 'dev' into adapt 2018-09-24 16:56:45 -07:00
Yann Collet
9bb6c15f79
Merge pull request #1332 from facebook/minclevel
defined a minimum negative level
2018-09-24 16:01:13 -07:00
Yann Collet
292d8e4a83 added some tests based on limits.h
in order to ensure proper type mapping
when not using stdint.h
2018-09-23 23:57:30 -07:00
Yann Collet
71a5210617 avoid recompiling dll every time under mingw 2018-09-21 17:40:30 -07:00
Yann Collet
c484345a82 Merge branch 'mingw' into adapt 2018-09-21 16:00:46 -07:00
Yann Collet
bfff4f4809 ensure all writes to job->cSize are mutex protected
even when reporting errors,
using a macro for code brevity, as suggested by @terrelln,
2018-09-21 16:00:39 -07:00
Yann Collet
32b7cf1bcf fixed tautological tests
involving ZSTD_TARGETLENGTH_MIN (== 0)
2018-09-21 15:04:43 -07:00
Yann Collet
c044345f8f Merge branch 'mingw' into minclevel 2018-09-21 14:56:57 -07:00
Yann Collet
de6c75e4e5
Merge pull request #1318 from felixhandte/shadow-dict-matches
Don't Search Dictionary Context When Working Context Search Resulted in Mismatch
2018-09-21 12:15:33 -07:00
Yann Collet
a54c86cfc6 defined a minimum negative level
which can be probed using new function ZSTD_minCLevel().

Also : redefined ZSTD_TARGETLENGTH_MIN/MAX for consistency

used the opportunity to bump version number to v1.3.6
2018-09-20 16:52:03 -07:00
Yann Collet
b2939163e1 Changed default legacy support to v0.5+
thus dropping read support for v0.4.

It's always possible to re-enable it, by changing build macro ZSTD_LEGACY_SUPPORT to 4.
2018-09-20 14:30:20 -07:00
Yann Collet
7992942d66 fixed complex tsan issue
when job->consumed == job->src.size , compression job is presumed completed,
so it must be the very last action done in worker thread.
2018-09-20 13:47:31 -07:00
Yann Collet
6b07a66aec fixed minor reporting discrepancy in MT mode 2018-09-19 16:30:55 -07:00
Yann Collet
ca02ebee07 removed static variables
so that --adapt can work on multiple input files too
2018-09-19 15:25:50 -07:00
Yann Collet
89bc309d90 error out when --adapt is associated with --single-thread
since they are not compatible
2018-09-19 14:49:13 -07:00
Yann Collet
2f78228f65 Merge branch 'dev' into adapt 2018-09-19 12:43:42 -07:00
Yann Collet
005f000aed updated documentation of *refPrefix()
indicating the equivalence with `diff` operation.
2018-09-18 13:07:08 -07:00
ko-zu
18b4a1da61 Fix clang build
Fix dixygen comment
Fix clang binary path
2018-09-16 10:27:02 +09:00
Yann Collet
7269fe6cd3 minor code comment update 2018-09-14 16:06:35 -07:00
Yann Collet
0403148315
Merge pull request #1295 from felixhandte/hdr-intro-comment-negative-lvls
Proposed Update to Zstd.h Introduction Comment
2018-09-14 15:29:19 -07:00
W. Felix Handte
b76c888497 ZSTD_dfast: Don't Search Dict Context When Mismatch Was Found 2018-09-14 15:24:25 -07:00
W. Felix Handte
b048af5999 ZSTD_fast: Don't Search Dict Context When Mismatch Was Found 2018-09-14 15:23:35 -07:00
Yann Collet
0e5b447aaa
Merge pull request #1316 from facebook/coldDict
Cold dictionary mitigation
2018-09-14 10:37:46 -07:00
Yann Collet
5512400677 updated code comments, based on @terrelln review 2018-09-13 16:44:04 -07:00
Yann Collet
d195eec97e fixed msan error
cold dictionary is detected through a comparison with dictEnd,
which was not initialized at the beginning of first DCtx usage.
2018-09-13 12:29:52 -07:00
Yann Collet
674dd21bd0 final parameter tuning 2018-09-12 17:25:34 -07:00
Yann Collet
419dfd4ea3 clean traces 2018-09-12 16:40:28 -07:00
Yann Collet
2618253da2 fixed PREFETCH() macro
for corner cases and platforms without this instruction
2018-09-12 16:15:37 -07:00
Yann Collet
44d3b83bb1 conditional dict content prefetching
based on nbSeq.
2018-09-12 15:35:21 -07:00
Yann Collet
5fb5ed3b31 adjust heuristic decisions 2018-09-12 12:32:09 -07:00
Nick Terrell
f6daddf2db Also allow x86 2018-09-12 12:05:32 -07:00
Nick Terrell
1e0bac6a9c [libzstd] Fix cpu for MSFT ARM
The `__cpuid()` and `__cpuidex()` intrinsics are only available
on x86 and x86_64.
2018-09-12 10:35:16 -07:00
Yann Collet
4de344d505 added conditional prefetch
depending on amount of work to do.
2018-09-12 10:29:47 -07:00
Yann Collet
63a519dbf6 implemented first prefetch
based on dictID.
dictContent is prefetched up to 32 KB
(no contentSize adaptation)
2018-09-11 17:23:44 -07:00
Yann Collet
3675ef4762 added comment about minimum size of FSE tables
required for DDict creation,
which use this space as workspace during Hufman table building stage.
2018-09-10 11:24:17 -07:00
Yann Collet
f97ca36eab strengthened conditions for using workplace into fse table space
ensure that the structure layout is as expected.
will trigger an error if it changes in the future.

Another solution would be to use a union,
this would be cleaner and get rid of these static asserts.

However, in order to keep the current code unmodified,
it would be necessary to use an un-named unions.
And apparently, un-named unions are only possible on "recent" compilers (C99+).
2018-09-06 17:54:13 -07:00
Yann Collet
87406548f0 reduced DDict size, by -2KB
corresponding to the removal of workspace
which is needed while building huffman table
and is now either present in DCtx,
or temporarily borrowed from available FSE table space.
2018-09-06 17:07:53 -07:00
Yann Collet
50b216146f
Merge pull request #1304 from facebook/largeNbDicts
contrib/largeNbDicts
2018-09-06 09:50:56 -07:00
Jennifer Liu
21721b75a3 Change default f to 20 2018-09-04 17:15:14 -07:00
Jennifer Liu
944c9986e0 Update comment on default steps of cover and fastcover 2018-08-30 15:37:29 -07:00
Jennifer Liu
16db0337b1 Always use splitPoint=1.0 for non-optimize cover and fastcover 2018-08-30 14:59:22 -07:00
Yann Collet
31ebb26945
Merge pull request #1301 from terrelln/lit-size
[zstd] Fix seqStore growth
2018-08-28 17:10:25 -07:00
Nick Terrell
5e580de6da [zstd] Fix seqStore growth
We could undersize the literals buffer by up to 11 bytes,
due to a combination of 2 bugs:
* The literals buffer didn't have `WILDCOPY_OVERLENGTH` extra
  space, like it is supposed to.
* We didn't check the literals buffer size in `ZSTD_sufficientBuff()`.
2018-08-28 13:24:44 -07:00
Yann Collet
b37a0a6bde
Merge pull request #1298 from facebook/bench
Refactored bench.c
2018-08-28 12:25:02 -07:00
modbw
d14edf259f
Fixed memory leak detected by cppcheck
cppcheck (which is run regularly in our CI environment)  detected a possible memory leak.
2018-08-28 07:25:05 +02:00
Yann Collet
6782725155 first sketch for largeNbDicts test program 2018-08-26 19:29:12 -07:00
Yann Collet
af23d39eb8
Merge pull request #1297 from felixhandte/check-offset-table
Fix Missing Offset Table Check
2018-08-24 17:36:44 -07:00
W. Felix Handte
37f17ee237 Mark Repeated Offset Table as Needing Check 2018-08-24 14:33:34 -07:00
Nick Terrell
e34e917655 Fix compiler warning 2018-08-23 17:48:06 -07:00
Nick Terrell
5ee5e71be3 [zstd] Add note about empty ZSTD_CDict 2018-08-23 17:48:06 -07:00
Nick Terrell
924944e471 [zstd] Reuse the ZSTD_CCtx more often with small data. 2018-08-23 17:48:06 -07:00
Yann Collet
2e45badff4 refactored bench.c
for clarity and safety, especially at interface level
2018-08-23 14:21:18 -07:00
Jennifer Liu
9d6ed9def3 Merge fastCover into DictBuilder (#1274)
* Minor fix

* Run non-optimize FASTCOVER 5 times in benchmark

* Merge fastCover into dictBuilder

* Fix mixed declaration issue

* Add fastcover to symbol.c

* Add fastCover.c and cover.h to build

* Change fastCover.c to fastcover.c

* Update benchmark to run FASTCOVER in dictBuilder

* Undo spliting fastcover_param into cover_param and f

* Remove convert param functions

* Assign f to parameter

* Add zdict.h to Makefile in lib

* Add cover.h to BUCK

* Cast 1 to U64 before shifting

* Remove trimming of zero freq head and tail in selectSegment and rebenchmark

* Remove f as a separate parameter of tryParam

* Read 8 bytes when d is 6

* Add trimming off zero frequency head and tail

* Use best functions from COVER and remove trimming part(which leads to worse compression ratio after previous bugs were fixed)

* Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary

* Change nbDmer to always read 8 bytes even when d=6

* Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER

* Update comments and benchmarking result

* Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover

* Add dictType enum and fix bug about passing zParam when converting to coverParam

* Combine finalize and skip into a single parameter

* Update acceleration parameters and benchmark on 3 sample sets

* Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets

* Initialize variables outside of for loop in benchmark.c

* Update benchmark result for hg-manifest

* Remove cover.h from install-includes

* Add explanation of f

* Set default compression level for trainFromBuffer to 3

* Add assertion of fastCoverParams in DiB_trainFromFiles

* Add checkTotalCompressedSize function + some minor fixes

* Add test for multithreading fastCovr

* Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unnlock to end of COVER_best_finish

* Free segmentFreqs

* Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment

* Add FASTCOVER_MEMMULT

* Minor fix

* Update benchmarking result
2018-08-23 12:06:20 -07:00
W. Felix Handte
e589ac6276 Reformat Introduction Comment and Mention Negative Levels 2018-08-22 17:07:34 -07:00
Yann Collet
c71c4f23d7 fix "unused parameter" in single-thread mode
within newly added ZSD_toFlushNow()
2018-08-20 11:40:10 -07:00
Yann Collet
105677c6db created ZSTDMT_toFlushNow()
tells in a non-blocking way if there is something ready to flush right now.
only works with multi-threading for the time being.

Useful to know if flush speed will be limited by lack of production.
2018-08-17 18:11:54 -07:00
Yann Collet
36d6165a2d Makefile: added variable SCANBUILD
so that a different version of scan-build can be selected
2018-08-16 16:44:13 -07:00
Yann Collet
1515f0bb0d fixed more issues detected by recent version of scan-build
test run on Linux
2018-08-16 15:20:25 -07:00
Yann Collet
5291d9ac31 fix scope of scan-build tests
exclude zlib code
2018-08-15 17:41:44 -07:00
Yann Collet
42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
Yann Collet
3692c31598 Merge branch 'dev' into scanbuild 2018-08-15 13:50:49 -07:00
Yann Collet
6e66bbf5dd fixed several minor issues detected by scan-build
only notable one :
writeNCount() resists better vs invalid distributions
(though it should never happen within zstd anyway)
2018-08-14 16:55:35 -07:00
Yann Collet
3e4617ef54 frameProgression reports nbActiveWorkers and output flushed 2018-08-14 11:49:25 -07:00
Yann Collet
e7a49c6683 introduced command --adapt 2018-08-11 20:48:06 -07:00
Yann Collet
2dd76037be zstd cli can increase level when input is too slow 2018-08-09 15:51:30 -07:00
Yann Collet
79a35ac20d minor code comments improvements 2018-08-09 15:16:31 -07:00
W. Felix Handte
2ca7c69167 Fix CDict Attachment to Handle CDicts with Non-Zero Starts
CDicts were previously guaranteed to be generated with `lowLimit=dictLimit=0`.
This is no longer true, and so the old length and index calculations are no
longer valid. This diff fixes them to handle non-zero start indices in CDicts.
2018-08-07 18:14:14 -07:00
Yann Collet
5808027abf Merge branch 'dev' into fix1241 2018-08-03 16:08:33 -07:00
Yann Collet
5892dd5da4
Merge pull request #1255 from terrelln/norm-fix
[FSE] Fix division by zero
2018-08-02 11:48:56 -07:00
Nick Terrell
dc5a67cb7b Disallow tableLog == srcLog 2018-08-02 11:12:17 -07:00
Jennifer Liu
f5228f2c44 Refactoring 2018-07-31 13:58:54 -07:00
Jennifer Liu
4e29bc2469 Use CDict instead of CCtx in analyzeEntropy 2018-07-31 10:36:45 -07:00
cyan4973
3f535007e4 fix %zu support under minGW
and relevant test on Appveyor
2018-07-30 16:56:18 +02:00
cyan4973
aade1e5904 Merge branch 'dev' into fix1241 2018-07-30 16:30:35 +02:00
Nick Terrell
9889bca530 [FSE] Fix division by zero
When the primary normalization method fails, and
`(1 << tableLog) == (maxSymbolValue + 1)`, and every symbol gets assigned
normalized weight 1 or -1 in the first loop, then the next division can
raise `SIGFPE`.
2018-07-27 17:30:03 -07:00
Yann Collet
6e490a2f09
Merge pull request #1237 from terrelln/init-cstream-adv
Set requestedParams in ZSTD_initCStream*()
2018-07-18 16:33:30 +02:00
cyan4973
9597b438e9 fix #1241
Ensure that first input position is valid for a match
even during first usage of context
by starting reference at 1
(avoiding the problematic 0).
2018-07-17 18:52:57 +02:00
cyan4973
53e1f0504e zstdmt debug traces compatibles with mingw
since mingw does not have `sys/times.h`,
remove this path when detecting mingw compilation.
2018-07-17 14:39:44 +02:00
Nick Terrell
45821fac0c
Merge pull request #1225 from jennifermliu/dev
Split samples when building dictionary for COVER
2018-07-13 13:26:15 -07:00
Nick Terrell
6d222c437c Set requestedParams in ZSTD_initCStream*()
The correct parameters are used once, but once `ZSTD_resetCStream()` is
called the default parameters (level 3) are used. Fix this by setting
`requestedParams` in the `ZSTD_initCStream*()` functions.

The added tests both fail before this patch and pass after.
2018-07-12 18:35:55 -07:00
Jennifer Liu
612b346ed5 Add explanation for split=100 2018-07-11 15:50:28 -07:00
Jennifer Liu
5021441d86 Change default splitPoint to 100 2018-07-10 11:19:33 -07:00
Jennifer Liu
456f290e31 Change back to splitPoint<=0 2018-07-09 13:53:25 -07:00
Jennifer Liu
7efabb2cf6 Only make 0.0 default splitPoint 2018-07-09 12:26:53 -07:00
Yann Collet
bbd78df59b add build macro NO_PREFETCH
prevent usage of prefetch intrinsic commands
which are not supported by c2rust
(see https://github.com/immunant/c2rust/issues/13)
2018-07-06 17:06:04 -07:00
Jennifer Liu
015a00af0f Change cover_sum back to 2 parameters and fix splitPoint issues 2018-07-06 14:24:18 -07:00
Jennifer Liu
0bbff01211 Fix testing parameter 2018-07-05 22:40:32 -07:00
Jennifer Liu
a085d1aae1 Allow splitPoint==1.0 (using all samples for both training and testing) 2018-07-05 10:38:45 -07:00
Jennifer Liu
0881184c89 Some edits based on pull request comments 2018-07-03 17:53:27 -07:00
Jennifer Liu
16e75e8804 Update minimal training sample size 2018-07-03 12:07:06 -07:00
Jennifer Liu
348e5f77a9 Add split=# to cli 2018-06-29 17:54:41 -07:00
Jennifer Liu
52fbbbcb6b Explicitly cast double to unsigned 2018-06-29 16:17:20 -07:00
Jennifer Liu
f9d19b83fb Fix variable declaration problem 2018-06-29 15:46:56 -07:00
Jennifer Liu
e061d84016 Another fix to comparator 2018-06-29 15:38:08 -07:00
Jennifer Liu
59797d3328 Fix splitPoint floating point comparison problem 2018-06-29 12:47:03 -07:00
Jennifer Liu
0ef06f2e8a Split samples into train and test sets 2018-06-29 12:33:34 -07:00
Yann Collet
121aa2c388
Merge pull request #1211 from facebook/staticAssert
updated DEBUG_STATIC_ASSERT()
2018-06-27 12:19:17 -07:00
Yann Collet
4489daec09 slightly adjusted default-distribution threshold
depending on strategy.
fast favors faster compression and decompression speeds.
2018-06-26 20:10:45 -07:00
Yann Collet
ff773bfcde zeroise freq table with memset()
improves decoding speed by ~5% in github_users sample set
2018-06-26 17:24:41 -07:00
Yann Collet
7b9bbf77c9 switched to a sizeof() version
avoid -Werror=unused-variable issue
2018-06-26 14:08:35 -07:00
Yann Collet
f98ec46979 updated DEBUG_STATIC_ASSERT()
following suggestion from #1209
2018-06-26 12:04:59 -07:00
Nick Terrell
b426bcc097
[zstdmt] Fix jobsize bugs (#1205)
[zstdmt] Fix jobsize bugs

* `ZSTDMT_serialState_reset()` should use `targetSectionSize`, not `jobSize` when sizing the seqstore.
  Add an assert that checks that we sized the seqstore using the right job size.
* `ZSTDMT_compressionJob()` should check if `rawSeqStore.seq == NULL`.
* `ZSTDMT_initCStream_internal()` should not adjust `mtctx->params.jobSize` (clamping to MIN/MAX is okay).
2018-06-25 15:21:08 -07:00
Yann Collet
3b53bfe4f3
Merge pull request #1200 from felixhandte/zstd-attach-dict-pref
Add CCtx Param Controlling Dict Attachment Behavior
2018-06-25 12:42:31 -07:00
Yann Collet
31769ce702 error on no forward progress
streaming decoders, such as ZSTD_decompressStream() or ZSTD_decompress_generic(),
may end up making no forward progress,
(aka no byte read from input __and__ no byte written to output),
due to unusual parameters conditions,
such as providing an output buffer already full.

In such case, the caller may be caught in an infinite loop,
calling the streaming decompression function again and again,
without making any progress.

This version detects such situation, and generates an error instead :
ZSTD_error_dstSize_tooSmall when output buffer is full,
ZSTD_error_srcSize_wrong when input buffer is empty.

The detection tolerates a number of attempts before triggering an error,
controlled by ZSTD_NO_FORWARD_PROGRESS_MAX macro constant,
which is set to 16 by default, and can be re-defined at compilation time.
This behavior tolerates potentially existing implementations
where such cases happen sporadically, like once or twice,
which is not dangerous (only infinite loops are),
without generating an error, hence without breaking these implementations.
2018-06-22 17:58:21 -07:00
Yann Collet
3934e010a2
Merge pull request #1197 from facebook/poolResize
Thread Pool resize
2018-06-22 14:20:07 -07:00
Yann Collet
fbd5dfc1b1 changed POOL_resize() return type to int
return is now just en error code.
This guarantee that `ctx` remains valid after POOL_resize().
Gets rid of internal POOL_free() operation.
2018-06-22 12:14:59 -07:00
Yann Collet
1d5648ca10
Merge pull request #1196 from felixhandte/zstd-btopt-in-place-dict
ZSTD_btopt: Support Searching the Dictionary Context In-Place
2018-06-22 11:53:23 -07:00
Yann Collet
f6242d30b7
Merge pull request #1202 from facebook/barelyCompressible
Increase threshold detection of poorly compressible data
2018-06-22 11:52:52 -07:00
Yann Collet
698fd00afb huf: increase threshold detection of poorly compressible data 2018-06-21 18:32:38 -07:00
Yann Collet
243cd9d8bb add a cond_broadcast after resize
to make sure all threads (notably newly available threads)
get awaken to immediately process potential items in the queue.
2018-06-21 18:04:58 -07:00
Yann Collet
818e72b4d5 added extended POOL test
abrupt end + downsizing with running jobs remaining in queue.

also : POOL_resize() requires numThreads >= 1
2018-06-21 14:58:59 -07:00
W. Felix Handte
01bb1c1016 Add CCtx Param Controlling Dict Attachment Behavior 2018-06-21 17:29:25 -04:00
W. Felix Handte
3e91dc4d6a Add Repcode Bounds Check 2018-06-21 15:54:41 -04:00
W. Felix Handte
5bd3d4b7d2 Add Debug Log Statement 2018-06-21 15:54:07 -04:00
W. Felix Handte
3caba150c6 Fix dmsBtLow Test 2018-06-21 15:53:40 -04:00
W. Felix Handte
5da9bbc38e Conceivably Dedup ZSTD_noDict and ZSTD_dictMatchState _insertBt1 Impls
By reverting to the bool extDict flag, we call ZSTD_insertBt1 with the same
const args in both non-extDict dictModes.
2018-06-21 11:20:01 -04:00
Yann Collet
6de249c1c6 fixed: bug when counting nb of active threads
when queueSize > 1

also : added a test in testpool.c
       verifying resizing is effective.
2018-06-20 18:28:49 -07:00
Yann Collet
6b48eb12c0 change control of threadLimit
now limits maximum nb of active threads
even when queueSize > 1.
2018-06-20 14:35:39 -07:00
W. Felix Handte
5d81f71e83 Consistency in Guarding DMS-Only Variable Initializations 2018-06-20 16:54:53 -04:00
W. Felix Handte
9c14eafe3d Also Use matchLow for HC3 Match 2018-06-20 15:51:14 -04:00
W. Felix Handte
0a6cf7cd1d Minor Changes 2018-06-20 15:27:23 -04:00
W. Felix Handte
ae1f3898a2 Remove Dead(!) HC3 DMS Lookup 2018-06-20 15:27:12 -04:00
Yann Collet
93702a7a62
Merge pull request #1198 from facebook/msdebug
made Visual Studio compatible with DEBUGLEVEL >= 2
2018-06-20 12:26:31 -07:00
cyan4973
ae0b7ffa0a made Visual Studio compatible with DEBUGLEVEL >= 2 2018-06-20 09:45:02 -07:00