AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Yann Collet	096714d1b8	Merge pull request #1671 from ephiepark/dev Adding targetCBlockSize param	2019-07-03 17:47:44 -07:00
Ephraim Park	f57ac7b09e	Factor out the logic to build sequences	2019-07-03 15:42:38 -07:00
Ephraim Park	9007701670	Adding targetCBlockSize param	2019-07-03 15:41:52 -07:00
Nick Terrell	6c92ba774e	ZSTD_compressSequences_internal assert op <= oend (#1667) When we wrote one byte beyond the end of the buffer for RLE blocks back in 1.3.7, we would then have `op > oend`. That is a problem when we use `oend - op` for the size of the destination buffer, and allows further writes beyond the end of the buffer for the rest of the function. Let's assert that it doesn't happen.	2019-07-02 15:45:47 -07:00
Yann Collet	857e608b51	Merge pull request #1658 from facebook/memset memset() rather than reduceIndex()	2019-07-01 15:01:43 -07:00
Yann Collet	621adde3b2	changed naming to ZSTD_indexTooCloseToMax() Also : minor speed optimization : shortcut to ZSTD_reset_matchState() rather than the full reset process. It still needs to be completed with ZSTD_continueCCtx() for proper initialization. Also : changed position of LDM hash tables in the context, so that the "regular" hash tables can be at a predictable position, hence allowing the shortcut to ZSTD_reset_matchState() without complex conditions.	2019-06-24 14:39:29 -07:00
Yann Collet	45c9fbd6d9	prefer memset() rather than reduceIndex() when close to index range limit by disabling continue mode when index is close to limit.	2019-06-21 16:19:21 -07:00
Yann Collet	944e2e9e12	benchfn : added macro CONTROL() like assert() but cannot be disabled. proper separation of user contract errors (CONTROL()) and invariant verification (assert()).	2019-06-21 15:58:55 -07:00
Nick Terrell	674534a700	[zstd] Fix data corruption in niche use case * Extract the overflow correction into a helper function. * Load the dictionary `ZSTD_CHUNKSIZE_MAX = 512 MB` bytes at a time and overflow correct between each chunk. Data corruption could happen when all these conditions are true: * You are using multithreading mode * Your overlap size is >= 512 MB (implies window size >= 512 MB) * You are using a strategy >= ZSTD_btlazy * You are compressing more than 4 GB The problem is that when loading a large dictionary we don't do overflow correction. We can only load 512 MB at a time, and may need to do overflow correction before each chunk.	2019-06-21 15:47:31 -07:00
Nick Terrell	4156060ca4	[zstdmt] Update assert to use ZSTD_WINDOWLOG_MAX	2019-06-21 15:39:33 -07:00
Nick Terrell	95e2b430ea	[opt] Add asserts for corruption in ZSTD_updateTree()	2019-06-21 15:22:29 -07:00
Yann Collet	9af909bf35	Merge pull request #1624 from facebook/smallwlog Improves compression ratio for small windowLog	2019-06-14 17:28:21 -07:00
Nick Terrell	cdb9481e38	[libzstd] Optimize ZSTD_insertBt1() for repetitive data We would only skip at most 192 bytes at a time before this diff. This was added to optimize long matches and skip the middle of the match. However, it doesn't handle the case of repetitive data. This patch keeps the optimization, but also handles repetitive data by taking the max of the two return values.
```
> for n in $(seq 9); do echo strategy=$n; dd status=none if=/dev/zero bs=1024k count=1000 | command time -f %U ./zstd --zstd=strategy=$n >/dev/null; done
strategy=1
0.27
strategy=2
0.23
strategy=3
0.27
strategy=4
0.43
strategy=5
0.56
strategy=6
0.43
strategy=7
0.34
strategy=8
0.34
strategy=9
0.35
```
At level 19 with multithreading the compressed size of `silesia.tar` regresses 300 bytes, and `enwik8` regresses 100 bytes. In single threaded mode `enwik8` is also within 100 bytes, and I didn't test `silesia.tar`. Fixes Issue #1634.	2019-06-05 20:34:00 -07:00
Yann Collet	80d6ccea79	removed UINT32_MAX apparently not guaranteed on all platforms, replaced by UINT_MAX.	2019-05-31 17:27:07 -07:00
Yann Collet	fce4df3ab7	fixed wrong assert in double_fast	2019-05-31 17:06:28 -07:00
Yann Collet	a968099038	minor code cleaning for new index invalidation strategy	2019-05-31 16:52:37 -07:00
Yann Collet	d605f482c7	make double_fast compatible with new index invalidation strategy	2019-05-31 16:50:04 -07:00
Yann Collet	a30febaeeb	Made fast strategy compatible with new offset validation strategy fast mode does the same thing as before : it pre-emptively invalidates any index that could lead to offset > maxDistance. It's supposed to help speed. But this logic is performed inside zstd_fast, so that other strategies can select a different behavior.	2019-05-31 16:34:55 -07:00
Yann Collet	58adb1059f	extended exact window size to greedy/lazy modes	2019-05-31 16:08:48 -07:00
Yann Collet	bc601bdc6d	first implementation of small window size for btopt noticeably improves compression ratio when window size is small (< 18). enwik7 level 19
| windowLog | `dev` | `smallwlog` | improvement |
|-----------|-------|-------------|-------------|
| 23 | 3.577 | 3.577 | 0.02% |
| 22 | 3.536 | 3.538 | 0.06% |
| 21 | 3.462 | 3.467 | 0.14% |
| 20 | 3.364 | 3.377 | 0.39% |
| 19 | 3.244 | 3.272 | 0.86% |
| 18 | 3.110 | 3.166 | 1.80% |
| 17 | 2.843 | 3.057 | 7.53% |
| 16 | 2.724 | 2.943 | 8.04% |
| 15 | 2.594 | 2.822 | 8.79% |
| 14 | 2.456 | 2.686 | 9.36% |
| 13 | 2.312 | 2.523 | 9.13% |
| 12 | 2.162 | 2.361 | 9.20% |
| 11 | 2.003 | 2.182 | 8.94% |	2019-05-31 15:55:12 -07:00
Yann Collet	b13a9207f9	Merge pull request #1623 from facebook/fullbench fullbench minor improvements	2019-05-31 14:40:19 -07:00
Yann Collet	ed38b645db	fullbench: pass proper parameters in scenario 43	2019-05-29 15:26:06 -07:00
Yann Collet	9719fd616c	removed nextToUpdate3 from ZSTD_window it's now a local variable of ZSTD_compressBlock_opt()	2019-05-28 16:18:12 -07:00
Yann Collet	33dabc8c80	get bt matches : made it a bit clearer which parameters are input and output	2019-05-28 16:11:32 -07:00
Yann Collet	327cf6fac1	nextToUpdate3 does not need to be maintained outside of zstd_opt.c It's re-synchronized with nextToUpdate at beginning of each block. It only needs to be tracked from within zstd_opt block parser. Made the logic clear, so that no code tried to maintain this variable. An even better solution would be to make nextToUpdate3 an internal variable of ZSTD_compressBlock_opt_generic(). That would make it possible to remove it from ZSTD_matchState_t, thus restricting its visibility to only where it's actually useful. This would require deeper changes though, since the matchState is the natural structure to transport parameters into and inside the parser.	2019-05-28 15:26:52 -07:00
Yann Collet	6453f8158f	complementary code comments on variables used / impacted during maxDist check	2019-05-28 14:12:16 -07:00
Yann Collet	4baecdf72a	added comments to better understand enforceMaxDist()	2019-05-28 13:15:48 -07:00
Nick Terrell	a17fe4c9e5	[visual] Fix unreachable code warning	2019-04-16 11:32:35 -07:00
Nick Terrell	de0499f7fa	[libzstd] Require ZSTD_MULTITHREAD to create a ZSTDMT_CCtx ZSTDMT was broken when compiled without ZSTD_MULTITHREAD defined, because `ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, nbWorkers)` failed. It was detected by the MSVC test which runs the fuzzer with multithreading disabled. This is a very niche use case of a deprecated API, because the API is inefficient and synchronous, since `threading.h` will be synchronous. Users almost certainly don't want this, and anyone who tested their code should realize that it is broken. Therefore, I think it is safe to require `ZSTD_MULTITHREAD` to be defined to use ZSTDMT.	2019-04-15 23:04:46 -07:00
Josh Soref	a880ca239b	Spelling (#1582 ) * spelling: accidentally * spelling: across * spelling: additionally * spelling: addresses * spelling: appropriate * spelling: assumed * spelling: available * spelling: builder * spelling: capacity * spelling: compiler * spelling: compressibility * spelling: compressor * spelling: compression * spelling: contract * spelling: convenience * spelling: decompress * spelling: description * spelling: deflate * spelling: deterministically * spelling: dictionary * spelling: display * spelling: eliminate * spelling: preemptively * spelling: exclude * spelling: failure * spelling: independence * spelling: independent * spelling: intentionally * spelling: matching * spelling: maximum * spelling: meaning * spelling: mishandled * spelling: memory * spelling: occasionally * spelling: occurrence * spelling: official * spelling: offsets * spelling: original * spelling: output * spelling: overflow * spelling: overridden * spelling: parameter * spelling: performance * spelling: probability * spelling: receives * spelling: redundant * spelling: recompression * spelling: resources * spelling: sanity * spelling: segment * spelling: series * spelling: specified * spelling: specify * spelling: subtracted * spelling: successful * spelling: return * spelling: translation * spelling: update * spelling: unrelated * spelling: useless * spelling: variables * spelling: variety * spelling: verbatim * spelling: verification * spelling: visited * spelling: warming * spelling: workers * spelling: with	2019-04-12 11:18:11 -07:00
Nick Terrell	48a6427d22	[libzstd] Fix ZSTD_compress2() for multithreaded compression `ZSTD_compress2()` wouldn't wait for multithreaded compression to finish. We didn't find this because ZSTDMT will block when it can compress all in one go, but it can't do that if it doesn't have enough output space, or if `ZSTD_c_rsyncable` is enabled. Since we will already sometimes block when using `ZSTD_e_end`, I've changed `ZSTD_e_end` and `ZSTD_e_flush` to guarantee maximum forward progress. This simplifies the API, and helps users avoid the easy bug that was made in `ZSTD_compress2()` * Found by the libfuzzer fuzzers. * Added a test case that catches the problem. * I will make the fuzzers sometimes allocate less than `ZSTD_compressBound()` output space.	2019-04-09 16:24:17 -07:00
Nick Terrell	641e594309	[libzstd] Remove ZSTDMT from the shared object * Remove ZSTDMT from the shared object by default. * Provide a macro `ZSTD_LEGACY_MULTITHREADED_API` to override it. * Document it in `lib/README.md`.	2019-04-07 18:47:52 -07:00
Nick Terrell	72a3fbc0e4	Merge pull request #1562 from terrelln/2fast [libzstd] Speed up single segment zstd_fast by 5%	2019-04-03 18:08:15 -07:00
Nick Terrell	95624b77e4	[libzstd] Speed up single segment zstd_fast by 5% This PR is based on top of PR #1563. The optimization is to process two input pointers per loop. It is based on ideas from [igzip] level 1, and talking to @gbtucker.
| Platform | Silesia | Enwik8 |
|-------------------------|-------------|--------|
| OSX clang-10 | +5.3% | +5.4% |
| i9 5 GHz gcc-8 | +6.6% | +6.6% |
| i9 5 GHz clang-7 | +8.0% | +8.0% |
| Skylake 2.4 GHz gcc-4.8 | +6.3% | +7.9% |
| Skylake 2.4 GHz clang-7 | +6.2% | +7.5% |
Testing on all Silesia files on my Intel i9-9900k with gcc-8
| Silesia File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| silesia.tar | +0.17% | +6.6% |
| dickens | +0.25% | +7.0% |
| mozilla | +0.02% | +6.8% |
| mr | -0.30% | +10.9% |
| nci | +1.28% | +4.5% |
| ooffice | -0.35% | +10.7% |
| osdb | +0.75% | +9.8% |
| reymont | +0.65% | +4.6% |
| samba | +0.70% | +5.9% |
| sao | -0.01% | +14.0% |
| webster | +0.30% | +5.5% |
| xml | +0.92% | +5.3% |
| x-ray | -0.00% | +1.4% |
Same tests on Calgary. For brevity, I've only included files where compression ratio regressed or was much better.
| Calgary File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| calgary.tar | +0.30% | +7.1% |
| geo | -0.14% | +25.0% |
| obj1 | -0.46% | +15.2% |
| obj2 | -0.18% | +6.0% |
| pic | +1.80% | +9.3% |
| trans | -0.35% | +5.5% |
We gain 0.1% of compression ratio on Silesia. We gain 0.3% of compression ratio on enwik8. I also tested on the GitHub and hg-commands datasets without a dictionary, and we gain a small amount of compression ratio on each, as well as speed.
I tested the negative compression levels on Silesia on my Intel i9-9900k with gcc-8:
| Level | Ratio Change | Speed Change |
|-------|--------------|--------------|
| -1 | +0.13% | +6.4% |
| -2 | +4.6% | -1.5% |
| -3 | +7.5% | -4.8% |
| -4 | +8.5% | -6.9% |
| -5 | +9.1% | -9.1% |
Roughly, the negative levels now scale half as quickly. E.g. the new level 16 is roughly equivalent to the old level 8, but a bit quicker and smaller. If you don't think this is the right trade off, we can change it to multiply the step size by 2, instead of adding 1. I think this makes sense, because it gives a bit slower ratio decay. [igzip]: https://github.com/01org/isa-l/tree/master/igzip	2019-04-02 19:02:50 -07:00
Nick Terrell	56682a7709	Fix ZSTD_estimateCStreamSize_usingCCtxParams() It wasn't using the ZSTD_CCtx_params correctly. It must actualize the compression parameters by calling ZSTD_getCParamsFromCCtxParams() to get the real window log. Tested by updating the streaming memory usage example in the next commit. The CHECK() failed before this patch, and passes after. I also added a unit test to zstreamtest.c that failed before this patch, and passes after.	2019-04-01 18:02:52 -07:00
Nick Terrell	f00407b640	Split out zstd_fast dict match state function	2019-03-29 10:39:16 -06:00
Nick Terrell	6b053b9f60	[lib] Allow ZSTD_CCtx_loadDictionary() to be called before parameters are set * After loading a dictionary only create the cdict once we've started the compression job. This allows the user to pass the dictionary before they set other settings, and is in line with the rest of the API. * Add tests that mix the 3 dictionary loading APIs. * Add extra tests for `ZSTD_CCtx_loadDictionary()`. * The first 2 tests added fail before this patch. * Run the regression test suite.	2019-03-21 16:13:53 -07:00
Nick Terrell	e55da9e963	Wrap the new advanced api completely	2019-03-21 10:54:40 -07:00
Nick Terrell	787b76904a	[libzstd] Allow compression parameters to be set with a cdict The order you set parameters in the advanced API is not supposed to matter. However, once you call `ZSTD_CCtx_refCDict()` the compression parameters cannot be changed. Remove that restriction, and document what parameters are used when using a CDict. If the CCtx is in dictionary mode, then the CDict's parameters are used. If the CCtx is not in dictionary mode, then its requested parameters are used.	2019-03-13 16:10:05 -07:00
Nick Terrell	0594e8135b	[libzstd] Free local cdict when referencing cdict We no longer care about the `cdictLocal` after calling `ZSTD_CCtx_refCDict()`, so we should free it to save some memory.	2019-03-13 14:54:31 -07:00
Nick Terrell	7ad7ba3178	[libzstd] Rename ZSTD_CCtxParam_* to ZSTD_CCtxParams_*	2019-02-19 17:44:52 -08:00
Nick Terrell	f4abba02ba	[libzstd] Clean up parameter code * Move all ZSTDMT parameter setting code to ZSTD_CCtxParams_Parameter(). ZSTDMT now calls these functions, so we can keep all the logic in the same place. Clean up `ZSTD_CCtx_setParameter()` to only add extra checks where needed. * Clean up `ZSTDMT_initJobCCtxParams()` by copying all parameters by default, and then zeroing the ones that need to be zeroed. We've missed adding several parameters here, and it makes more sense to only have to update it if you change something in ZSTDMT. * Add `ZSTDMT_cParam_clampBounds()` to clamp a parameter into its valid range. Use this to keep backwards compatibility when setting ZSTDMT parameters, which clamp into the valid range.	2019-02-19 13:22:37 -08:00
Nick Terrell	3d7377b874	[libzstd] Handle uncompressed literals	2019-02-15 14:58:11 -08:00
Nick Terrell	f9513115e4	[libzstd] Add ZSTD_c_literalCompressionMode flag It controls the literals compression. It is either `auto`, `huffman`, or `uncompressed`. It defaults to `auto`, which is the current behavior.	2019-02-13 14:59:22 -08:00
W. Felix Handte	501eb25102	Rename FORWARD_ERROR -> FORWARD_IF_ERROR	2019-01-29 12:56:07 -05:00
W. Felix Handte	03e040a966	Replace Uses of CHECK_E with RETURN_ERROR_IF(*_isError(...	2019-01-28 17:33:01 -05:00
W. Felix Handte	64bb6640f2	Replace CHECK_F Uses in zstdmt_compress.c and zstd_ddict.c	2019-01-28 17:15:57 -05:00
W. Felix Handte	cafc3b1bcb	Also Convert zstd_compress.c	2019-01-28 17:05:18 -05:00
Yann Collet	f9e4f89252	improved comments for adjustCParams() and getCParams()	2019-01-02 12:18:40 -08:00
Yann Collet	e980ba212f	Merge pull request #1471 from facebook/nofloat guard functions using floating point for debug mode only	2018-12-23 12:35:51 -08:00
Yann Collet	aae5bc538a	Merge pull request #1470 from facebook/U32 fix confusion between unsigned <-> U32	2018-12-23 12:35:39 -08:00
Yann Collet	c9dfb7e445	guard functions using floating point for debug mode only they are only used to print debug messages. Requested in #1386,	2018-12-22 09:09:40 -08:00
Yann Collet	ededcfca57	fix confusion between unsigned <-> U32 as suggested in #1441. generally U32 and unsigned are the same thing, except when they are not ... case : 32-bit compilation for MIPS (uint32_t == unsigned long) A vast majority of transformation consists in transforming U32 into unsigned. In rare cases, it's the other way around (typically for internal code, such as seeds). Among a few issues this patch solves : - some parameters were declared with type `unsigned` in .h, but with type `U32` in their implementation .c . - some parameters have type unsigned*, but the caller uses a pointer to U32 instead. These fixes are useful. However, the bulk of changes is about %u formatting, which requires unsigned type, but generally receives U32 values instead, often just for brevity (U32 is shorter than unsigned). These changes are generally minor, or even annoying. As a consequence, the amount of code changed is larger than I would expect for such a patch. Testing is also a pain : it requires manually modifying `mem.h`, in order to lie about `U32` and force it to be an `unsigned long` typically. On a 64-bit system, this will break the equivalence unsigned == U32. Unfortunately, it will also break a few static_assert(), controlling structure sizes. So it also requires modifying `debug.h` to make `static_assert()` a noop. And then reverting these changes. So it's inconvenient, and as a consequence, this property is currently not checked during CI tests. Therefore, these problems can emerge again in the future. I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests. It's another restriction for coding, adding more frustration during merge tests, since most platforms don't need this distinction (hence contributor will not see it), and while this can matter in theory, the number of platforms impacted seems minimal. Thoughts ?	2018-12-21 18:09:41 -08:00
Yann Collet	c8d1fda982	update aarch64 test to xenial in an attempt to circumvent the `ld` bug	2018-12-21 15:08:48 -08:00
Yann Collet	8f35c7f94c	Merge pull request #1466 from facebook/noDictPresent fixed : better error message	2018-12-20 19:01:27 -08:00
Yann Collet	41b45b84a1	Merge pull request #1465 from facebook/noFilePresent fixed : detection of non-existing file	2018-12-20 17:21:04 -08:00
Yann Collet	ed2fb6bd57	fixed : better error message when dictionary missing during benchmark. Also : refactored ZSTD_fillHashTable(), just for readability (it does the same thing)	2018-12-20 17:20:07 -08:00
Yann Collet	95784c654c	fixed shadowing of stat variable some standard lib declares a `stat` variable at global scope shadowing local declarations ....	2018-12-20 14:56:44 -08:00
Yann Collet	2898afab52	fixed OSSfuzz 11849 The problem was already masked, due to no longer accepting tiny blocks for statistics. But in case it could still happen with not-so-tiny blocks, there is a stricter control which ensures that nothing was already loaded prior to statistics collection.	2018-12-19 16:54:15 -08:00
Yann Collet	8e0e495ce8	fixed: compression ratio discrepancy depending on initialization, the first byte of a new frame was invalidated or not. As a consequence, one match opportunity was available or not, resulting in slightly different compressed sizes (on average, 1 or 2 bytes once every 20 frames). It impacted ratio comparison between one-shot and streaming modes. This fix makes the first byte of a new frame always a valid match. Now compressed size is always the same. It also improves compressed size by a negligible amount.	2018-12-19 10:11:06 -08:00
Yann Collet	d0e15f8d32	Merge pull request #1458 from terrelln/estimate [libzstd] Fix estimate with negative levels	2018-12-18 15:12:21 -08:00
Yann Collet	04baecaeed	Merge pull request #1457 from facebook/btultra2.1 btultra2 and very small input	2018-12-18 14:46:55 -08:00
Nick Terrell	d7def456d8	[libzstd] Fix estimate with negative levels * Fix `ZSTD_estimateCCtxSize()` with negative levels. * Fix `ZSTD_estimateCStreamSize()` with negative levels. * Add a unit test to test for this error.	2018-12-18 14:24:49 -08:00
Yann Collet	ef984e7307	fix debug levels as reported by @terrelln. 2 is reserved for temporary usage only.	2018-12-18 13:40:07 -08:00
Yann Collet	635783da12	btultra2 and very small srcSize When srcSize is small, the nb of symbols produced is likely too small to warrant dedicated probability tables. In which case, predefined distribution tables will be used instead. There is a cheap algorithm in btultra initialization : it presumes default distribution will be used if srcSize <= 1024. btultra2 now uses the same threshold to shut down probability estimation, since measured frequencies won't be used at entropy stage, and therefore relying on them to determine sequence cost is misleading, resulting in worse compression ratios. This fixes btultra2 performance issue on very small input. Note that, a proper way should be to determine which symbol is going to use predefined probability and which symbol is going to use dynamic ones. But the current algorithm is unable to make a "per-symbol" decision. So this will require significant modifications.	2018-12-18 12:32:58 -08:00
Yann Collet	373ff8b983	play around with rescale weights	2018-12-17 15:48:34 -08:00
Yann Collet	8be145a8c1	fixed default job size	2018-12-13 16:38:08 -08:00
Yann Collet	62180b27d5	zstdmt parameter getter/setter use `int`	2018-12-13 15:47:34 -08:00
Yann Collet	34f01e600f	fixed multiple conversions from 64-bit to 32-bit	2018-12-13 14:02:22 -08:00
Yann Collet	f2f86d369b	Merge branch 'btultra2' into ovlog_def	2018-12-12 20:58:14 -08:00
Yann Collet	9a92ed401d	updated compression results.csv and fixed nit	2018-12-12 20:30:09 -08:00
Yann Collet	9792acda3b	Merge branch 'dev' into btultra2	2018-12-12 20:18:27 -08:00
Yann Collet	7bb8dfc62f	new overlapLog default values varies between 6 and 9, depending on strategy	2018-12-11 18:10:29 -08:00
Yann Collet	eee789b7ea	continued: changed to overlapLog in deeper code layer. for consistency.	2018-12-11 17:41:42 -08:00
Yann Collet	9b784dec7f	changed parameter name to ZSTD_c_overlapLog from overlapSizeLog. Reasoning : `overlapLog` is already used everywhere, in the code, command line and documentation. `ZSTD_c_overlapSizeLog` feels unnecessarily different.	2018-12-11 16:55:33 -08:00
Yann Collet	5e6aaa3abb	fixed btultra2 usage with prefix notably while using multi-threading	2018-12-10 18:45:03 -08:00
Yann Collet	3619c34399	fix assert position within ZSTD_compress2()	2018-12-10 17:42:35 -08:00
Yann Collet	c226a7b9f3	fixed ZSTD_compress2() as suggested by @terrelln	2018-12-10 17:33:49 -08:00
Yann Collet	37e314a68d	updated clevel table for large inputs	2018-12-09 22:38:05 -08:00
Yann Collet	c9c4c7ec8c	update clevel table for 256K	2018-12-08 21:40:08 -08:00
Yann Collet	8075d75f9c	update clevel table for 128K	2018-12-08 10:42:55 -08:00
Yann Collet	95b152ab33	updated clevel table for 16K to introduce btultra2	2018-12-07 20:12:43 -08:00
Yann Collet	d613fd9afe	linked btultra2 as strategy9 and ensure zstdbench detects out-of-bound parameters	2018-12-06 19:27:37 -08:00
Yann Collet	ae370b0e12	minor bound refinements	2018-12-06 16:51:17 -08:00
Yann Collet	39e28982cf	introduced constants ZSTD_STRATEGY_MIN and ZSTD_STRATEGY_MAX	2018-12-06 16:16:16 -08:00
Yann Collet	c3c3488981	fixed c++ assignment to enum	2018-12-06 15:57:55 -08:00
Yann Collet	be9e561da4	changed ZSTD_c_compressionStrategy into ZSTD_c_strategy also : fixed paramgrill, and limit conditions	2018-12-06 15:00:52 -08:00
Yann Collet	e9448cdf4c	introduced strategy btultra2 note : not yet applied on any compression level	2018-12-06 13:38:09 -08:00
Yann Collet	3583d19c4e	changed parameter names from ZSTD_p_* to ZSTD_c_* for naming consistency	2018-12-05 17:26:02 -08:00
Yann Collet	aec945f0dc	implemented ZSTD_dParam_getBounds() and ZSTD_DCtx_setParameter()	2018-12-04 15:35:37 -08:00
Yann Collet	6ced8f7c7c	joined normal streaming API with advanced one	2018-12-03 14:22:38 -08:00
Yann Collet	d8e215cbee	created ZSTD_compress2() and ZSTD_compressStream2() ZSTD_compress_generic() is renamed ZSTD_compressStream2(). Note that, for the time being, the "stable" API and advanced one use different parameter planes : setting parameters using the advanced API does not influence ZSTD_compressStream() and using ZSTD_initCStream() does not influence parameters for ZSTD_compressStream2().	2018-11-30 11:25:56 -08:00
Yann Collet	d4d4e109e9	getParameter fills an int* rather than an unsigned* for consistency since type of setParameter() changed to int.	2018-11-21 15:37:26 -08:00
Yann Collet	41c7d0b1e1	changed hashEveryLog into hashRateLog	2018-11-21 14:36:57 -08:00
Yann Collet	5d3592398d	fixed fall-through	2018-11-20 16:09:33 -08:00
Yann Collet	5c6d4b18ac	completed implementation of ZSTD_cParam_getBounds() for all parameters	2018-11-20 16:06:00 -08:00
Yann Collet	2e7fd6a2cb	fixed remaining searchLength invocations	2018-11-20 15:13:27 -08:00
Yann Collet	e874dacc08	changed searchLength into minMatch refactored all relevant API and calls for consistency.	2018-11-20 14:56:07 -08:00
Yann Collet	114bd4346e	changed enum type name to ZSTD_ResetDirective for naming consistency : types should start with a capital letter (after prefix)	2018-11-20 12:00:20 -08:00
Yann Collet	3b838abf97	ZSTD_CCtx_setParameter : `value` argument is now `int` for compatibility with compression level	2018-11-20 11:53:01 -08:00
Yann Collet	5c68639186	updated ZSTD_DCtx_reset() signature and behavior is now the same as ZSTD_CCtx_reset()	2018-11-15 16:12:39 -08:00
Yann Collet	06c8d5a4f4	Merge branch 'dev' into advancedAPI fixed rsyncable	2018-11-15 10:51:24 -08:00
Nick Terrell	b9693d3a49	[lib] Add rsyncable mode - Add rsyncable mode to multithreaded mode - Factor out LDM's hash function for reuse	2018-11-14 16:59:57 -08:00
Yann Collet	7b0391e37e	finalized retrofit of ZSTD_CCtx_reset() updated all depending sources	2018-11-14 13:05:35 -08:00
Yann Collet	ff8d371708	modified ZSTD_CCtx_reset() which now accepts an enum, to distinguish between resetting the session, or the parameters (or both). removed ZSTD_CCtx_resetParameters(), which is redundant. start replacing invocation of ZSTD_CCtx_reset*() functions Updated advanced API documentation trimmed down amount of API staged in RC, in particular, all functions related to ZSTD_CCtxParams() seem too advanced.	2018-11-14 12:33:57 -08:00
Yann Collet	d7e10a774a	added constant ZSTD_WINDOWLOG_LIMIT_DEFAULT answering #1407. Also : removed obsolete function ZSTD_setDStreamParameter() which could only be used with one parameter (DStream_p_maxWindowSize). Now replaced by ZSTD_DCtx_setWindowSize() (which exists since a few revisions)	2018-11-13 18:12:34 -08:00
Yann Collet	b83d1e7714	removed some `static const` variables and replaced by traditional macro constants. Unfortunately, C doesn't consider `static const` to mean "constant"	2018-11-13 16:56:32 -08:00
Yann Collet	f28af025d9	Merge pull request #1413 from felixhandte/attach-dict-fix-unsigned-compare Fix #1412: Perform Signed Comparison When Setting Attach Dict Param	2018-11-12 17:53:11 -08:00
Yann Collet	626040ab53	changed PREFETCH() macro into PREFETCH_L2() which is more accurate	2018-11-12 17:05:32 -08:00
W. Felix Handte	5faef4d378	Const	2018-11-12 14:48:42 -08:00
W. Felix Handte	2d9332eb21	Fix Types	2018-11-12 12:52:31 -08:00
W. Felix Handte	4127de5fa6	Switch Enum to Only Non-Negative Values, Update Comments	2018-11-12 12:47:47 -08:00
W. Felix Handte	596f7d1256	Fix #1412 : Perform Signed Comparison When Setting Attach Dict Param	2018-11-12 12:07:57 -08:00
Yann Collet	e0701d3c5d	Merge pull request #1404 from facebook/T36302429 fixed T36302471	2018-11-06 11:53:20 -08:00
Yann Collet	3e5cdf1b6a	fixed T36302429	2018-11-05 17:50:30 -08:00
Yann Collet	2caa995558	just add an assert() in ZSTD_insertBtAndGetAllMatches() to express a condition on ll0 . May help static analyzer as in #1397	2018-11-05 17:13:32 -08:00
Yann Collet	3a90229616	Merge pull request #1395 from facebook/decompressblock created zstd_decompress_block module	2018-10-29 16:28:09 -07:00
Yann Collet	8d56f4baee	added a few comments for clarifications	2018-10-26 15:21:52 -07:00
Yann Collet	7b74405150	refactor HUF_compress_internal for clarity changed workspace parameter convention to always provide workspaceSize, so that size can be explicitly checked. Also, use more enum to make the meaning of some parameters more explicit.	2018-10-26 13:21:37 -07:00
W. Felix Handte	b8235be865	Avoid Searching Dictionary in ZSTD_btlazy2 When an Optimal Match is Found Bailing here is important to avoid reading past the end of the input buffer.	2018-10-08 15:59:32 -07:00
W. Felix Handte	d121b3451c	Clean Up Debug Log Statements	2018-10-08 15:59:32 -07:00
W. Felix Handte	08da9ad316	Remove Unused Variable	2018-10-08 15:59:32 -07:00
Yann Collet	22ddf3523a	fixed msan warning on btlazy2 strategy with dictAttach	2018-10-02 18:20:20 -07:00
Yann Collet	228c6e5147	Merge pull request #1317 from felixhandte/split-logs Independent Dictionary and Working Context Table Logs	2018-10-01 17:20:12 -07:00
W. Felix Handte	5b296869df	Revert Ability to Set HashLog and ChainLog on Context When Dict is Attached This capability is not needed / used in the current unit of work. I'll re-introduce it later, when we start allowing users to override the deduced working context logs.	2018-10-01 13:28:13 -07:00
W. Felix Handte	c2369fedc4	Restore Passing CParams to `ZSTD_insertAndFindFirstIndex_internal`	2018-09-28 17:12:54 -07:00
W. Felix Handte	bad74c4781	Use Working Ctx Logs when not in DMS Mode We pre-hash the ptr for the dict match state sometimes. When that actually happens, a hashlog of 0 can produce undefined behavior (right shift a long long by 64). Only applies to unoptimized compilations, since when optimizations are applied, those hash operations are dropped when we're not actually in dms mode.	2018-09-28 17:12:54 -07:00
W. Felix Handte	c38acff94f	When Attaching Dictionary, Size Working Tables Based on Input Size Only	2018-09-28 17:12:54 -07:00
W. Felix Handte	9d87d50878	Remove Log Overriding for the Time Being	2018-09-28 17:12:54 -07:00
W. Felix Handte	77fd17d93f	Remove Strategy-Dependency in Making Attachment Decision	2018-09-28 17:12:54 -07:00
W. Felix Handte	00c088b32d	Support Split Logs in ZSTD_btopt..ZSTD_btultra	2018-09-28 17:12:54 -07:00
W. Felix Handte	0783492178	Bump Split Log Support to ZSTD_btultra	2018-09-28 17:12:54 -07:00
W. Felix Handte	e4ac4a0f16	Support Split Logs in ZSTD_greedy..ZSTD_btlazy2	2018-09-28 17:12:54 -07:00
W. Felix Handte	e710dc3369	Bump Split Log Support to ZSTD_btlazy2	2018-09-28 17:12:54 -07:00
W. Felix Handte	22fcb8d4c7	Support Split Logs in ZSTD_dfast	2018-09-28 17:12:54 -07:00
W. Felix Handte	a232b3bb7c	Bump Split Log Support to ZSTD_dfast	2018-09-28 17:12:54 -07:00
W. Felix Handte	fe96e98f81	Support a Separate Hash Log in ZSTD_fast	2018-09-28 17:12:54 -07:00
W. Felix Handte	bc880ebe8f	Stop Passing in `hashLog` and `stepSize` to `ZSTD_compressBlock_fast_generic`	2018-09-28 17:12:54 -07:00
W. Felix Handte	b3107c7799	Temporary Commit to Retain Requested Hash and Chain Logs During Dict Attach	2018-09-28 17:12:54 -07:00
W. Felix Handte	34e0193129	Allow Setting Hash and Chain Logs on Contexts with Attached CDict	2018-09-28 17:12:54 -07:00
W. Felix Handte	eae8232f50	For Supported Strategies, Attach Dict Even When Params Don't Match	2018-09-28 17:12:54 -07:00
W. Felix Handte	01ff945eae	Split Attach and Copy Reset Strategies into Separate Implementation Functions	2018-09-28 17:12:54 -07:00
W. Felix Handte	a6d6bbeae1	Pull Attachment Decision into Separate Function	2018-09-28 17:12:54 -07:00
W. Felix Handte	b7fba599ae	And Then Avoid the Unused Parameter Warning	2018-09-28 17:12:54 -07:00
W. Felix Handte	1f188ae655	Move Asserts into Function to Avoid Unused Function Warning	2018-09-28 17:12:54 -07:00
W. Felix Handte	7212b5e5c2	Move Match State CParams Setting into `resetCCtx` and `continueCCtx`	2018-09-28 17:12:54 -07:00
W. Felix Handte	01e34d365b	Strengthen Assertion to Assert Equality	2018-09-28 17:12:53 -07:00
W. Felix Handte	50cc1cf4d5	Remove CParams Arg from ZSTD_ldm_blockCompress	2018-09-28 17:12:53 -07:00
W. Felix Handte	14764de49f	Stop Separately Passing CParams in ZSTD_lazy Internal Functions	2018-09-28 17:12:53 -07:00
W. Felix Handte	97149f22c3	Stop Separately Passing CParams in ZSTD_opt Internal Functions	2018-09-28 17:10:42 -07:00
W. Felix Handte	dcdf437fed	Also Remove CParams from Table Filling Functions' Args	2018-09-28 17:10:42 -07:00
W. Felix Handte	3483f89101	Also Assert Equivalency When Filling MatchState with Prefix	2018-09-28 17:10:42 -07:00
W. Felix Handte	6cb2454646	Remove CParams from Block Compressor Functions' Args	2018-09-28 17:10:42 -07:00
W. Felix Handte	03103269de	Assert `ctx` and `ms` cparams Equivalency	2018-09-28 17:10:42 -07:00
W. Felix Handte	4e3ecee9ed	Remove cParams from CDict	2018-09-28 17:10:42 -07:00
W. Felix Handte	76ef87ed9d	Add ZSTD_compressionParameters to ZSTD_matchState_t	2018-09-28 17:10:42 -07:00
Nick Terrell	6391cd1030	[zstd] Fix newly added test case	2018-09-28 12:09:28 -07:00
Nick Terrell	a180ea07c4	Restore ZSTD_noCompressBlock() for clarity	2018-09-27 16:06:02 -07:00
Nick Terrell	f2d6db45cd	[zstd] Add -Wmissing-prototypes	2018-09-27 15:24:48 -07:00
Yann Collet	2a5cd8535a	Merge pull request #1342 from facebook/fixcatyd fix : huge (>4GB) chain of blocks	2018-09-27 10:20:14 -07:00
Yann Collet	404a7bfed0	moved again overflow correction cannot work from within ZSTD_compressBlock()	2018-09-26 18:06:53 -07:00
Yann Collet	0e2dbac18a	changed overflow correction place keep one in compress_frameChunk(), so that it's tested at every loop in case some user simply some large multi-GB input in a single invocation. Add one in ZSTD_compressBlock(), since compressBlock() explicitly skips frameChunk().	2018-09-26 15:35:38 -07:00
Yann Collet	f98c69d77c	fix : huge (>4GB) stream of blocks experimental function ZSTD_compressBlock() is designed with very small data in mind, for situations where saving the ~12 bytes of frame header can actually make a difference. Some systems, though, may have to deal with small and large data entangled. If the data is larger than a block (> 128KB), compressBlock() cannot compress it in one round. That's why it's possible to compress in multiple rounds. This is a chain of compressed blocks. Some users push this capability to the limit, encoding gigantic chains of blocks. On crossing the 4GB limit, some internal overflow occurs. This fix moves the overflow correction mechanism higher in the call chain, so that it's applied also to gigantic chains of blocks. Added a test case in fuzzer.c, which crashed before the fix and passes now.	2018-09-26 14:24:28 -07:00
Yann Collet	04f47bbdd2	Merge branch 'dev' into adapt	2018-09-24 16:56:45 -07:00
Yann Collet	c484345a82	Merge branch 'mingw' into adapt	2018-09-21 16:00:46 -07:00
Yann Collet	bfff4f4809	ensure all writes to job->cSize are mutex protected even when reporting errors, using a macro for code brevity, as suggested by @terrelln,	2018-09-21 16:00:39 -07:00
Yann Collet	32b7cf1bcf	fixed tautological tests involving ZSTD_TARGETLENGTH_MIN (== 0)	2018-09-21 15:04:43 -07:00
Yann Collet	c044345f8f	Merge branch 'mingw' into minclevel	2018-09-21 14:56:57 -07:00
Yann Collet	de6c75e4e5	Merge pull request #1318 from felixhandte/shadow-dict-matches Don't Search Dictionary Context When Working Context Search Resulted in Mismatch	2018-09-21 12:15:33 -07:00
Yann Collet	a54c86cfc6	defined a minimum negative level which can be probed using new function ZSTD_minCLevel(). Also : redefined ZSTD_TARGETLENGTH_MIN/MAX for consistency used the opportunity to bump version number to v1.3.6	2018-09-20 16:52:03 -07:00
Yann Collet	7992942d66	fixed complex tsan issue when job->consumed == job->src.size , compression job is presumed completed, so it must be the very last action done in worker thread.	2018-09-20 13:47:31 -07:00
Yann Collet	6b07a66aec	fixed minor reporting discrepancy in MT mode	2018-09-19 16:30:55 -07:00
Yann Collet	ca02ebee07	removed static variables so that --adapt can work on multiple input files too	2018-09-19 15:25:50 -07:00
Yann Collet	2f78228f65	Merge branch 'dev' into adapt	2018-09-19 12:43:42 -07:00
ko-zu	18b4a1da61	Fix clang build Fix doxygen comment Fix clang binary path	2018-09-16 10:27:02 +09:00
W. Felix Handte	b76c888497	ZSTD_dfast: Don't Search Dict Context When Mismatch Was Found	2018-09-14 15:24:25 -07:00
W. Felix Handte	b048af5999	ZSTD_fast: Don't Search Dict Context When Mismatch Was Found	2018-09-14 15:23:35 -07:00
Yann Collet	31ebb26945	Merge pull request #1301 from terrelln/lit-size [zstd] Fix seqStore growth	2018-08-28 17:10:25 -07:00
Nick Terrell	5e580de6da	[zstd] Fix seqStore growth We could undersize the literals buffer by up to 11 bytes, due to a combination of 2 bugs: * The literals buffer didn't have `WILDCOPY_OVERLENGTH` extra space, like it is supposed to. * We didn't check the literals buffer size in `ZSTD_sufficientBuff()`.	2018-08-28 13:24:44 -07:00
Yann Collet	b37a0a6bde	Merge pull request #1298 from facebook/bench Refactored bench.c	2018-08-28 12:25:02 -07:00
Yann Collet	af23d39eb8	Merge pull request #1297 from felixhandte/check-offset-table Fix Missing Offset Table Check	2018-08-24 17:36:44 -07:00
W. Felix Handte	37f17ee237	Mark Repeated Offset Table as Needing Check	2018-08-24 14:33:34 -07:00
Nick Terrell	e34e917655	Fix compiler warning	2018-08-23 17:48:06 -07:00
Nick Terrell	924944e471	[zstd] Reuse the ZSTD_CCtx more often with small data.	2018-08-23 17:48:06 -07:00
Yann Collet	2e45badff4	refactored bench.c for clarity and safety, especially at interface level	2018-08-23 14:21:18 -07:00
Yann Collet	c71c4f23d7	fix "unused parameter" in single-thread mode within newly added ZSTD_toFlushNow()	2018-08-20 11:40:10 -07:00
Yann Collet	105677c6db	created ZSTDMT_toFlushNow() tells in a non-blocking way if there is something ready to flush right now. only works with multi-threading for the time being. Useful to know if flush speed will be limited by lack of production.	2018-08-17 18:11:54 -07:00
Yann Collet	1515f0bb0d	fixed more issues detected by recent version of scan-build test run on Linux	2018-08-16 15:20:25 -07:00
Yann Collet	3692c31598	Merge branch 'dev' into scanbuild	2018-08-15 13:50:49 -07:00
Yann Collet	6e66bbf5dd	fixed several minor issues detected by scan-build only notable one : writeNCount() resists better vs invalid distributions (though it should never happen within zstd anyway)	2018-08-14 16:55:35 -07:00
Yann Collet	3e4617ef54	frameProgression reports nbActiveWorkers and output flushed	2018-08-14 11:49:25 -07:00
Yann Collet	e7a49c6683	introduced command --adapt	2018-08-11 20:48:06 -07:00
Yann Collet	2dd76037be	zstd cli can increase level when input is too slow	2018-08-09 15:51:30 -07:00
Yann Collet	79a35ac20d	minor code comments improvements	2018-08-09 15:16:31 -07:00
W. Felix Handte	2ca7c69167	Fix CDict Attachment to Handle CDicts with Non-Zero Starts CDicts were previously guaranteed to be generated with `lowLimit=dictLimit=0`. This is no longer true, and so the old length and index calculations are no longer valid. This diff fixes them to handle non-zero start indices in CDicts.	2018-08-07 18:14:14 -07:00
Yann Collet	5808027abf	Merge branch 'dev' into fix1241	2018-08-03 16:08:33 -07:00
Nick Terrell	dc5a67cb7b	Disallow tableLog == srcLog	2018-08-02 11:12:17 -07:00
cyan4973	aade1e5904	Merge branch 'dev' into fix1241	2018-07-30 16:30:35 +02:00
Nick Terrell	9889bca530	[FSE] Fix division by zero When the primary normalization method fails, and `(1 << tableLog) == (maxSymbolValue + 1)`, and every symbol gets assigned normalized weight 1 or -1 in the first loop, then the next division can raise `SIGFPE`.	2018-07-27 17:30:03 -07:00
Yann Collet	6e490a2f09	Merge pull request #1237 from terrelln/init-cstream-adv Set requestedParams in ZSTD_initCStream*()	2018-07-18 16:33:30 +02:00

1370 Commits