AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Yann Collet	c9dfb7e445	guard functions using floating point for debug mode only they are only used to print debug messages. Requested in #1386,	2018-12-22 09:09:40 -08:00
Yann Collet	ededcfca57	fix confusion between unsigned <-> U32 as suggested in #1441. generally U32 and unsigned are the same thing, except when they are not ... case : 32-bit compilation for MIPS (uint32_t == unsigned long) A vast majority of transformation consists in transforming U32 into unsigned. In rare cases, it's the other way around (typically for internal code, such as seeds). Among a few issues this patch solves : - some parameters were declared with type `unsigned` in .h, but with type `U32` in their implementation .c . - some parameters have type unsigned*, but the caller uses a pointer to U32 instead. These fixes are useful. However, the bulk of changes is about %u formatting, which requires unsigned type, but generally receives U32 values instead, often just for brevity (U32 is shorter than unsigned). These changes are generally minor, or even annoying. As a consequence, the amount of code changed is larger than I would expect for such a patch. Testing is also a pain : it requires manually modifying `mem.h`, in order to lie about `U32` and force it to be an `unsigned long` typically. On a 64-bit system, this will break the equivalence unsigned == U32. Unfortunately, it will also break a few static_assert(), controlling structure sizes. So it also requires modifying `debug.h` to make `static_assert()` a noop. And then reverting these changes. So it's inconvenient, and as a consequence, this property is currently not checked during CI tests. Therefore, these problems can emerge again in the future. I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests. It's another restriction for coding, adding more frustration during merge tests, since most platforms don't need this distinction (hence contributors will not see it), and while this can matter in theory, the number of platforms impacted seems minimal. Thoughts ?	2018-12-21 18:09:41 -08:00
Yann Collet	95784c654c	fixed shadowing of stat variable some standard lib declares a `stat` variable at global scope shadowing local declarations ....	2018-12-20 14:56:44 -08:00
Yann Collet	8e0e495ce8	fixed: compression ratio discrepancy depending on initialization, the first byte of a new frame was invalidated or not. As a consequence, one match opportunity was available or not, resulting in slightly different compressed sizes (on average, 1 or 2 bytes once every 20 frames). It impacted ratio comparison between one-shot and streaming modes. This fix makes the first byte of a new frame always a valid match. Now compressed size is always the same. It also improves compressed size by a negligible amount.	2018-12-19 10:11:06 -08:00
Yann Collet	eee789b7ea	continued: changed to overlapLog in deeper code layer. for consistency.	2018-12-11 17:41:42 -08:00
Yann Collet	41c7d0b1e1	changed hashEveryLog into hashRateLog	2018-11-21 14:36:57 -08:00
Nick Terrell	b9693d3a49	[lib] Add rsyncable mode - Add rsyncable mode to multithreaded mode - Factor out LDM's hash function for reuse	2018-11-14 16:59:57 -08:00
W. Felix Handte	4127de5fa6	Switch Enum to Only Non-Negative Values, Update Comments	2018-11-12 12:47:47 -08:00
Yann Collet	8d56f4baee	added a few comments for clarifications	2018-10-26 15:21:52 -07:00
W. Felix Handte	6cb2454646	Remove CParams from Block Compressor Functions' Args	2018-09-28 17:10:42 -07:00
W. Felix Handte	76ef87ed9d	Add ZSTD_compressionParameters to ZSTD_matchState_t	2018-09-28 17:10:42 -07:00
Nick Terrell	5e580de6da	[zstd] Fix seqStore growth We could undersize the literals buffer by up to 11 bytes, due to a combination of 2 bugs: * The literals buffer didn't have `WILDCOPY_OVERLENGTH` extra space, like it is supposed to. * We didn't check the literals buffer size in `ZSTD_sufficientBuff()`.	2018-08-28 13:24:44 -07:00
Nick Terrell	924944e471	[zstd] Reuse the ZSTD_CCtx more often with small data.	2018-08-23 17:48:06 -07:00
W. Felix Handte	01bb1c1016	Add CCtx Param Controlling Dict Attachment Behavior	2018-06-21 17:29:25 -04:00
Yann Collet	fa41bcc2c2	grouped debug functions into debug.h There were 2 competing sets of debug functions within zstd_internal.h and bitstream.h. They were mostly duplicate, and required care to avoid messing with each other. There is now a single implementation, shared by both. Significant change : The macro variable ZSTD_DEBUG no longer exists, it has been replaced by DEBUGLEVEL, which required modifying several source files.	2018-06-13 15:43:09 -04:00
Yann Collet	3050733042	Merge branch 'dev' into negLevels	2018-06-07 15:51:35 -07:00
Yann Collet	a57b4df85f	removed literalCompression directive in this version, literal compression is always disabled for ZSTD_fast strategy. Performance parity between ZSTD_compress_advanced() and ZSTD_compress_generic()	2018-06-07 15:24:12 -07:00
Yann Collet	e5e17d009f	changed member name to workSpaceOversizedDuration	2018-06-06 15:00:27 -07:00
Yann Collet	3d523c741b	added workSpaceTooLarge and workSpaceWasteful also : slightly increased speed of test fuzzer.16	2018-06-05 11:42:48 -07:00
Yann Collet	2108decb41	Fixed a nasty corruption bug recently introduced into the new dictionary mode. The bug could be reproduced with this command : ./zstreamtest -v --opaqueapi --no-big-tests -s4092 -t639 error was in function ZSTD_count_2segments() : the beginning of the 2nd segment corresponds to prefixStart and not the beginning of the current block (istart == src). This would result in comparing the wrong byte.	2018-06-01 18:54:34 -07:00
Yann Collet	463a0fe38b	simplified optimal parser removed "cached" structure. prices are now saved in the optimal table. Primarily done for simplification. Might improve speed by a little. But actually, and surprisingly, also improves ratio in some circumstances.	2018-05-29 14:07:25 -07:00
Yann Collet	f6ad59ab5c	Merge branch 'dev' into staticDictCost	2018-05-24 16:21:02 -07:00
W. Felix Handte	298d24fa57	Make loadedDictEnd an Index, not the Dict Len	2018-05-23 17:53:03 -04:00
W. Felix Handte	3ba70cc759	Clear the Dictionary When Sliding the Window	2018-05-23 17:53:03 -04:00
W. Felix Handte	191fc74a51	Rename 'hasDict' to 'dictMode'	2018-05-23 17:53:03 -04:00
W. Felix Handte	ae4fcf7816	Respond to PR Comments; Formatting/Style/Lint Fixes	2018-05-23 17:53:03 -04:00
W. Felix Handte	b67196f30d	Coalesce hasDictMatchState and extDict Checks into One Enum and Rename Stuff	2018-05-23 17:53:03 -04:00
W. Felix Handte	265c2869d1	Split Wrapper Functions to Cause Inlining	2018-05-23 17:53:03 -04:00
W. Felix Handte	8d24ff0353	Preliminary Support in ZSTD_compressBlock_fast_generic() for Ext Dict Ctx	2018-05-23 17:53:03 -04:00
W. Felix Handte	d18a405779	Refer to the Dictionary Match State In-Place (Sometimes)	2018-05-23 17:53:03 -04:00
Nick Terrell	e3959d5eba	Fixes	2018-05-22 16:06:33 -07:00
Nick Terrell	49cf880513	Approximate FSE encoding costs for selection Estimate the cost for using FSE modes `set_basic`, `set_compressed`, and `set_repeat`, and select the one with the lowest cost. * The cost of `set_basic` is computed using the cross-entropy cost function `ZSTD_crossEntropyCost()`, using the normalized default count and the count. * The cost of `set_repeat` is computed using `FSE_bitCost()`. We check the previous table to see if it is able to represent the distribution. * The cost of `set_compressed` is computed with the entropy cost function `ZSTD_entropyCost()`, together with the cost of writing the normalized count `ZSTD_NCountCost()`.	2018-05-22 14:33:22 -07:00
Yann Collet	a95e9e80d1	adding some debug functions to observe statistics	2018-05-18 14:09:42 -07:00
Yann Collet	8572b4d09f	fixed a pretty complex bug when combining ldm + btultra	2018-05-17 16:13:53 -07:00
Yann Collet	a243020d37	slightly improved weight calculation translating into a tiny compression ratio improvement	2018-05-17 11:19:44 -07:00
Yann Collet	18fc3d3cd5	introduced bit-fractional cost evaluation this improves compression ratio by a tiny amount. It also reduces speed by a small amount. Consequently, bit-fractional evaluation is only turned on for btultra.	2018-05-16 14:53:35 -07:00
Yann Collet	2c26df0e13	opt: removed static prices after testing, it's actually always better to use dynamic prices albeit initialised from dictionary.	2018-05-14 18:04:08 -07:00
Yann Collet	761758982e	replaced FSE_count by FSE_count_simple to reduce usage of stack memory. Also : tweaked a few comments, as suggested by @terrelln	2018-05-11 16:03:37 -07:00
Yann Collet	74b1c75d64	btopt : minor adjustment of update frequencies	2018-05-10 16:32:36 -07:00
Yann Collet	338f738c24	pass entropy tables to optimal parser for proper estimation of symbol's weights when using dictionary compression. Note : using only huffman costs is not good enough, presumably because sequence symbol costs are incorrect.	2018-05-08 15:37:06 -07:00
Yann Collet	a155061328	minor code refactor for readability removed some useless operations from optimal parser (should not change performance, too small a difference)	2018-05-08 12:32:44 -07:00
Nick Terrell	295ab0dbfa	Only load extra table positions for CDicts Zstdmt uses prefixes to load the overlap between segments. Loading extra positions makes compression non-deterministic, depending on the previous job the context was used for. Since loading extra position takes extra time as well, only do it when creating a `ZSTD_CDict`. Fixes #1077.	2018-04-02 14:41:30 -07:00
Yann Collet	a99c4a3621	Merge branch 'dev' into advancedDecompress	2018-03-21 06:08:28 -07:00
Yann Collet	87b0cf05bd	Merge pull request #1057 from facebook/lrmSettings LRM parameters	2018-03-21 05:59:39 -07:00
Yann Collet	6873fec658	changed dictMore for dictContentType which seems clearer to describe what the variable/argument is about.	2018-03-20 15:13:14 -07:00
Yann Collet	6f4d0778a5	make it possible to express compression parameters in any order	2018-03-19 14:41:23 -07:00
Nick Terrell	4af1fafeb8	Restore setting loadedDictEnd Setting `loadedDictEnd` was accidentally removed from `ZSTD_loadDictionaryContent()`, which means that dictionary compression will only be able to reference the parts of the dictionary within the window. The spec allows us to reference the entire dictionary so long as even one byte is in the window. `ZSTD_enforceMaxDist()` incorrectly always allowed offsets up to `loadedDictEnd` beyond the window, even once the dictionary was out of range. When overflow protection kicked in, the check `current > loadedDictEnd + maxDist` is incorrect if `loadedDictEnd` isn't reset back to zero. `current` could be reset below the value, which would incorrectly allow references beyond the window. This bug is present in `master`, but is very hard to trigger, since it requires both dictionaries and data which triggers overflow correction.	2018-03-16 14:54:06 -07:00
Nick Terrell	1908c92c46	Merge remote-tracking branch 'upstream/dev' into extern-seq * upstream/dev: Fix overflow protection with wlog=31	2018-03-14 17:26:31 -07:00
Nick Terrell	a9a6dcba63	Expose reference external sequence API * Expose the reference external sequences API for zstdmt. Allows external sequences of any length, which get split when necessary. * Reset the LDM window when the context is reset. * Store the maximum number of LDM sequences. * Sequence generation now returns the number of last literals. * Fix sequence generation to not throw out the last literals when blocks of more than 1 MB are encountered.	2018-03-14 12:29:31 -07:00
Nick Terrell	33fb966e56	Fix overflow protection with wlog=31 The overflow protection is broken when the window log is `> (3U << 29)`, so 31. It doesn't work when `current` isn't around `1U << windowLog` ahead of `lowLimit`, and the assertion `current > newCurrent` fails. This happens when the same context is used many times over, but with a large window log, like in zstdmt. Fix it by triggering correction based on `nextSrc - base` instead of `lowLimit`. The added test fails before the patch, and passes after.	2018-03-14 11:45:44 -07:00
Yann Collet	a146ee04ae	added negative compression levels negative compression levels trade compression ratio for more compression speed. They turn off huffman compression of literals, and use row 0 as baseline with a stepSize = -cLevel. added associated test in fuzzer also added : new advanced parameter ZSTD_p_literalCompression	2018-03-11 05:21:53 -07:00
Nick Terrell	0a0e64c641	LDM manages its own window round buffer	2018-02-27 12:13:23 -08:00
Nick Terrell	7e5e226cbf	Split the window state into substructure	2018-02-26 13:29:57 -08:00
Nick Terrell	af866b3a58	Split block compressor out of long range matcher * `ZSTD_ldm_generateSequences()` generates the LDM sequences and stores them in a table. It should work with any chunk size, but is currently only called one block at a time. * `ZSTD_ldm_blockCompress()` emits the pre-defined sequences, and instead of encoding the literals directly, it passes them to a secondary block compressor. The code to handle chunk sizes greater than the block size is currently commented out, since it is unused. The next PR will uncomment and exercise this code. * During optimal parsing, ensure LDM `minMatchLength` is at least `targetLength`. Also don't emit repcode matches in the LDM block compressor. Enabling the LDM with the optimal parser now actually improves the compression ratio. * The compression ratio is very similar to before. It is very slightly different, because the repcode handling is slightly different. If I remove immediate repcode checking in both branches the compressed size is exactly the same. * The speed looks to be the same or better than before. Up Next (in a separate PR) -------------------------- Allow sequence generation to happen prior to compression, and produce more than a block worth of sequences. Expose some API for zstdmt to consume. This will test out some currently untested code in `ZSTD_ldm_blockCompress()`.	2018-02-22 15:18:41 -08:00
Nick Terrell	6e128d3534	[BMI2] Add comments to the bmi2 variable in the contexts	2018-02-20 14:12:11 -08:00
Nick Terrell	b58f01537e	[compress] Support BMI2	2018-02-14 19:20:32 -08:00
Yann Collet	9945e60ac4	Merge branch 'dev' into flexibleLevel	2018-02-10 11:54:49 -08:00
Yann Collet	de68c2ff10	Merged ZSTD_preserveUnsortedMark() into ZSTD_reduceIndex() as it's faster, due to one memory scan instead of two (confirmed by microbenchmark). Note : as ZSTD_reduceIndex() is rarely invoked, it does not translate into a visible gain. Consider it an exercise in auto-vectorization and micro-benchmarking.	2018-02-07 14:22:35 -08:00
Yann Collet	5188749e1c	ensure compression parameters are updated when only compression level is changed	2018-02-02 16:31:20 -08:00
Yann Collet	90eca318a7	fileio: create dedicated function to generate zstd frames like other formats	2018-02-02 14:24:56 -08:00
Yann Collet	209df52ba2	Changed nbThreads for nbWorkers This makes it easier to explain that nbWorkers=0 --> single-threaded mode, while nbWorkers=1 --> asynchronous mode (one more thread on top of the "main" caller thread). No need for an additional asynchronous mode flag. nbWorkers>=2 works the same as nbThreads>=2 previously.	2018-02-01 19:29:30 -08:00
Yann Collet	a1d4041e69	zstdmt: removed job->jobCompleted replaced by equivalent signal job->consumer == job->srcSize. created additional functions ZSTD_writeLastEmptyBlock() and ZSTDMT_writeLastEmptyBlock() required when it's necessary to finish a frame with a last empty job, to create an "end of frame" marker. It avoids creating a job with srcSize==0.	2018-01-25 17:35:49 -08:00
Yann Collet	c7190c69cc	fixes for @terrelln comments	2018-01-18 11:15:23 -08:00
Yann Collet	394eec697b	Introduce ZSTD_getFrameProgression() Produces 3 statistics for ongoing frame compression : - ingested - consumed (effectively compressed) - produced Ingested can be larger than consumed due to buffering effect. For the time being, this patch mostly fixes the % ratio issue, since it computes consumed / produced, instead of ingested / produced. That being said, update is not "smooth", because on a slow enough setting, fileio spends most of its time waiting for a worker to complete its job. This could be improved thanks to more granular flushing i.e. start flushing before ongoing job is fully completed.	2018-01-17 16:39:02 -08:00
Yann Collet	1dba98d563	introduced parameter ZSTD_p_nonBlockingMode This new parameter makes it possible to call streaming ZSTDMT with a single thread set which is non blocking. It makes it possible for the main thread to do other tasks in parallel while the worker thread does compression. Typically, for zstd cli, it means it can do I/O stuff. Applied within fileio.c, this patch provides non-negligible gains during compression. Tested on my laptop, with enwik9 (1000000000 bytes) : time zstd -f enwik9 With traditional single-thread blocking mode : real 0m9.557s user 0m8.861s sys 0m0.538s With new single-worker non blocking mode : real 0m7.938s user 0m8.049s sys 0m0.514s => 20% faster	2018-01-16 16:15:47 -08:00
Nick Terrell	aae267a2e1	Reorganize block state	2018-01-16 11:17:50 -08:00
Nick Terrell	887cd4e35e	Split ZSTD_CCtx into smaller sub-structures	2018-01-16 11:17:50 -08:00
Yann Collet	e28305fcca	fix #944 : ZSTDMT with large files and dictionary now works correctly windowLog is now enforced from provided compression parameters, instead of being copied blindly from `cdict` where it could be smaller. also : - fix a minor bug in zstreamtest --mt : advanced parameters must be set before init - changed advanced parameter name to ZSTDMT_jobSize	2017-12-12 18:04:58 -08:00
Yann Collet	c173dbd6e7	no longer supported starting C++17	2017-12-04 18:00:53 -08:00
Yann Collet	0a0a212934	zstd_opt: changed cost formula There was a flaw in the formula which compared literal cost with match cost : at a given position, a non-null literal suite is going to be part of next sequence, while if position ends a previous match, to immediately start another match, next sequence will have a litlength of zero. A litlength of zero has a non-null cost. It follows that literals cost should be compared to match cost + litlength==0. Not doing so gave a structural advantage to matches, which would be selected more often. I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating. It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and unfortunately, any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic. Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...). It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants being in mind). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.	2017-11-28 14:07:03 -08:00
Yann Collet	b71405dc51	removed a bunch of code related to cached literal price optState was used both to evaluate price and to cache cost of previously calculated literals. This created a strong dependency, forcing parser to request cost in a strict order. This limitation forbids future parsers with skipping capabilities. After this patch, caching literals price still exists, but is now explicit, in a stack structure.	2017-11-28 12:32:24 -08:00
Yann Collet	3f457264d1	slightly improved compression speed	2017-11-19 14:40:21 -08:00
Yann Collet	e717a5b0dd	zstd_opt: minor speed optimization Calculate reference log2sums only once per series of sequences (as opposed to once per sequence) Also: improved code comments	2017-11-18 16:24:02 -08:00
Yann Collet	05dffe43a7	Fixed Btree update ZSTD_updateTree() expected to be followed by a Bt match finder, which would update zc->nextToUpdate. With the new optimal match finder, it's not necessarily the case : a match might be found during repcode or hash3, and stops there because it reaches sufficient_len, without even entering the binary tree. Previous policy was to nonetheless update zc->nextToUpdate, but the current position would not be inserted, creating "holes" in the btree, aka positions that will no longer be searched. Now, when current position is not inserted, zc->nextToUpdate is not updated, expecting ZSTD_updateTree() to fill the tree later on. Solution selected is that ZSTD_updateTree() takes care of properly setting zc->nextToUpdate, so that it no longer depends on a future function to do this job. It took time to get there, as the issue started with a memory sanitizer error. The problem would have been easier to spot with a proper `assert()`. So this patch adds a few of them. Additionally, I discovered that `make test` does not enable `assert()` during CLI tests. This patch enables them. Unfortunately, these `assert()` triggered other (unrelated) bugs during CLI tests, mostly within zstdmt. So this patch also fixes them. - Changed packed structure for gcc memory access : memory sanitizer would complain that a read "might" reach out-of-bound position on the ground that the `union` is larger than the type accessed. Now, to avoid this issue, each type is independent. - ZSTD_CCtxParams_setParameter() : @return provides the value of parameter, clamped/fixed appropriately. - ZSTDMT : changed constant name to ZSTDMT_JOBSIZE_MIN - ZSTDMT : multithreading is automatically disabled when srcSize <= ZSTDMT_JOBSIZE_MIN, since only one thread will be used in this case (saves memory and runtime). - ZSTDMT : nbThreads is automatically clamped on setting the value.	2017-11-16 12:18:56 -08:00
Yann Collet	046ea53bef	still fighting data corruption due to messed up tree. Seems to happen when reaching end of buffer.	2017-11-15 11:29:24 -08:00
Yann Collet	4202b2e8a6	merged rep search into btMatchSearch but there is a tree corruption somewhere ... bug hunt ongoing	2017-11-14 20:38:52 -08:00
Yann Collet	9a11f70dc3	merged repcode search into BT match search this version has same speed as branch `opt` which is itself 5-10% slower than branch `dev` (no identified reason) It does not compress exactly the same as `opt` or `dev`, maybe because it doesn't stop search after repcodes, leading to sometimes better compression, sometimes worse (by a small margin). warning : _extDict path does not work for the time being This means that benchmark module works, but file module will fail with large files (and high compression level). Objective is to fuse _extDict path into current one, in order to have a single parser to maintain.	2017-11-13 02:23:48 -08:00
Yann Collet	100d8ad6be	lib/compress: created ZSTD_LLcode() and ZSTD_MLcode() transform length into code. Since transformation is needed in several places throughout the code, better write the logic in one place.	2017-11-08 12:43:05 -08:00
Yann Collet	5aa0352742	zstd_opt: simplified ZSTD_getPrice() and ZSTD_updatePrice() interface ZSTD_getPrice() and ZSTD_updatePrice() accept normal matchlength as argument instead of matchlength-MINMATCH, which makes them easier / more logical to use and read. Conversion is simply done internally.	2017-11-08 12:23:27 -08:00
Yann Collet	4191efa993	zstd_opt: ensure sufficient_len < ZSTD_OPT_NUM to simplify some tests	2017-11-08 11:24:00 -08:00
Yann Collet	ee441d5d2b	renamed zstd_compress.h into zstd_compress_internal.h to emphasize the fact that all definitions it contains must remain private, across lib/compress modules.	2017-11-07 16:15:23 -08:00
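
Note on commit ededcfca57 (U32 vs unsigned): the message points out that `%u` requires an `unsigned` argument, while `U32` may be `unsigned long` on some 32-bit targets. A minimal sketch of the kind of cast this implies is shown here. The `U32` typedef mirrors the alias named in the commit; `format_count()` is a hypothetical helper for illustration only and is not part of zstd. It assumes `unsigned` is at least 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t U32;  /* assumption: stands in for the U32 alias mentioned in the commit */

/* Hypothetical helper (not a zstd function): formats a U32 value with %u.
 * The explicit cast to unsigned keeps the printf-style call well-defined
 * even where uint32_t is `unsigned long` instead of `unsigned`. */
static int format_count(char* dst, size_t dstCapacity, U32 count)
{
    assert(dst != NULL);
    return snprintf(dst, dstCapacity, "count=%u", (unsigned)count);
}
```

Calling `format_count(buf, sizeof buf, 4092)` writes `count=4092` into `buf`. Without the cast, passing a `U32` straight to `%u` only works by accident on platforms where the two types happen to coincide.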

131 Commits