AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Nick Terrell	7e103cdaf5	[threading] Cast unused arguments to void	2018-03-06 18:36:40 -08:00
Yann Collet	d02b44cf55	DYNAMIC_BMI2 enabled for clang: clang only claims compatibility with gcc 4.2. Consequently, the recent patch which reserved DYNAMIC_BMI2 for gcc >= 4.8 also disabled it for clang. fix : __clang__ is now enough to enable DYNAMIC_BMI2 (associated with other existing conditions : x64/x64, !bmi2)	2018-03-04 16:05:59 -08:00
Yann Collet	45b09e7625	limit DYNAMIC_BMI2 to gcc >= 4.8: attribute bmi2 not supported by gcc 4.4	2018-03-01 15:02:18 -08:00
Yann Collet	89741653ab	added error code workSpace_tooSmall	2018-02-26 15:11:50 -08:00
Yann Collet	6cdf690441	minor cleaning of huff0 Update code documentation, and properly names a few "magic constants". Also, HUF_compress_internal() gets a cleaner way to determine size of tables inside workspace.	2018-02-26 14:52:23 -08:00
Nick Terrell	af866b3a58	Split block compressor out of long range matcher * `ZSTD_ldm_generateSequences()` generates the LDM sequences and stores them in a table. It should work with any chunk size, but is currently only called one block at a time. * `ZSTD_ldm_blockCompress()` emits the pre-defined sequences, and instead of encoding the literals directly, it passes them to a secondary block compressor. The code to handle chunk sizes greater than the block size is currently commented out, since it is unused. The next PR will uncomment and exercise this code. * During optimal parsing, ensure LDM `minMatchLength` is at least `targetLength`. Also don't emit repcode matches in the LDM block compressor. Enabling the LDM with the optimal parser now actually improves the compression ratio. * The compression ratio is very similar to before. It is very slightly different, because the repcode handling is slightly different. If I remove immediate repcode checking in both branches the compressed size is exactly the same. * The speed looks to be the same or better than before. Up Next (in a separate PR) -------------------------- Allow sequence generation to happen prior to compression, and produce more than a block worth of sequences. Expose some API for zstdmt to consume. This will test out some currently untested code in `ZSTD_ldm_blockCompress()`.	2018-02-22 15:18:41 -08:00
Yann Collet	010ba5f71f	Merge pull request #1017 from terrelln/c-bmi2 [compress] Support BMI2	2018-02-20 15:34:59 -08:00
Yann Collet	70163bf0d3	added clarification comments in zstd_errors.h answering some points in #1018	2018-02-20 12:54:49 -08:00
Nick Terrell	b58f01537e	[compress] Support BMI2	2018-02-14 19:20:32 -08:00
Nick Terrell	4319132312	[decompress] Support BMI2	2018-02-13 17:00:15 -08:00
Yann Collet	95424409ea	addBits and baseline into FSE decoding table note : unfinished - need new default tables - need modify long mode	2018-02-09 04:25:15 -08:00
Yann Collet	0170cf9a7a	minor : modified ZSTD_preserveUnsortedMark() to be more vectorization friendly	2018-02-05 11:46:02 -08:00
Yann Collet	997e4d0ccd	added POOL_tryAdd()	2018-01-18 14:39:51 -08:00
Nick Terrell	887cd4e35e	Split ZSTD_CCtx into smaller sub-structures	2018-01-16 11:17:50 -08:00
Yann Collet	e8093dde09	fixed #304 Pathological samples may result in literal section being incompressible. This case is now detected, and literal distribution is replaced by one that can be written into the dictionary.	2018-01-11 11:16:32 -08:00
Yann Collet	f299fa39ac	fix a subtle issue in continue mode The deep fuzzer tests caught a subtle bug that was probably there for a long time. The impact of the bug is not a crash, or any other clear error signal, rather, it reduces performance, by cutting data into smaller blocks. Eventually, the following test would fail because it produces too many 1-byte blocks, requiring more space than buffer can provide : `./zstreamtest_asan --mt -s3514 -t1678312 -i1678314` The root scenario is as follows : - Create context, initialize it using explicit parameters or a `cdict` to pin them down, set `pledgedSrcSize=1` - The compression parameters will not be adapted, but `windowSize` and `blockSize` will be automatically set to `1`. `windowSize` and `blockSize` are dynamic values, set within `ZSTD_resetCCtx_internal()`. The automatic adaptation makes it possible to generate smaller contexts for smaller input sizes. - Complete compression - New compression with same context, using same parameters, but `pledgedSrcSize=ZSTD_CONTENTSIZE_UNKNOWN` triggers "continue mode" - Continue mode doesn't modify blockSize, because it used to depend on `windowLog` only, but in fact, it also depends on `pledgedSrcSize`. - The "old" blocksize (1) is still there, next compression will use this value to cut input into blocks, resulting in more blocks and worse performance than necessary. Given the scenario, and its possible variants, I'm surprised it did not show up before. But I suspect it did show up, it's just that it never triggered an error, because "worse performance" is not a trigger. The above test is a special corner case, where performance is so impacted that it reaches an error case. The fix works, but I'm not completely pleased. I think the current code relies too much on implied relations between variables. This will likely break again in the future when some related part of the code changes. Unfortunately, no time to make larger changes if we want to keep the release target for zstd v1.3.3. So a longer term fix will have to be considered after the release. To do : create a reliable test case which triggers this scenario for CI tests.	2017-12-19 09:43:03 +01:00
Yann Collet	c173dbd6e7	no longer supported starting C++17	2017-12-04 18:00:53 -08:00
Yann Collet	0a0a212934	zstd_opt: changed cost formula There was a flaw in the formula which compared literal cost with match cost : at a given position, a non-null literal suite is going to be part of next sequence, while if position ends a previous match, to immediately start another match, next sequence will have a litlength of zero. A litlength of zero has a non-null cost. It follows that literals cost should be compared to match cost + litlength==0. Not doing so gave a structural advantage to matches, which would be selected more often. I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating. It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and unfortunately, any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic. Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...). It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants being in mind). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.	2017-11-28 14:07:03 -08:00
Yann Collet	cdade555ee	fixed one UB pointer arithmetic	2017-11-17 11:40:08 -08:00
Yann Collet	05dffe43a7	Fixed Btree update ZSTD_updateTree() expected to be followed by a Bt match finder, which would update zc->nextToUpdate. With the new optimal match finder, it's not necessarily the case : a match might be found during repcode or hash3, and stops there because it reaches sufficient_len, without even entering the binary tree. Previous policy was to nonetheless update zc->nextToUpdate, but the current position would not be inserted, creating "holes" in the btree, aka positions that will no longer be searched. Now, when current position is not inserted, zc->nextToUpdate is not updated, expecting ZSTD_updateTree() to fill the tree later on. Solution selected is that ZSTD_updateTree() takes care of properly setting zc->nextToUpdate, so that it no longer depends on a future function to do this job. It took time to get there, as the issue started with a memory sanitizer error. The problem would have been easier to spot with a proper `assert()`. So this patch adds a few of them. Additionally, I discovered that `make test` does not enable `assert()` during CLI tests. This patch enables them. Unfortunately, these `assert()` triggered other (unrelated) bugs during CLI tests, mostly within zstdmt. So this patch also fixes them. - Changed packed structure for gcc memory access : memory sanitizer would complain that a read "might" reach out-of-bound position on the ground that the `union` is larger than the type accessed. Now, to avoid this issue, each type is independent. - ZSTD_CCtxParams_setParameter() : @return provides the value of parameter, clamped/fixed appropriately. - ZSTDMT : changed constant name to ZSTDMT_JOBSIZE_MIN - ZSTDMT : multithreading is automatically disabled when srcSize <= ZSTDMT_JOBSIZE_MIN, since only one thread will be used in this case (saves memory and runtime). - ZSTDMT : nbThreads is automatically clamped on setting the value.	2017-11-16 12:18:56 -08:00
Yann Collet	4202b2e8a6	merged rep search into btMatchSearch but there is a tree corruption somewhere ... bug hunt ongoing	2017-11-14 20:38:52 -08:00
Yann Collet	9a11f70dc3	merged repcode search into BT match search this version has same speed as branch `opt` which is itself 5-10% slower than branch `dev` (no identified reason) It does not compress exactly the same as `opt` or `dev`, maybe because it doesn't stop search after repcodes, leading to sometimes better compression, sometimes worse (by a small margin). warning : _extDict path does not work for the time being This means that benchmark module works, but file module will fail with large files (and high compression level). Objective is to fuse _extDict path into current one, in order to have a single parser to maintain.	2017-11-13 02:23:48 -08:00
Yann Collet	4191efa993	zstd_opt: ensure sufficient_len < ZSTD_OPT_NUM to simplify some tests	2017-11-08 11:24:00 -08:00
Yann Collet	8b6aecf2cb	moved a few structures from `zstd_internal.h` to `zstd_compress.h` which is a more precise scope	2017-11-07 16:03:14 -08:00
Yann Collet	61e5a1adfc	removed direct call to malloc() from pool.c	2017-10-31 17:43:24 -07:00
Nick Terrell	a86a7097ec	Ensure dictionary Huff table can encode any symbol * Ensure that the dictionary Huffman CTable has maxSymbolValue 255. * Fix a stack buffer overflow during compression dictionary loading.	2017-10-03 13:22:13 -07:00
Yann Collet	ee1ed78fcb	fix proper naming on FSE_createCTable() arguments in fse.h	2017-09-30 11:08:50 -07:00
Yann Collet	86b4fe5b45	adjustCParams : restored previous behavior: unknown srcSize is presumed small if there is a dictionary (dictSize>0) and presumed large otherwise.	2017-09-28 18:14:28 -07:00
Yann Collet	54a827fff0	Merge branch 'dev' into newFormats Fixed conflicts in zstdmt_compress.c	2017-09-27 16:39:40 -07:00
Nick Terrell	6c41adfb28	[libzstd] pthread function prefixed with ZSTD_ * `sed -i 's/pthread_/ZSTD_pthread_/g' lib/{,common,compress,decompress,dictBuilder}/.[hc]` Fix up `lib/common/threading.[hc]` * `sed -i s/PTHREAD_MUTEX_LOCK/ZSTD_PTHREAD_MUTEX_LOCK/g lib/compress/zstdmt_compress.c`	2017-09-27 11:48:48 -07:00
Yann Collet	9416195221	changed error code when pos<=size condition is not respected Now pointing towards src_size or dst_size, instead of error_GENERIC.	2017-09-27 10:35:56 -07:00
Yann Collet	df4e9bba25	fixed constant errors for gcc in c99 mode C standard does not consider a `static const int` as a constant. This is a problem for initializer, and ZSTD_STATIC_ASSERT(). Replaced by macro values	2017-09-26 14:31:06 -07:00
Yann Collet	9f0b8dfbe9	Merge branch 'dev' into newFormats	2017-09-26 14:22:39 -07:00
Nick Terrell	c233bdbaee	Increase maximum window size * Maximum window size in 32-bit mode is 1GB, since allocations for 2GB fail on my Mac. * Maximum window size in 64-bit mode is 2GB, since that is the largest power of 2 that works with the overflow prevention. * Allow `--long=windowLog` to set the window log, along with `--zstd=wlog=#`. These options also set the window size during decompression, but don't override `--memory=#` if it is set. * Present a helpful error message when the window size is too large during decompression. * The long range matcher defaults to a hash log 7 less than the window log, which keeps it at 20 for window log 27. * Keep the default long range matcher window size and the default maximum window size at 27 for the API and CLI. * Add tests that use the maximum window size and hash size for compression and decompression.	2017-09-26 14:00:01 -07:00
Yann Collet	5d8fdd1641	Merge pull request #855 from terrelln/maxoff [libzstd] Increase MaxOff	2017-09-25 16:34:29 -07:00
Yann Collet	b8d4a3887f	introduced constant ZSTD_frameIdSize within zstd_internal.h This is the size of magic number. Avoids using `4` directly in source code, which is a bit less meaningful.	2017-09-25 15:26:18 -07:00
Nick Terrell	bbe77212ef	[libzstd] Increase MaxOff	2017-09-25 13:36:18 -07:00
Yann Collet	7c3dea42ce	added prototypes for advanced parameters for decompression API required to decode custom formats	2017-09-24 15:57:29 -07:00
Nick Terrell	74718d7e43	[bitstream] Allow adding 31 bits at a time	2017-09-19 13:57:33 -07:00
Stella Lau	eb3327c10a	Merge branch 'dev' of https://github.com/facebook/zstd into ldm-mergeDev	2017-09-11 15:00:01 -07:00
Yann Collet	3128e03be6	updated license header to clarify dual-license meaning as "or"	2017-09-08 00:09:23 -07:00
Stella Lau	eeff55dfa8	Merge remote-tracking branch 'upstream/dev' into ldm-mergeDev	2017-09-06 15:56:32 -07:00
Nick Terrell	423b133568	[POOL] Allow free on NULL when multithreading is disabled	2017-09-05 11:18:13 -07:00
Stella Lau	67d4a6161c	Add ldmBucketSizeLog param	2017-09-02 21:55:29 -07:00
Stella Lau	a1f04d518d	Move hashEveryLog to cctxParams and update cli	2017-09-01 15:05:47 -07:00
Stella Lau	767a0b3be1	Move ldm hashLog, bucketLog, and mml to cctxParams	2017-09-01 12:24:59 -07:00
Stella Lau	17d8e0bdcc	Merge remote-tracking branch 'upstream/longRangeMatcher' into ldm-integrate	2017-09-01 10:19:38 -07:00
Stella Lau	8081becadc	Add long distance matching as a CCtxParam	2017-09-01 09:18:58 -07:00
Yann Collet	d963daa6a9	fixed minor warning (empty translation unit)	2017-09-01 00:12:07 -07:00
Yann Collet	d7ad99b2ab	Merge branch 'longRangeMatcher' into dev	2017-08-31 18:08:37 -07:00
Stella Lau	6a546efb8c	Add long distance matcher Move last literals section to ZSTD_block_internal	2017-08-31 12:53:19 -07:00
Yann Collet	e21384fffb	fixed more file headers after license change (#825)	2017-08-31 12:11:57 -07:00
Yann Collet	e9dc204f42	fixed a bunch of headers after license change (#825)	2017-08-31 11:24:54 -07:00
Stella Lau	ee65701720	Minor fixes; remove formatting only changes	2017-08-29 20:27:35 -07:00
Stella Lau	c7a18b7c21	Localize 'dictMode' from cctx to function param	2017-08-29 15:52:24 -07:00
Nick Terrell	9822f97721	[error] Don't guard undef X with ifdef X	2017-08-29 11:54:38 -07:00
Nick Terrell	02033be08c	[pool] Visual Studios disallows empty structs	2017-08-28 17:19:01 -07:00
Nick Terrell	7c365eb02c	[threading] Fix ERROR macro after including windows.h	2017-08-28 16:25:02 -07:00
Stella Lau	024098a47d	Fix parameter retrieval from cdict	2017-08-25 17:58:28 -07:00
Stella Lau	2adde898c8	Fix typo with ZSTDMT_parameter	2017-08-25 16:13:40 -07:00
Stella Lau	eb7bbab36a	Remove ZSTD_p_refDictContent and dictContentByRef	2017-08-25 11:11:45 -07:00
Nick Terrell	de6c6bce85	Fix zstd_internal.h for C++ mode	2017-08-24 18:09:50 -07:00
Nick Terrell	26dc040a7b	[pool] Accept custom allocators	2017-08-24 17:01:41 -07:00
Nick Terrell	89dc856cae	[pool] Fix formatting	2017-08-24 16:48:32 -07:00
Stella Lau	5bc2c1e982	Add prototype support for customMem with cctxParams	2017-08-23 12:03:30 -07:00
Stella Lau	6f1a21c7e9	Remove formatting-only changes	2017-08-23 10:24:19 -07:00
Stella Lau	23fc0e41fa	Remove 'opaque' naming from internal functions	2017-08-22 14:24:47 -07:00
Stella Lau	8fd1636776	Remove unused functions	2017-08-22 13:33:58 -07:00
Stella Lau	e50ed1fa3a	Fix undefined behavior when srcSize==1	2017-08-22 11:55:42 -07:00
Stella Lau	5b956f4753	Comment out CCtx_param versions of CDict functions	2017-08-21 14:49:16 -07:00
Stella Lau	502031ca10	Use cctxParam version of createCDict internally	2017-08-21 11:00:44 -07:00
Stella Lau	91b30dbe84	Remove test parameter	2017-08-21 10:09:06 -07:00
Stella Lau	f181f33bdf	Disable tests and refactor	2017-08-21 01:59:08 -07:00
Stella Lau	023b24e6d4	Add cctx param tests	2017-08-20 22:55:07 -07:00
Yann Collet	7db552676e	reduced pool queue to 0 to save memory fixed : pool performance when jobs are fired fast and queueSize==0	2017-08-19 15:07:54 -07:00
Stella Lau	d775519296	Add cctxParam versions of internal functions	2017-08-18 17:37:58 -07:00
Yann Collet	32fb407c9d	updated a bunch of headers for the new license	2017-08-18 16:52:05 -07:00
Stella Lau	399ae013d4	Add function to apply cctx params	2017-08-18 13:01:55 -07:00
Stella Lau	81d89d82a6	Move nbThreads to cctx params	2017-08-18 12:08:57 -07:00
Stella Lau	2300c58a6f	Move dictContentByRef to cctx params	2017-08-18 12:03:16 -07:00
Stella Lau	b6cb2ed8cb	Move dictMode to cctxParams	2017-08-18 11:43:31 -07:00
Stella Lau	c0221124d5	Add function to set opaque parameters	2017-08-17 19:30:22 -07:00
Stella Lau	699f11b4f7	Create opaque parameter structure	2017-08-17 17:33:46 -07:00
Yann Collet	f9e6590715	Merge pull request #796 from terrelln/is-error [FSE][HUF] Inline error checks	2017-08-15 12:37:28 -07:00
Nick Terrell	07c6ff588e	[FSE][HUF] Inline error checks Caught by Clang's optimization remarks.	2017-08-15 11:23:28 -07:00
Nick Terrell	565e925eb7	[libzstd] Fix FORCE_INLINE macro	2017-08-14 21:12:05 -07:00
Stella Lau	73ba58955f	Signal after finishing job when queueSize=0	2017-08-01 20:12:06 -07:00
Stella Lau	1d76da1d87	Replace marker with queueEmpty variable and update pool.h comment	2017-08-01 12:30:16 -07:00
Stella Lau	5adceeed01	Allow queueSize=0 in pool.c and update poolTests	2017-07-31 10:10:16 -07:00
Yann Collet	a90b16e150	Visual blind fix 2	2017-07-20 15:57:55 -07:00
Yann Collet	b4d460f32c	pool.c : blindfix for Visual warnings	2017-07-20 01:13:14 -07:00
Yann Collet	3974d2b38a	blind fix for Windows Multithreading module adds a fake 0 return value for mutex/cond init	2017-07-19 13:33:21 -07:00
Yann Collet	b71363b967	check pthread_*_init() success condition	2017-07-19 01:05:40 -07:00
Yann Collet	77d67fb167	Merge pull request #766 from terrelln/real-block-split [libzstd] Pull optimal parser state out of seqStore_t	2017-07-18 08:26:24 -07:00
Yann Collet	14c83b05c7	Merge pull request #765 from terrelln/real-block-split [libzstd] Remove ZSTD_CCtx* argument of ZSTD_compressSequences()	2017-07-17 19:25:55 -07:00
Nick Terrell	7a28b9e4a3	[libzstd] Pull optimal parser state out of seqStore_t	2017-07-17 15:29:11 -07:00
Yann Collet	3381bf4b84	Merge pull request #764 from terrelln/real-block-split [libzstd] Refactor ZSTD_compressSequences()	2017-07-17 14:46:01 -07:00
Nick Terrell	e198230645	[libzstd] Remove ZSTD_CCtx* argument of ZSTD_compressSequences()	2017-07-17 12:27:24 -07:00
Yann Collet	3b0cff3c33	fixed clang's -Wdocumentation	2017-07-13 18:58:30 -07:00
Yann Collet	2bd6440be0	pinned down error code enum values Note : all error codes are changed by this new version, but it's expected to be the last change for existing codes. Codes are now grouped by category, and receive a manually attributed value. The objective is to guarantee that error code values will not change in the future when introducing new codes. Intentional empty spaces and ranges are defined in order to keep room for potential new codes.	2017-07-13 17:12:16 -07:00
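
The last entry above (2bd6440be0) describes pinning error code enum values, grouped by category with gaps left for future codes. A minimal sketch of that idea in C follows. The `DEMO_` names and every numeric value are hypothetical, chosen only for illustration; they are not taken from `zstd_errors.h`.

```c
#include <assert.h>

/* Hypothetical error codes: names and values are illustrative only,
 * NOT the actual contents of zstd_errors.h.
 * Each code gets an explicit value, so inserting a new code later
 * cannot silently renumber the existing ones. */
typedef enum {
    DEMO_error_no_error              = 0,
    DEMO_error_generic               = 1,
    /* 2-9 intentionally left empty for future generic codes */
    DEMO_error_prefix_unknown        = 10,
    DEMO_error_version_unsupported   = 12,
    /* 13-39 intentionally left empty for future format codes */
    DEMO_error_parameter_unsupported = 40,
    DEMO_error_parameter_outOfBound  = 42,
    /* 43-69 intentionally left empty for future parameter codes */
    DEMO_error_dstSize_tooSmall      = 70,
    DEMO_error_srcSize_wrong         = 72,
    DEMO_error_maxCode               = 120  /* upper bound, not a real error */
} DEMO_ErrorCode;

/* Map a code to a short name; values that fall into a reserved gap
 * are reported as "unspecified". */
const char* demo_error_name(DEMO_ErrorCode code)
{
    switch (code) {
    case DEMO_error_no_error:              return "no_error";
    case DEMO_error_generic:               return "generic";
    case DEMO_error_prefix_unknown:        return "prefix_unknown";
    case DEMO_error_version_unsupported:   return "version_unsupported";
    case DEMO_error_parameter_unsupported: return "parameter_unsupported";
    case DEMO_error_parameter_outOfBound:  return "parameter_outOfBound";
    case DEMO_error_dstSize_tooSmall:      return "dstSize_tooSmall";
    case DEMO_error_srcSize_wrong:         return "srcSize_wrong";
    default:                               return "unspecified";
    }
}

/* Sanity check that the pinned values are what callers may have stored. */
void demo_check_pinned_values(void)
{
    assert(DEMO_error_generic == 1);
    assert(DEMO_error_prefix_unknown == 10);
    assert(DEMO_error_dstSize_tooSmall == 70);
}
```

With explicit values, a new code can be placed in a reserved gap (for example a hypothetical `DEMO_error_parameter_conflict = 44`) without shifting any existing code, which is the guarantee the commit message aims for.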

422 Commits