AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Yann Collet	c173dbd6e7	no longer supported starting C++17	2017-12-04 18:00:53 -08:00
Yann Collet	0a0a212934	zstd_opt: changed cost formula There was a flaw in the formula which compared literal cost with match cost : at a given position, a non-null literal suite is going to be part of next sequence, while if position ends a previous match, to immediately start another match, next sequence will have a litlength of zero. A litlength of zero has a non-null cost. It follows that literals cost should be compared to match cost + litlength==0. Not doing so gave a structural advantage to matches, which would be selected more often. I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating. It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and unfortunately, any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic. Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...). It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants being in mind). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.	2017-11-28 14:07:03 -08:00
Yann Collet	cdade555ee	fixed one UB pointer arithmetic	2017-11-17 11:40:08 -08:00
Yann Collet	05dffe43a7	Fixed Btree update ZSTD_updateTree() expected to be followed by a Bt match finder, which would update zc->nextToUpdate. With the new optimal match finder, it's not necessarily the case : a match might be found during repcode or hash3, and stops there because it reaches sufficient_len, without even entering the binary tree. Previous policy was to nonetheless update zc->nextToUpdate, but the current position would not be inserted, creating "holes" in the btree, aka positions that will no longer be searched. Now, when current position is not inserted, zc->nextToUpdate is not updated, expecting ZSTD_updateTree() to fill the tree later on. Solution selected is that ZSTD_updateTree() takes care of properly setting zc->nextToUpdate, so that it no longer depends on a future function to do this job. It took time to get there, as the issue started with a memory sanitizer error. The problem would have been easier to spot with a proper `assert()`. So this patch adds a few of them. Additionally, I discovered that `make test` does not enable `assert()` during CLI tests. This patch enables them. Unfortunately, these `assert()` triggered other (unrelated) bugs during CLI tests, mostly within zstdmt. So this patch also fixes them. - Changed packed structure for gcc memory access : memory sanitizer would complain that a read "might" reach out-of-bound position on the grounds that the `union` is larger than the type accessed. Now, to avoid this issue, each type is independent. - ZSTD_CCtxParams_setParameter() : @return provides the value of parameter, clamped/fixed appropriately. - ZSTDMT : changed constant name to ZSTDMT_JOBSIZE_MIN - ZSTDMT : multithreading is automatically disabled when srcSize <= ZSTDMT_JOBSIZE_MIN, since only one thread will be used in this case (saves memory and runtime). - ZSTDMT : nbThreads is automatically clamped on setting the value.	2017-11-16 12:18:56 -08:00
Yann Collet	4202b2e8a6	merged rep search into btMatchSearch but there is a tree corruption somewhere ... bug hunt ongoing	2017-11-14 20:38:52 -08:00
Yann Collet	9a11f70dc3	merged repcode search into BT match search this version has same speed as branch `opt` which is itself 5-10% slower than branch `dev` (no identified reason) It does not compress exactly the same as `opt` or `dev`, maybe because it doesn't stop search after repcodes, leading to sometimes better compression, sometimes worse (by a small margin). warning : _extDict path does not work for the time being This means that benchmark module works, but file module will fail with large files (and high compression level). Objective is to fuse _extDict path into current one, in order to have a single parser to maintain.	2017-11-13 02:23:48 -08:00
Yann Collet	4191efa993	zstd_opt: ensure sufficient_len < ZSTD_OPT_NUM to simplify some tests	2017-11-08 11:24:00 -08:00
Yann Collet	8b6aecf2cb	moved a few structures from `zstd_internal.h` to `zstd_compress.h` which is a more precise scope	2017-11-07 16:03:14 -08:00
Yann Collet	61e5a1adfc	removed direct call to malloc() from pool.c	2017-10-31 17:43:24 -07:00
Nick Terrell	a86a7097ec	Ensure dictionary Huff table can encode any symbol * Ensure that the dictionary Huffman CTable has maxSymbolValue 255. * Fix a stack buffer overflow during compression dictionary loading.	2017-10-03 13:22:13 -07:00
Yann Collet	ee1ed78fcb	fix proper naming on FSE_createCTable() arguments in fse.h	2017-09-30 11:08:50 -07:00
Yann Collet	86b4fe5b45	adjustCParams : restored previous behavior. An unknown srcSize is presumed small if there is a dictionary (dictSize>0) and presumed large otherwise.	2017-09-28 18:14:28 -07:00
Yann Collet	54a827fff0	Merge branch 'dev' into newFormats Fixed conflicts in zstdmt_compress.c	2017-09-27 16:39:40 -07:00
Nick Terrell	6c41adfb28	[libzstd] pthread function prefixed with ZSTD_ * `sed -i 's/pthread_/ZSTD_pthread_/g' lib/{,common,compress,decompress,dictBuilder}/.[hc]` Fix up `lib/common/threading.[hc]` * `sed -i s/PTHREAD_MUTEX_LOCK/ZSTD_PTHREAD_MUTEX_LOCK/g lib/compress/zstdmt_compress.c`	2017-09-27 11:48:48 -07:00
Yann Collet	9416195221	changed error code when pos<=size condition is not respected Now pointing towards src_size or dst_size, instead of error_GENERIC.	2017-09-27 10:35:56 -07:00
Yann Collet	df4e9bba25	fixed constant errors for gcc in c99 mode C standard does not consider a `static const int` as a constant. This is a problem for initializer, and ZSTD_STATIC_ASSERT(). Replaced by macro values	2017-09-26 14:31:06 -07:00
Yann Collet	9f0b8dfbe9	Merge branch 'dev' into newFormats	2017-09-26 14:22:39 -07:00
Nick Terrell	c233bdbaee	Increase maximum window size * Maximum window size in 32-bit mode is 1GB, since allocations for 2GB fail on my Mac. * Maximum window size in 64-bit mode is 2GB, since that is the largest power of 2 that works with the overflow prevention. * Allow `--long=windowLog` to set the window log, along with `--zstd=wlog=#`. These options also set the window size during decompression, but don't override `--memory=#` if it is set. * Present a helpful error message when the window size is too large during decompression. * The long range matcher defaults to a hash log 7 less than the window log, which keeps it at 20 for window log 27. * Keep the default long range matcher window size and the default maximum window size at 27 for the API and CLI. * Add tests that use the maximum window size and hash size for compression and decompression.	2017-09-26 14:00:01 -07:00
Yann Collet	5d8fdd1641	Merge pull request #855 from terrelln/maxoff [libzstd] Increase MaxOff	2017-09-25 16:34:29 -07:00
Yann Collet	b8d4a3887f	introduced constant ZSTD_frameIdSize within zstd_internal.h This is the size of magic number. Avoids using `4` directly in source code, which is a bit less meaningful.	2017-09-25 15:26:18 -07:00
Nick Terrell	bbe77212ef	[libzstd] Increase MaxOff	2017-09-25 13:36:18 -07:00
Yann Collet	7c3dea42ce	added prototypes for advanced parameters for decompression API required to decode custom formats	2017-09-24 15:57:29 -07:00
Nick Terrell	74718d7e43	[bitstream] Allow adding 31 bits at a time	2017-09-19 13:57:33 -07:00
Stella Lau	eb3327c10a	Merge branch 'dev' of https://github.com/facebook/zstd into ldm-mergeDev	2017-09-11 15:00:01 -07:00
Yann Collet	3128e03be6	updated license header to clarify dual-license meaning as "or"	2017-09-08 00:09:23 -07:00
Stella Lau	eeff55dfa8	Merge remote-tracking branch 'upstream/dev' into ldm-mergeDev	2017-09-06 15:56:32 -07:00
Nick Terrell	423b133568	[POOL] Allow free on NULL when multithreading is disabled	2017-09-05 11:18:13 -07:00
Stella Lau	67d4a6161c	Add ldmBucketSizeLog param	2017-09-02 21:55:29 -07:00
Stella Lau	a1f04d518d	Move hashEveryLog to cctxParams and update cli	2017-09-01 15:05:47 -07:00
Stella Lau	767a0b3be1	Move ldm hashLog, bucketLog, and mml to cctxParams	2017-09-01 12:24:59 -07:00
Stella Lau	17d8e0bdcc	Merge remote-tracking branch 'upstream/longRangeMatcher' into ldm-integrate	2017-09-01 10:19:38 -07:00
Stella Lau	8081becadc	Add long distance matching as a CCtxParam	2017-09-01 09:18:58 -07:00
Yann Collet	d963daa6a9	fixed minor warning (empty translation unit)	2017-09-01 00:12:07 -07:00
Yann Collet	d7ad99b2ab	Merge branch 'longRangeMatcher' into dev	2017-08-31 18:08:37 -07:00
Stella Lau	6a546efb8c	Add long distance matcher Move last literals section to ZSTD_block_internal	2017-08-31 12:53:19 -07:00
Yann Collet	e21384fffb	fixed more file headers after license change (#825)	2017-08-31 12:11:57 -07:00
Yann Collet	e9dc204f42	fixed a bunch of headers after license change (#825)	2017-08-31 11:24:54 -07:00
Stella Lau	ee65701720	Minor fixes; remove formatting only changes	2017-08-29 20:27:35 -07:00
Stella Lau	c7a18b7c21	Localize 'dictMode' from cctx to function param	2017-08-29 15:52:24 -07:00
Nick Terrell	9822f97721	[error] Don't guard undef X with ifdef X	2017-08-29 11:54:38 -07:00
Nick Terrell	02033be08c	[pool] Visual Studios disallows empty structs	2017-08-28 17:19:01 -07:00
Nick Terrell	7c365eb02c	[threading] Fix ERROR macro after including windows.h	2017-08-28 16:25:02 -07:00
Stella Lau	024098a47d	Fix parameter retrieval from cdict	2017-08-25 17:58:28 -07:00
Stella Lau	2adde898c8	Fix typo with ZSTDMT_parameter	2017-08-25 16:13:40 -07:00
Stella Lau	eb7bbab36a	Remove ZSTD_p_refDictContent and dictContentByRef	2017-08-25 11:11:45 -07:00
Nick Terrell	de6c6bce85	Fix zstd_internal.h for C++ mode	2017-08-24 18:09:50 -07:00
Nick Terrell	26dc040a7b	[pool] Accept custom allocators	2017-08-24 17:01:41 -07:00
Nick Terrell	89dc856cae	[pool] Fix formatting	2017-08-24 16:48:32 -07:00
Stella Lau	5bc2c1e982	Add prototype support for customMem with cctxParams	2017-08-23 12:03:30 -07:00
Stella Lau	6f1a21c7e9	Remove formatting-only changes	2017-08-23 10:24:19 -07:00
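
Commit df4e9bba25 above notes that the C standard does not treat a `static const int` as a constant, which is a problem for initializers and for ZSTD_STATIC_ASSERT(), and that the values were replaced by macros. The sketch below illustrates that point only. Every name in it (EXAMPLE_FRAME_ID_SIZE, EXAMPLE_STATIC_ASSERT, example_header, example_header_size) is hypothetical and is not taken from the zstd sources, and the compile-time assertion uses a generic negative-array-size trick, not necessarily the form zstd uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only; not zstd identifiers. */

/* A macro expands to a literal, which is an integer constant expression. */
#define EXAMPLE_FRAME_ID_SIZE 4

/* By contrast, in C (unlike C++) an object declared as
 *     static const int example_size = 4;
 * is not an integer constant expression, so a strict C99 compiler may
 * reject it as a file-scope array size or inside a compile-time assert. */

/* Generic compile-time assertion: the array size is -1 when c is false,
 * which is a compile error. The condition must be a constant expression. */
#define EXAMPLE_STATIC_ASSERT(c) \
    typedef char example_static_assert_[(c) ? 1 : -1]

EXAMPLE_STATIC_ASSERT(EXAMPLE_FRAME_ID_SIZE == 4);

/* File-scope array sized by the macro: valid in C99. */
static unsigned char example_header[EXAMPLE_FRAME_ID_SIZE];

size_t example_header_size(void)
{
    /* Run-time twin of the compile-time check above. */
    assert(sizeof(example_header) == EXAMPLE_FRAME_ID_SIZE);
    return sizeof(example_header);
}
```

Calling `example_header_size()` returns the macro value as a `size_t`. Swapping the macro for a `static const int` is the change that, per the commit message, gcc in c99 mode rejected.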

356 Commits
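
The arithmetic in commit c233bdbaee (a 1GB maximum window in 32-bit mode, 2GB in 64-bit mode, and a long range matcher hash log that defaults to 7 less than the window log) can be checked with a short sketch. The constant and function names below are hypothetical and do not come from the zstd sources. Only the numbers are taken from the commit message, under the assumption that a window size is 2 raised to the window log.

```c
#include <assert.h>

/* Hypothetical names; only the numeric values come from the commit message. */
#define EXAMPLE_WINDOWLOG_MAX_32 30   /* 2^30 bytes = 1 GB */
#define EXAMPLE_WINDOWLOG_MAX_64 31   /* 2^31 bytes = 2 GB */
#define EXAMPLE_LDM_HASHLOG_GAP   7   /* default hash log = window log - 7 */

/* Largest window log for the given pointer width. */
unsigned example_window_log_max(int is64bit)
{
    return is64bit ? EXAMPLE_WINDOWLOG_MAX_64 : EXAMPLE_WINDOWLOG_MAX_32;
}

/* Window size in bytes, assuming size = 2^windowLog. */
unsigned long long example_window_size(unsigned windowLog)
{
    assert(windowLog < 64);
    return 1ULL << windowLog;
}

/* Default long range matcher hash log for a given window log. */
unsigned example_ldm_default_hash_log(unsigned windowLog)
{
    assert(windowLog >= EXAMPLE_LDM_HASHLOG_GAP);
    return windowLog - EXAMPLE_LDM_HASHLOG_GAP;
}
```

With these definitions, `example_ldm_default_hash_log(27)` evaluates to 20, matching the "keeps it at 20 for window log 27" remark in the commit message.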