AuroraMiddleware/zstd

Author	SHA1	Message	Date
senhuang42	354b5f1c0a	Use cycleLog instead of chainLog to determine LDM jobLog	2020-10-12 16:09:59 -04:00
Nick Terrell	441ce4178f	[zstdmt] Clarify a comment	2020-10-12 12:58:13 -07:00
Nick Terrell	efff5d8b2d	[zstdmt] Fix determinism issue with rsyncable mode The problem occurs in this scenario: 1. We find a synchronization point. 2. We attempt to create the job. 3. We fail because the job table is full: `mtctx->nextJobID > mtctx->doneJobID + mtctx->jobIDMask`. 4. We call `ZSTDMT_compressStream_generic` again. 5. We forget that we're at a sync point already, and we continue looking for the next sync point. The fix is to detect if we're currently paused at a sync point, and if we are then don't load any more input. Caught by zstreamtest. I modified it to make the bug occur more often (~1/100K -> ~1/200) and verified that it is fixed after. I then ran a few hundred thousand unmodified zstreamtest iterations to verify.	2020-10-12 12:55:17 -07:00
Nick Terrell	ede4f97153	[zstdmt] Fix bug where extra empty blocks are emitted When zstdmt cannot get a buffer and `ZSTD_e_end` is passed an empty compression job can be created. Additionally, `mtctx->frameEnded` can be set to 1, which could potentially cause problems like unterminated blocks. The fix is to adjust to `ZSTD_e_flush` even when we can't get a buffer.	2020-10-12 12:55:17 -07:00
Nick Terrell	9ab9229e11	[zstreamtest] Add compression determinism tests * Run compression twice and check the compressed data is byte-identical. The compression loop had to be rewritten to ensure determinism. It is guaranteed by always making maximal forward progress. * When nbWorkers > 0, change the number of workers 1/8 of the time. * Run in single-pass mode 1/4 of the time. I've run a few hundred thousand iterations of zstreamtest and have seen no determinism issues so far. Before the zstdmt fix that skips the single-pass shortcut non-determinism showed up in a few hundred iterations.	2020-10-12 12:55:17 -07:00
Nick Terrell	c51a9e79b9	[zstdmt] Rip out the zstdmt API This commit leaves only the functions used by zstd_compress.c. All other functions have been removed from the API. The ZSTDMT unit tests in fuzzer.c and zstreamtest.c have been rewritten to use the ZSTD API. And the --mt zstreamtest tests have been ripped out.	2020-10-12 12:55:16 -07:00
Nick Terrell	1784c4b4ab	[zstdmt] Remove single-pass shortcut Simplifies the code and removes blocking from zstdmt. At this point we could completely delete `ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because I think we want to do that in the zstd-1.5.0 release, in case anyone is still using the ZSTDMT API, even though it is not installed by default. Fixes #2327.	2020-10-12 12:53:26 -07:00
Nick Terrell	b55ae009ac	[zstdmt] Remove singleBlockingThread mode This is already handled by zstd, so this logic is never used.	2020-10-12 12:53:26 -07:00
Nick Terrell	d5c688e8ae	Fix ZSTD_adjustCParams_internal() to handle dictionary logic Pass in the `ZSTD_cParamMode_e` to select how we define our cparams. Based on the mode we either take the `dictSize` into account or we set it to `0`. See the documentation for `ZSTD_cParamMode_e`. Some of the modes currently share the same behavior. But they have distinct modes because they are drastically different cases. E.g. compression + reprocessing the dictionary and creating a cdict. Additionally, when downsizing the hashLog and chainLog take the (adjusted) dictionary size into account, since the size of the dictionary gets added onto the window size. Adds a simple test to ensure that we aren't downsizing too far.	2020-10-12 12:50:04 -07:00
Nick Terrell	fadaab8c7c	[minor improvement] Pass 0 as the content size in the DDS The DDS structure can't be copied into the working tables like the DMS. So it doesn't need to account for the source size when sizing its parameters, just the dictionary size.	2020-10-12 12:47:21 -07:00
Nick Terrell	48ef15fb47	[minor improvement] Pass dictSize when selecting parameters When selecting parameters in streaming compression with a dictionary use the dictionary size to select the parameters.	2020-10-12 12:47:19 -07:00
Nick Terrell	012818df99	[refactor] Remove ZSTD_resetCStream_internal() This function is only called in one place. It isn't a logical separation of duties, and it was only obfuscating the code now, so inline it.	2020-10-12 12:46:10 -07:00
Nick Terrell	7083f79008	[bug] Fix dictContentType when reprocessing cdict Conditions to trigger: * CDict is loaded as raw content. * CDict starts with the zstd dictionary magic number. * The CDict is reprocessed (not attached or copied). * The new API is used (streaming or `ZSTD_compress2()`). Bug: The dictionary is loaded as a zstd dictionary, not a raw content dictionary, because the dict content type is set to `ZSTD_dct_auto`. Fix: Pass in the dictionary content type from cdict creation to the call to `ZSTD_compress_insertDictionary()`. Test: Added a test case that exposes the bug, and fixed the raw content tests to not modify the `dictBuffer`, which makes all future tests with the `dictBuffer` raw content, which doesn't seem intentional.	2020-10-12 12:46:10 -07:00
senhuang42	d6911b86be	Require LDM matches to be strictly greater in length	2020-10-09 12:56:18 -04:00
Like Ma	cc907770bd	Fix building on AIX 5.1	2020-10-09 18:34:00 +08:00
Yann Collet	b951ad20a2	Merge pull request #2329 from senhuang42/prevent_summary_updates_when_using_stdout Prevent summary updates when using stdout	2020-10-09 01:01:36 -07:00
Yann Collet	12541931fa	Merge pull request #2328 from marxin/zstd-pool-api Allow external creation of POOLs that can be shared.	2020-10-09 01:00:50 -07:00
Yann Collet	6fdb0cb8d9	Merge pull request #2303 from senhuang42/let_cdict_take_clevel_priority For ZSTD_compressStream2(), let cdict take compression level priority	2020-10-09 00:48:30 -07:00
Yann Collet	c3ee284ca2	Merge pull request #2319 from facebook/fullbench_stream2 update fullbench for compressStream2()	2020-10-09 00:40:59 -07:00
senhuang42	b9c8033cde	Define kNullRawSeqStore for every file	2020-10-07 19:02:41 -04:00
senhuang42	a6165c1b28	Change matchState_t::ldmSeqStore to pointer	2020-10-07 14:13:57 -04:00
senhuang42	abce708a56	Move posInSequence correction to correct location	2020-10-07 13:56:25 -04:00
senhuang42	0c515590d8	Replace offCode of largest match if ldm's offCode is superior	2020-10-07 13:56:25 -04:00
senhuang42	0fac8e07e1	Refactor usage of ms->ldmSeqStore so that it is not modified during compressBlock(), and simplify skipRawSeqStoreBytes	2020-10-07 13:56:25 -04:00
senhuang42	a5500cf2af	Refactor separate ldm variables all into one struct	2020-10-07 13:56:25 -04:00
senhuang42	0731b94e7c	Use kNullRawSeqStore constant in zstdmt_compress.c	2020-10-07 13:56:25 -04:00
senhuang42	0325d878f2	Remove bubbling down matches with longer offCode and same matchLen	2020-10-07 13:56:25 -04:00
senhuang42	031b7ec15f	Disable LDM minMatch adjustment when using opt parser	2020-10-07 13:56:25 -04:00
senhuang42	ddf8a3f1b9	Enable inclusion of mid-flight LDMs in opt parser	2020-10-07 13:56:25 -04:00
senhuang42	88f72ed942	Correct incorrect offcode calculation	2020-10-07 13:56:25 -04:00
senhuang42	e96ea5d147	Fix static analyze fuzzer.c error	2020-10-07 13:56:25 -04:00
senhuang42	d8b43a4202	Add explicit conversion of size_t to U32	2020-10-07 13:56:25 -04:00
senhuang42	b8bfc4e63d	Add cSize regression test to fuzzer.c	2020-10-07 13:56:25 -04:00
senhuang42	c87d2e5866	Prefix new static ldm helpers with ZSTD_opt	2020-10-07 13:56:25 -04:00
senhuang42	429dec4f42	Add DEBUGLOG() calls in ldm helpers	2020-10-07 13:56:25 -04:00
senhuang42	10647924f1	Make function descriptions more accurate	2020-10-07 13:56:25 -04:00
senhuang42	1a687b3fcb	Improve documentation of relevant structs	2020-10-07 13:56:25 -04:00
senhuang42	37617e23d7	Correct matchLength calculation and remove unnecessary functions	2020-10-07 13:56:25 -04:00
senhuang42	7dee62c287	Reset ldmSeqStore after initStats_ultra() pass for btultra2	2020-10-07 13:56:25 -04:00
senhuang42	0718aa70df	Refactor existing functions to use posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	7348b40a87	Adjustments to ldm_calculateMatchRange() to calculate bounds correctly	2020-10-07 13:56:25 -04:00
senhuang42	a1ef2db5b2	Add ldm_calculateMatchRange() function	2020-10-07 13:56:25 -04:00
senhuang42	ef823e0299	Remove rawSeqStore.base and add rawSeqStore.posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	cfd2aec1b7	Add unit tests into playTests.sh	2020-10-07 13:56:25 -04:00
senhuang42	4793ae3b84	Prevent duplicate LDMs from being inserted	2020-10-07 13:56:25 -04:00
senhuang42	65f9cfeeec	Add extra bounds check to prevent heap access after free ASAN error	2020-10-07 13:56:25 -04:00
senhuang42	bff5785fd5	Address mixed variables C90 warning	2020-10-07 13:56:25 -04:00
senhuang42	724b94ed18	ldm_getNextMatch fixed return values	2020-10-07 13:56:25 -04:00
senhuang42	ea92fb3a68	Cleanups, add comments and explanations	2020-10-07 13:56:25 -04:00
senhuang42	78da2e1808	Fixed sifting algorithm	2020-10-07 13:56:25 -04:00
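
The scenario described in commit efff5d8b2d (job table full while sitting at a sync point) can be illustrated with a small toy model. This is only a sketch under stated assumptions: the struct `ToyMtCtx`, the `pausedAtSyncPoint` flag, and all `toy_*` functions are hypothetical names invented for illustration and are not zstd's actual code. Only the fullness condition `nextJobID > doneJobID + jobIDMask` is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; these are NOT zstd's real structures. */
typedef struct {
    unsigned nextJobID;     /* ID the next created job would receive */
    unsigned doneJobID;     /* ID of the oldest job not yet fully flushed */
    unsigned jobIDMask;     /* job table size - 1 (size is a power of two) */
    int pausedAtSyncPoint;  /* assumed flag: sync point found, no free slot */
} ToyMtCtx;

/* Fullness condition as quoted in the commit message. */
static int toy_jobTableFull(const ToyMtCtx* ctx)
{
    return ctx->nextJobID > ctx->doneJobID + ctx->jobIDMask;
}

/* Returns 1 if a job was created, 0 if creation has to be retried later.
 * When creation fails at a sync point, remember that we are paused there. */
static int toy_tryCreateJob(ToyMtCtx* ctx, int atSyncPoint)
{
    assert(ctx != NULL);
    if (toy_jobTableFull(ctx)) {
        if (atSyncPoint) ctx->pausedAtSyncPoint = 1;
        return 0;
    }
    ctx->nextJobID++;
    ctx->pausedAtSyncPoint = 0;
    return 1;
}

/* While paused at a sync point, the caller must not load more input;
 * otherwise the job boundary would depend on when a slot frees up. */
static int toy_mayLoadInput(const ToyMtCtx* ctx)
{
    return !ctx->pausedAtSyncPoint;
}
```

In this model, a caller would check `toy_mayLoadInput()` before consuming input on each retry, so the job boundary stays at the sync point that was already found regardless of how many retries it takes for a slot to become free.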

8352 Commits