AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Yann Collet	16f9c572fc	Merge branch 'dev' into compressionFlow	2017-04-20 11:16:40 -07:00
Yann Collet	e348dad305	minor long line reformatting	2017-04-20 11:14:13 -07:00
Yann Collet	2c5514c759	fixed ZSTDMT_initCStream_advanced(): must use the new ZSTD_compressBegin_usingCDict_advanced() to enforce correct frame parameters	2017-04-18 22:52:41 -07:00
Yann Collet	a4cab80183	added ZSTD_copyCCtx_internal() which respects provided fParams.	2017-04-18 14:54:54 -07:00
Yann Collet	30fb499208	Changed ZSTD_resetCCtx_advanced() into ZSTD_resetCCtx_internal() for naming consistency : _advanced() can be invoked while _internal() are strictly static	2017-04-18 14:08:50 -07:00
Yann Collet	715b9aa113	created ZSTD_compressBegin_usingCDict_advanced()	2017-04-18 13:55:53 -07:00
Yann Collet	4f818182b8	clarified frame parameters for ZSTD_compress*_usingCDict(). Created ZSTD_compressBegin_usingCDict_internal(), which gives direct control over frame parameters. ZSTD_resetCStream_internal() now points into it.	2017-04-17 18:29:06 -07:00
Yann Collet	c47c68f6ca	proper evaluation of Huffman CTable size	2017-04-17 16:14:21 -07:00
Yann Collet	88009a8ba2	removed srcSize control from CStream since it's already done from lower bufferless API level	2017-04-12 00:51:24 -07:00
Yann Collet	20d5e03893	content size is controlled at bufferless level, so it's active for all entry points. Also: added relevant test (wrong content size) in fuzzer	2017-04-11 18:34:02 -07:00
Yann Collet	4ee6b15dac	force contentSizeFlag=0 when using ZSTD_initCStream_usingCDict(), because by definition srcSize is not known when using this prototype. Added relevant test. Note: this use was already working, because at a later stage (both ZSTD_compressBegin_usingCDict() and ZSTD_copyCCtx()) pledgedSrcSize=0 is translated into "unknown", no matter the frame parameter. This is not correct, but of little importance, as the medium-term plan is to no longer set fParams within CDict	2017-04-11 11:59:44 -07:00
Yann Collet	ab9162ebb4	simplified call graph by calling ZSTD_compressBegin_internal() instead of ZSTD_compressBegin_advanced()	2017-04-11 10:46:20 -07:00
Yann Collet	e88034fe26	simplified ZSTD_initCStream*() flow: all variants converge towards ZSTD_initCStream_stage2()	2017-04-10 22:24:02 -07:00
Yann Collet	4b987ad8ce	Introduce ZSTD_initCStream_internal(). This is now the regroup point for ZSTD_initCStream*() functions. ZSTD_initCStream_advanced() now properly checks for parameters validity. Also: added <assert.h> usage inside zstd_compress.c. Needs ZSTD_DEBUG=1 macro to be triggered. Will be triggered by default from `tests` directory	2017-04-10 17:50:44 -07:00
Yann Collet	0181fef545	ensure cctx internal buffer is correctly sized in case of memory error	2017-04-06 01:25:26 -07:00
Yann Collet	02d37aa1c1	ensure correct size of internal buffers in case of error	2017-04-05 14:53:51 -07:00
Nick Terrell	405d2a1027	Explicitly convert scratchBuffer to unsigned*	2017-04-04 16:35:31 -07:00
Nick Terrell	16a739cab0	Switch call of FSE_count() to FSE_count_wksp()	2017-04-04 16:17:21 -07:00
Yann Collet	7cf78f1be7	Protects ZSTD_compressBegin_usingCDict() vs NULL cdict dereference. Will issue an error (GENERIC) if cdict==NULL	2017-04-04 12:38:14 -07:00
Nick Terrell	26b046a7c4	Remove unnecessary dictID store	2017-04-03 21:46:28 -07:00
Nick Terrell	39a6cc5172	Make ZSTD_compress_usingCDict() respect contentSizeFlag	2017-04-03 21:09:55 -07:00
Nick Terrell	62ecad3819	Fix ZSTD_initCStream_usingCDict() to use dictionary	2017-04-03 21:05:59 -07:00
Yann Collet	30c7698970	optimize ZSTDMT_compress() memory usage: no longer allocates temporary buffers when there is enough room in dstBuffer to decompress directly there (previous method would skip that for 1st chunk only). Also: fix ZSTD_compressBound() for small srcSize	2017-03-31 18:27:03 -07:00
Yann Collet	3f75d52527	Changed ZSTD_compressBound() so that, if Total = A+B, compressBound(Total) <= compressBound(A) + compressBound(B), under condition of a minimum size for A and B. Will help for ZSTDMT_compress() memory allocation	2017-03-31 17:11:38 -07:00
Yann Collet	eea7858e2b	fixed minor warnings in debug code	2017-03-30 16:47:19 -07:00
Yann Collet	34cc487d05	overlap at full windowSize for max compression level as it provides max compression ratio	2017-03-30 16:23:22 -07:00
Yann Collet	458e955c23	improved ZSTDMT_compress(): uses a bit more threads by default. Uses overlap segments to boost compression ratio (like the streaming variant)	2017-03-30 15:51:58 -07:00
Yann Collet	6476c51b86	Merge pull request #637 from facebook/zstdmt Zstdmt	2017-03-30 14:18:37 -07:00
Nick Terrell	5152fb2cb2	Convert all tabs to spaces	2017-03-29 18:51:58 -07:00
Yann Collet	ca5a8bbe36	re-added patch ...	2017-03-29 17:15:27 -07:00
Yann Collet	2e2e78de47	removed unnecessary restriction on minmatchLength: it's now transparently translated to nearest value when unsupported (7->6) (3->4)	2017-03-29 16:02:47 -07:00
Yann Collet	933ce4a1dd	fix : minmatch 7 conversion. minmatch 7 is now converted to minmatch 6 for strategies which do not support 7. Used to be folded into "default", which applied minmatch 4	2017-03-29 14:35:38 -07:00
Yann Collet	2238870eb6	Merge pull request #625 from facebook/loadCDict limited CDict acceptation criteria to be the same as DDict	2017-03-24 16:06:20 -07:00
Yann Collet	16a0b10781	fixed ZSTD_loadZstdDictionary(): forgot to add the dictionary content (tests were not failing, just compressing less). Also: added size protections when adding dict content, since hc/bt table filling would fail if size < 8	2017-03-24 12:46:46 -07:00
Yann Collet	23776ce290	fixed ERROR_GENERIC on dstSize_tooSmall, required by users who depend on this error code to size dest buffer	2017-03-23 17:59:50 -07:00
Yann Collet	bea78e8fc2	limited CDict acceptation criteria to be the same as DDict	2017-03-23 15:46:06 -07:00
Nick Terrell	eaf69b07f0	Zero pointers after freeing	2017-03-21 13:20:59 -07:00
Yann Collet	a41a4ed39a	Merge pull request #594 from terrelln/bugs Small fixes	2017-03-08 14:56:07 -08:00
Nick Terrell	e06c303475	Fix ZSTD_sizeof_CStream()	2017-03-08 13:45:10 -08:00
Sean Purcell	881abe44f1	Reduce point at which we reduce offsets to protect against UB	2017-03-07 16:58:08 -08:00
Sean Purcell	3437bf2feb	Add build targets to the Makefile, and update CircleCI tests	2017-03-06 15:05:02 -08:00
Nick Terrell	54c4babd8f	Always check Huffman tables for ZSTD_lazy+. The compressor always reuses the existing Huffman table if the literals size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or stronger, always check to see if reusing the previous table or creating a new table is better. This doesn't yet weigh in decompression speed. I don't want to add any heuristics there until I have real data to work with to ensure that the heuristic works for at least one use case, preferably more.	2017-03-03 16:49:38 -08:00
Yann Collet	f44b55c18d	Merge pull request #584 from terrelln/huff-repeat Allow compressor to repeat Huffman tables	2017-03-02 17:20:11 -08:00
Nick Terrell	d051cd5b43	Use workspace for count and CTable	2017-03-02 16:38:07 -08:00
Sean Purcell	553f67e0c1	Remove 'generic' inline strategy. Seems to avoid performance loss for compression. Same strategy tested on decompression side, did not appear to improve speed.	2017-03-02 15:18:13 -08:00
Sean Purcell	3d95925a59	Merge remote-tracking branch 'origin/dev' into m32	2017-03-02 15:17:56 -08:00
Nick Terrell	a419777eb1	Allow compressor to repeat Huffman tables	2017-03-02 13:27:52 -08:00

* Compressor saves most recently used Huffman table and reuses it if it produces better results.
* I attempted to preserve CPU usage profile. I intentionally left all of the existing heuristics in place. There is only a speed difference on the second block and later. When compressing large enough blocks (say >= 4 KiB) there is no significant difference in compression speed. Dictionary compression of one block is the same speed for blocks with literals <= 1 KiB, and after that the difference is not very significant.
* In the synthetic data, with blocks 10 KB or smaller, most blocks can't use repeated tables because the previous block did not contain a symbol that the current block contains. Once blocks are about 12 KB or more, most previous blocks have valid Huffman tables for the current block, and the compression ratio and decompression speed jumped.
* In silesia, blocks as small as 4 KB can frequently reuse the previous Huffman table (85%), but it isn't as profitable, and the previous Huffman table only gets used about 3% of the time.
* Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns and `HUF_estimateCompressedSize()` takes ~35 ns. They are decently well optimized; the first versions took 90 ns and 120 ns respectively. `HUF_validateCTable()` could be twice as fast if we cast the `HUF_CElt` to a `U32` and compare to 0. However, `U32` has an alignment of 4 instead of 2, so I think that might be undefined behavior.
* I've run `zstreamtest` compiled normally, with UASAN and with MSAN for 4 hours each.

The worst case for the speed difference is a bunch of small blocks in the same frame. I modified `bench.c` to compress the input in a single frame but with blocks of the given block size, set by `-B`.

Benchmarks on level 1:

| Program   | Block size | Corpus    | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base | 256        | synthetic | 2.364 | 110.0            | 297.0              |
| zstd      | 256        | synthetic | 2.367 | 108.9            | 297.0              |
| zstd.base | 256        | silesia   | 2.204 | 93.8             | 415.7              |
| zstd      | 256        | silesia   | 2.204 | 93.4             | 415.7              |
| zstd.base | 512        | synthetic | 2.594 | 144.2            | 420.0              |
| zstd      | 512        | synthetic | 2.599 | 141.5            | 425.7              |
| zstd.base | 512        | silesia   | 2.358 | 118.4            | 432.6              |
| zstd      | 512        | silesia   | 2.358 | 119.8            | 432.6              |
| zstd.base | 1024       | synthetic | 2.790 | 192.3            | 594.1              |
| zstd      | 1024       | synthetic | 2.794 | 192.3            | 600.0              |
| zstd.base | 1024       | silesia   | 2.524 | 148.2            | 464.2              |
| zstd      | 1024       | silesia   | 2.525 | 148.2            | 467.6              |
| zstd.base | 4096       | synthetic | 3.023 | 300.0            | 1000.0             |
| zstd      | 4096       | synthetic | 3.024 | 300.0            | 1010.1             |
| zstd.base | 4096       | silesia   | 2.779 | 223.1            | 623.5              |
| zstd      | 4096       | silesia   | 2.779 | 223.1            | 636.0              |
| zstd.base | 16384      | synthetic | 3.131 | 350.0            | 1150.1             |
| zstd      | 16384      | synthetic | 3.152 | 350.0            | 1630.3             |
| zstd.base | 16384      | silesia   | 2.871 | 296.5            | 883.3              |
| zstd      | 16384      | silesia   | 2.872 | 294.4            | 898.3              |

Sean Purcell	d44703d145	Offsets >= 32MB in 32-bits mode	2017-03-01 16:27:56 -08:00
Yann Collet	4bcc69b761	solves warnings when compiling with global XXH_STATIC_LINKING_ONLY. The XXH_STATIC_LINKING_ONLY protection macro is intended to be triggered just before the include. The main idea is to keep this setting local: the user module shall explicitly understand and accept the static linking restriction, which becomes transparent when triggering the macro at project level. Global definition also triggers redefinition warnings for user modules which do locally define the macro. This new version compiles lib and cli without warning when the macro is set globally. That's not a scenario to be recommended, since it trades a local effect for a global one, but it was easy enough to provide from zstd side.	2017-03-01 11:33:25 -08:00
Yann Collet	dccd6b6f65	cli : fix : --rm is silent when input is stdin. Previously, app would produce an error message, and stop.	2017-02-27 15:57:50 -08:00


337 Commits
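
Commit a419777eb1 describes a decision made per block: reuse the previous Huffman table if it is valid for the current literals and gives a result at least as good as a fresh table. The sketch below is only a rough Python model of that idea, not zstd's C implementation. The names `validate_table`, `estimate_size`, `build_code_lengths`, and `choose_table` are hypothetical. The `header_cost` parameter (bytes needed to transmit a new table) is an assumption for illustration. The table builder is a plain unbounded Huffman construction, not the length-limited one a real compressor would use.

```python
import heapq
from collections import Counter


def validate_table(code_lengths, counts):
    """Return True if every symbol present in the block has a code in the table."""
    return all(code_lengths.get(sym, 0) > 0 for sym, n in counts.items() if n > 0)


def estimate_size(code_lengths, counts):
    """Estimate the encoded size in bytes (rounded up) using the given table."""
    bits = sum(code_lengths[sym] * n for sym, n in counts.items())
    return (bits + 7) // 8


def build_code_lengths(counts):
    """Compute plain Huffman code lengths with a heap (no maximum length)."""
    if len(counts) == 1:
        return {sym: 1 for sym in counts}
    # Each heap entry: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(n, i, [sym]) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, s1 = heapq.heappop(heap)
        n2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1  # merged symbols move one level deeper
        heapq.heappush(heap, (n1 + n2, tie, s1 + s2))
        tie += 1
    return lengths


def choose_table(prev_lengths, literals, header_cost):
    """Return ("repeat" or "new", estimated size in bytes) for one block."""
    counts = Counter(literals)
    new_lengths = build_code_lengths(counts)
    # A new table must be transmitted, so it pays the (assumed) header cost.
    new_size = estimate_size(new_lengths, counts) + header_cost
    if prev_lengths is not None and validate_table(prev_lengths, counts):
        repeat_size = estimate_size(prev_lengths, counts)
        if repeat_size <= new_size:
            return "repeat", repeat_size
    return "new", new_size


# Usage: build a table from one block, then decide for the next blocks.
prev = build_code_lengths(Counter(b"aaaabbbc"))
print(choose_table(prev, b"aaabbc", header_cost=4))  # previous table covers all symbols
print(choose_table(prev, b"aaabbd", header_cost=4))  # 'd' has no code in the previous table
```

In this model, a block containing a symbol the previous table lacks always falls back to a new table, which matches the commit's observation that small synthetic blocks often cannot reuse the previous table.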