AuroraMiddleware/zstd - zstd - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Nick Terrell	e06c303475	Fix ZSTD_sizeof_CStream()	2017-03-08 13:45:10 -08:00
Sean Purcell	881abe44f1	Reduce point at which we reduce offsets to protect against UB	2017-03-07 16:58:08 -08:00
Sean Purcell	3437bf2feb	Add build targets to the Makefile, and update CircleCI tests	2017-03-06 15:05:02 -08:00
Yann Collet	8b1d004031	added -Wformat-security flag, as recommended by @pixelb	2017-03-05 21:17:32 -08:00
Yann Collet	1f2c95c5f3	minor code refactor in HUF module	2017-03-05 21:07:20 -08:00
Yann Collet	5d801278dc	Merge pull request #586 from terrelln/repeat-heuristic Always check Huffman tables for ZSTD_lazy+	2017-03-03 19:38:56 -08:00
Nick Terrell	54c4babd8f	Always check Huffman tables for ZSTD_lazy+ The compressor always reuses the existing Huffman table if the literals size is at most 1 KiB. If the compression strategy is `ZSTD_lazy` or stronger always check to see if reusing the previous table or creating a new table is better. This doesn't yet weigh in decompression speed. I don't want to add any heuristics there until I have real data to work with to ensure that the heuristic works for at least one use case, preferably more.	2017-03-03 16:49:38 -08:00
Yann Collet	1af570bd05	Merge pull request #585 from terrelln/cover-leak Fix COVER_optimizeTrainFromBuffer() resource leaks	2017-03-02 20:46:35 -08:00
Yann Collet	f44b55c18d	Merge pull request #584 from terrelln/huff-repeat Allow compressor to repeat Huffman tables	2017-03-02 17:20:11 -08:00
Yann Collet	fe5d27062e	disable prefetch-decode for 32-bits target This decoder variant is detrimental to x86 architecture likely due to register pressure. Note that the variant is disabled for all 32-bits targets. It's unclear if it would help for different architectures, such as ARM, MIPS or PowerPC.	2017-03-02 17:09:21 -08:00
Nick Terrell	d051cd5b43	Use workspace for count and CTable	2017-03-02 16:38:07 -08:00
Nick Terrell	976e325b2e	Fix COVER_optimizeTrainFromBuffer() resource leaks Thanks to @nemequ for reporting the resource leaks.	2017-03-02 15:54:39 -08:00
Sean Purcell	553f67e0c1	Remove 'generic' inline strategy Seems to avoid performance loss for compression. Same strategy tested on decompression side, did not appear to improve speed.	2017-03-02 15:18:13 -08:00
Sean Purcell	3d95925a59	Merge remote-tracking branch 'origin/dev' into m32	2017-03-02 15:17:56 -08:00
Nick Terrell	a419777eb1	Allow compressor to repeat Huffman tables	2017-03-02 13:27:52 -08:00

* Compressor saves most recently used Huffman table and reuses it if it produces better results.
* I attempted to preserve CPU usage profile. I intentionally left all of the existing heuristics in place. There is only a speed difference on the second block and later. When compressing large enough blocks (say >= 4 KiB) there is no significant difference in compression speed. Dictionary compression of one block is the same speed for blocks with literals <= 1 KiB, and after that the difference is not very significant.
* In the synthetic data, with blocks 10 KB or smaller, most blocks can't use repeated tables because the previous block did not contain a symbol that the current block contains. Once blocks are about 12 KB or more, most previous blocks have valid Huffman tables for the current block, and the compression ratio and decompression speed jumped.
* In silesia blocks as small as 4KB can frequently reuse the previous Huffman table (85%), but it isn't as profitable, and the previous Huffman table only gets used about 3% of the time.
* Microbenchmarks show that `HUF_validateCTable()` takes ~55 ns and `HUF_estimateCompressedSize()` takes ~35 ns. They are decently well optimized, the first versions took 90 ns and 120 ns respectively. `HUF_validateCTable()` could be twice as fast, if we cast the `HUF_CElt` to a `U32` and compare to 0. However, `U32` has an alignment of 4 instead of 2, so I think that might be undefined behavior.
* I've run `zstreamtest` compiled normally, with UASAN and with MSAN for 4 hours each.

The worst case for the speed difference is a bunch of small blocks in the same frame. I modified `bench.c` to compress the input in a single frame but with blocks of the given block size, set by `-B`.

Benchmarks on level 1:

| Program   | Block size | Corpus    | Ratio | Compression MB/s | Decompression MB/s |
|-----------|------------|-----------|-------|------------------|--------------------|
| zstd.base | 256        | synthetic | 2.364 | 110.0            | 297.0              |
| zstd      | 256        | synthetic | 2.367 | 108.9            | 297.0              |
| zstd.base | 256        | silesia   | 2.204 | 93.8             | 415.7              |
| zstd      | 256        | silesia   | 2.204 | 93.4             | 415.7              |
| zstd.base | 512        | synthetic | 2.594 | 144.2            | 420.0              |
| zstd      | 512        | synthetic | 2.599 | 141.5            | 425.7              |
| zstd.base | 512        | silesia   | 2.358 | 118.4            | 432.6              |
| zstd      | 512        | silesia   | 2.358 | 119.8            | 432.6              |
| zstd.base | 1024       | synthetic | 2.790 | 192.3            | 594.1              |
| zstd      | 1024       | synthetic | 2.794 | 192.3            | 600.0              |
| zstd.base | 1024       | silesia   | 2.524 | 148.2            | 464.2              |
| zstd      | 1024       | silesia   | 2.525 | 148.2            | 467.6              |
| zstd.base | 4096       | synthetic | 3.023 | 300.0            | 1000.0             |
| zstd      | 4096       | synthetic | 3.024 | 300.0            | 1010.1             |
| zstd.base | 4096       | silesia   | 2.779 | 223.1            | 623.5              |
| zstd      | 4096       | silesia   | 2.779 | 223.1            | 636.0              |
| zstd.base | 16384      | synthetic | 3.131 | 350.0            | 1150.1             |
| zstd      | 16384      | synthetic | 3.152 | 350.0            | 1630.3             |
| zstd.base | 16384      | silesia   | 2.871 | 296.5            | 883.3              |
| zstd      | 16384      | silesia   | 2.872 | 294.4            | 898.3              |
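
The reuse decision described in commit a419777eb1 (validate the previous table against the current block, then compare estimated sizes) can be sketched roughly as follows. This is an illustration only, not zstd's implementation: the commit names `HUF_validateCTable()` and `HUF_estimateCompressedSize()`, but every `toy_*` name, the table layout, and the `freshHeaderSize` parameter below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SYMBOLS 256

/* Hypothetical table layout: code length in bits per symbol; 0 = no code. */
typedef struct { uint8_t nbBits[TOY_SYMBOLS]; } toy_table;

/* Count how often each byte value occurs in the block's literals. */
void toy_histogram(const uint8_t* src, size_t size, unsigned count[TOY_SYMBOLS])
{
    size_t i;
    int s;
    for (s = 0; s < TOY_SYMBOLS; s++) count[s] = 0;
    for (i = 0; i < size; i++) count[src[i]]++;
}

/* A previous table is usable only if it has a code for every symbol present. */
int toy_tableIsValid(const toy_table* t, const unsigned count[TOY_SYMBOLS])
{
    int s;
    for (s = 0; s < TOY_SYMBOLS; s++)
        if (count[s] != 0 && t->nbBits[s] == 0) return 0;
    return 1;
}

/* Estimated payload size in bytes (rounded up) when encoding with table t. */
size_t toy_estimateSize(const toy_table* t, const unsigned count[TOY_SYMBOLS])
{
    size_t bits = 0;
    int s;
    for (s = 0; s < TOY_SYMBOLS; s++) bits += (size_t)count[s] * t->nbBits[s];
    return (bits + 7) / 8;
}

/* Returns 1 to repeat the previous table, 0 to emit a fresh one.
 * freshHeaderSize is the assumed cost in bytes of describing the new table. */
int toy_shouldRepeat(const toy_table* prev, const toy_table* fresh,
                     size_t freshHeaderSize, const unsigned count[TOY_SYMBOLS])
{
    assert(prev != NULL && fresh != NULL && count != NULL);
    if (!toy_tableIsValid(prev, count)) return 0;
    return toy_estimateSize(prev, count)
        <= toy_estimateSize(fresh, count) + freshHeaderSize;
}
```

The validity check mirrors the observation in the commit message that small synthetic blocks often cannot reuse a table because the previous block lacked a symbol the current block contains.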
Yann Collet	fdb0fd34b3	Merge pull request #583 from terrelln/set-dictid Set dictID to 0 for content only dictionaries	2017-03-02 13:15:31 -08:00
Nick Terrell	3475b9b431	Set dictID to 0 for content only dictionaries	2017-03-02 12:33:02 -08:00
Sean Purcell	d44703d145	Offsets >= 32MB in 32-bits mode	2017-03-01 16:27:56 -08:00
Yann Collet	76f0494089	xxhash can be included twice in any order Previously, followed by : would fail to include the static definitions, because the second include was simply skipped by guard macro. Now it works as intended : the missing static part is included during the second include.	2017-03-01 13:29:29 -08:00
Yann Collet	4bcc69b761	solves warnings when compiling with global XXH_STATIC_LINKING_ONLY XXH_STATIC_LINKING_ONLY protection macro is intended to be triggered just before the include. The main idea is to keep this setting local : user module shall explicitly understand and accept the static linking restriction which becomes transparent when triggering the macro at project level. Global definition also triggers redefinition warnings for user modules which do locally define the macro. This new version compiles lib and cli without warning when the macro is set globally. That's not a scenario to be recommended, since it trades a local effect for a global one, but it was easy enough to provide from zstd side.	2017-03-01 11:33:25 -08:00
Yann Collet	31432cc57d	Merge pull request #579 from iburinoc/multiframe Check to ensure ddict isn't null before dereference	2017-03-01 11:02:04 -08:00
Sean Purcell	a81d4fee58	Check to ensure ddict isn't null before dereference	2017-02-28 15:28:29 -08:00
Yann Collet	22d79762ef	fixed multi frames	2017-02-28 02:12:42 -08:00
Yann Collet	a33ae64204	fixed decoding skippable frames	2017-02-28 01:15:28 -08:00
Yann Collet	d1760113ec	Improved speed of ZSTD_decompressStream() When ZSTD_decompressStream() detects that there is enough space in dst to complete decompression in a single pass, delegates to ZSTD_decompress(), for an extra ~5% speed boost	2017-02-28 00:14:28 -08:00
Yann Collet	a81c2e7e44	Merge pull request #573 from facebook/ddict Improved DDict memory usage	2017-02-27 20:54:42 -08:00
Yann Collet	dccd6b6f65	cli : fix : --rm is silent when input is stdin previously, app would produce an error message, and stop.	2017-02-27 15:57:50 -08:00
Yann Collet	0b9b894b2d	reduced ZSTD_DDict memory usage saved 128 KB	2017-02-27 00:27:30 -08:00
Yann Collet	bd7fa21deb	added ZSTD_refDDict() Now DDict no longer depends on DCtx duplication	2017-02-26 14:43:07 -08:00
Yann Collet	d73eebc00f	loadEntropy works on new ZSTD_entropy_t type	2017-02-26 10:16:42 -08:00
Yann Collet	8629f0e41f	created entropy structure type	2017-02-25 18:33:31 -08:00
Yann Collet	8dff956dbf	Added DDict unit test in fuzzer also : slightly modified loadEntropy : now src must point at start of dictionary	2017-02-25 10:11:15 -08:00
Yann Collet	14312d833e	zstdmt : fix : loading prefix from previous segments There used to be a (very small) chance that loading a prefix from the previous segment would be confused with a real zstd dictionary. For that to happen, the prefix needs to start with the same value as the dictionary magic. That's 1 chance in 4 billion if all values have equal probability. But in fact, some values are more common (0x00000000 for example) and others are less common, and the dictionary magic was selected to be one of the latter, so probabilities are likely even lower. Anyway, this risk is now down to zero by adding a new CCtx parameter : ZSTD_p_forceRawDict Current parameter policy : the parameter "sticks" to its CCtx, so any dictionary loaded after ZSTD_p_forceRawDict is set will be loaded in "raw" ("content only") mode, even if the CCtx is re-used multiple times with multiple different dictionaries. It's up to the user to reset this value if needed.	2017-02-23 23:42:12 -08:00
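
The collision risk in commit 14312d833e, and the effect of forcing raw mode, can be illustrated with a small C sketch. It is not zstd's code: `toy_classifyDict`, `toy_dictMode`, and the minimum-length check are hypothetical. Only the magic value 0xEC30A437 (stored little-endian) is taken from the Zstandard dictionary format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dictionary magic number from the Zstandard format, read as little-endian. */
#define TOY_DICT_MAGIC 0xEC30A437U

typedef enum { TOY_RAW_CONTENT = 0, TOY_FULL_DICT = 1 } toy_dictMode;

/* Read 4 bytes as a little-endian 32-bit value, independent of host order. */
static uint32_t toy_readLE32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decide how a buffer handed over as a "dictionary" would be interpreted.
 * With forceRawDict set, the magic is never inspected, so a prefix that
 * happens to start with the magic bytes cannot be misread as a dictionary. */
toy_dictMode toy_classifyDict(const uint8_t* buf, size_t size, int forceRawDict)
{
    assert(buf != NULL || size == 0);
    if (forceRawDict) return TOY_RAW_CONTENT;
    if (size < 4) return TOY_RAW_CONTENT;  /* assumption: too short for a magic */
    return (toy_readLE32(buf) == TOY_DICT_MAGIC) ? TOY_FULL_DICT
                                                 : TOY_RAW_CONTENT;
}
```

A prefix beginning with the bytes `37 A4 30 EC` is classified as a full dictionary without the flag and as raw content with it, which is the ambiguity the new parameter removes.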
Yann Collet	831b4890ce	minor tests/Makefile refactoring and update of zstd_manual.html	2017-02-23 23:09:10 -08:00
Yann Collet	cce8d8ba2b	Merge pull request #560 from iburinoc/findcompressedsize Change name to findFrameCompressedSize and add skippable support	2017-02-23 13:39:23 -08:00
Sean Purcell	83038d236a	Fix bug in FSE distribution normalization	2017-02-22 13:52:48 -08:00
Sean Purcell	64417cd2ff	Describe ambiguity around skippable frames	2017-02-22 13:29:01 -08:00
Sean Purcell	9757cc811b	Update comment	2017-02-22 12:28:21 -08:00
Sean Purcell	9050e1925e	Change name to findFrameCompressedSize and add skippable support	2017-02-22 12:12:34 -08:00
Przemyslaw Skibinski	d8114e5802	zstd_compress.c: fix memory leaks	2017-02-21 18:59:56 +01:00
Anders Oleson	517577bf53	spelling fixes in comments i.e. occurred labeled Huffman	2017-02-20 12:08:59 -08:00
Sean Purcell	6b010dec80	execSequence copies up to 2*WILDCOPY_OVERLENGTH extra	2017-02-16 12:05:40 -08:00
Sean Purcell	887eaa9e21	Fix wildcopy overwriting data still in window	2017-02-15 16:43:45 -08:00
Yann Collet	2252d29a5a	Merge branch 'dev' of github.com:facebook/zstd into dev	2017-02-15 12:00:50 -08:00
Yann Collet	4596037042	updated fse version feature minor refactoring (removing FSE_abs()) also : fix a few minor issues recently introduced in examples	2017-02-15 12:00:03 -08:00
Yann Collet	44f82d781f	Merge pull request #545 from terrelln/force-window [zstdmt] Fix MSAN failure with ZSTD_p_forceWindow	2017-02-15 10:20:15 -08:00
Yann Collet	f0b9a8dddb	Merge pull request #547 from inikep/dev11 Avoid fseek()'s 2GiB barrier with MacOS and *BSD	2017-02-14 12:29:00 -08:00
Yann Collet	9696bfc2ad	Merge pull request #544 from ds77/avoid-empty Portable way to avoid empty unit warning in threading.c	2017-02-14 00:54:55 -08:00
Przemyslaw Skibinski	b876b96ce1	Merge remote-tracking branch 'refs/remotes/facebook/dev' into dev11	2017-02-14 09:26:03 +01:00
Nick Terrell	ecf90ca24b	[zstdmt] Fix MSAN failure with ZSTD_p_forceWindow	2017-02-13 19:11:22 -08:00

Reproduction steps:

```
make zstreamtest CC=clang CFLAGS="-O3 -g -fsanitize=memory -fsanitize-memory-track-origins"
./zstreamtest -vv -t4178 -i4178 -s4531
```

How to get to the error in gdb (may be a more efficient way):

* 2 breaks at zstd_compress.c:2418 -- in ZSTD_compressContinue_internal()
* 2 breaks at zstd_compress.c:2276 -- in ZSTD_compressBlock_internal()
* 1 break at zstd_compress.c:1547

Why the error occurred:

When `zc->forceWindow == 1`, after calling `ZSTD_loadDictionaryContent()` we have `zc->loadedDictEnd == zc->nextToUpdate == 0`. But, we've really loaded up to `iend` into the dictionary. Then in `ZSTD_compressBlock_internal()` we see that `current > zc->nextToUpdate + 384`, so we load the last 192 bytes a second time. In this case the bytes we are loading are a block of all 0s, starting in the previous block. So when we are loading the last 192 bytes, we find a `match` in the future, 183 bytes beyond `ip`. Since the block is all 0s, the match extends to the end of the block. But in `ZSTD_count()` we only check that `pIn < pInLoopLimit`, but since `pMatch > pIn`, `pMatch` eventually points past the end of the buffer, causing the MSAN failure.

The fix:

The line changed sets `zc->nextToUpdate` to the end of the dictionary. This is the behavior that existed before `ZSTD_p_forceWindow` was introduced. This fixes the exposing test case. Since the code doesn't fail without `zc->forceWindow`, it makes sense that this works. I've run the command `./zstreamtest -T2mn` 64 times without failures. CI should also verify nothing obvious broke.
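
The failure mode in commit ecf90ca24b (a match pointer ahead of the input pointer running off the end of the buffer) can be illustrated with a toy match-length counter. This is a hedged sketch, not `ZSTD_count()` and not the fix that was applied (the commit changes `zc->nextToUpdate` instead). The function `toy_count` is hypothetical and simply bounds whichever pointer is further ahead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how many bytes at pIn equal the bytes at pMatch, never reading at or
 * beyond pEnd through either pointer. Bounding only pIn would be unsafe when
 * pMatch > pIn, because pMatch would reach pEnd first and keep going. */
size_t toy_count(const uint8_t* pIn, const uint8_t* pMatch, const uint8_t* pEnd)
{
    const uint8_t* const start = pIn;
    /* The leading pointer is the one that hits pEnd first. */
    const uint8_t* lead = (pMatch > pIn) ? pMatch : pIn;
    assert(pIn <= pEnd && pMatch <= pEnd);
    while (lead < pEnd && *pIn == *pMatch) {
        pIn++;
        pMatch++;
        lead++;
    }
    return (size_t)(pIn - start);
}
```

With a 16-byte buffer of zeros, an input position at offset 2 and a "future" match at offset 5, the count stops at 11 bytes, where the match pointer reaches the end of the buffer.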
Yann Collet	58af614ef2	push version and NEWS to v1.1.4	2017-02-13 18:32:44 -08:00
ds77	08e6a88a97	avoid empty translation unit warning without #pragma	2017-02-14 00:46:47 +01:00
Przemyslaw Skibinski	09c8e5390d	__builtin_bswap requires gcc 4.3+	2017-02-13 12:45:53 +01:00
Sean Purcell	d7bfcac18a	Expose frameSrcSize to experimental API	2017-02-10 11:55:44 -08:00
Sean Purcell	5069b6c2c3	Merge branch 'dev' into multiframe	2017-02-10 10:08:55 -08:00
Yann Collet	bbba42acd1	Merge pull request #537 from terrelln/small-bugs Fix small bugs	2017-02-10 04:35:43 -08:00
Yann Collet	a28c34cb7a	Merge pull request #538 from iburinoc/errorstring Fix ZSTD_getErrorString and add tests	2017-02-10 03:59:56 -08:00
Sean Purcell	269b2cd3d8	Documentation updates	2017-02-09 13:25:30 -08:00
Sean Purcell	2db7249265	Make pledgedSrcSize meaning clear for other functions - Added tests - Moved new size functions to static link only	2017-02-09 11:49:58 -08:00
Nick Terrell	545987996a	Fix deprecation warnings for clang with C++14	2017-02-08 17:38:17 -08:00
Sean Purcell	e0b3265e87	Fix ZSTD_getErrorString and add tests	2017-02-08 17:28:49 -08:00
Sean Purcell	0f5c95af44	Disambiguate pledgedSrcSize == 0 - Modify ZSTD CLI to only set contentSizeFlag if it _knows_ the size - Change pzstd to stop setting contentSizeFlag without accurate pledgedSrcSize	2017-02-08 15:12:46 -08:00
Sean Purcell	ba2ad9f25c	ZSTD_decompress now handles multiple frames	2017-02-08 14:50:10 -08:00
Sean Purcell	4e709712e1	Decompressed size functions now handle multiframes and distinguish cases - Add ZSTD_findDecompressedSize - Traverses multiple frames to find total output size - Add ZSTD_getFrameContentSize - Gets the decompressed size of a single frame by reading header - Deprecate ZSTD_getDecompressedSize	2017-02-08 14:50:10 -08:00
Przemyslaw Skibinski	cdf5a7bd9f	Merge remote-tracking branch 'refs/remotes/facebook/dev' into dev11	2017-02-08 13:49:35 +01:00
Nick Terrell	71c5263c00	Attribute cover dictionary code	2017-02-07 11:35:07 -08:00
Przemyslaw Skibinski	7060aee8c2	platform.h added to build_package.bat	2017-02-06 19:43:13 +01:00
Yann Collet	b54e235bf3	fixed Mac OS-X specific directory in $(RM) list these directories are now removed with -r command	2017-02-05 10:22:58 -08:00
Yann Collet	c2a4632789	release builds use fewer debug symbols and warnings Release builds are triggered through either `make` or their specific targets, `make zstd-release` and `make lib-release`.	2017-02-02 20:54:41 -08:00
Yann Collet	48bed91606	Merge pull request #527 from facebook/zstdmt zstdmt refinements	2017-01-31 16:36:46 -08:00
Yann Collet	b2e1b3d670	fixed overlapLog==0 => no overlap	2017-01-30 14:54:46 -08:00
Yann Collet	3672d06d06	zstdmt : section size is set to be a minimum of overlapSize The minimum size condition is applied transparently (no warning, no error), like the previous minimum section size condition (1 KB), which still applies.	2017-01-30 13:35:45 -08:00
Yann Collet	88df1aed61	changed advanced parameter overlapLog Follows a positive logic (increasing value => increasing overlap) which is easier to use	2017-01-30 11:00:00 -08:00
Yann Collet	b5fd15ccb2	fixed : legacy decoders v04 and v05	2017-01-30 10:45:58 -08:00
Yann Collet	cc3d1bc262	Merge pull request #525 from terrelln/covermt Multithreaded COVER dictionary training	2017-01-30 10:15:33 -08:00
Nick Terrell	43474313f8	Fix documentation about memory usage	2017-01-27 18:43:05 -08:00
Nick Terrell	b42dd27ef5	Add include guards and extern C	2017-01-27 16:00:19 -08:00
Yann Collet	f6d4a786fc	reduced zstdmt latency when using small custom section sizes with high compression levels Previous version was requiring a fairly large initial amount of input data before starting to create compression jobs. This new version starts the process much sooner.	2017-01-27 15:55:30 -08:00
Nick Terrell	c43c27127f	Merge branch 'dev' into buck * dev: updated NEWS fixed MSAN warnings in legacy decoders Fix cmake build updated NEWS Edits as per comments, and change wildcard 'X' to '?' Fix Visual Studios project Fix pool.c threading.h import Fix zstdmt_compress.h include Fixed commented issues Updated format specification to be easier to understand improved #232 fix Fixed https://github.com/facebook/zstd/issues/232 .travis.yml: different tests for "master" branch .travis.yml: optimized order of short tests .travis.yml: test jobs 12-15 JOB_NUMBER -eq 9 improved ZSTD_compressBlock_opt_extDict_generic	2017-01-27 12:05:48 -08:00
Nick Terrell	2fe9126591	Add multithread support to COVER	2017-01-27 11:56:02 -08:00
Yann Collet	609c123a01	Merge pull request #522 from terrelln/benchmt Fix some includes	2017-01-27 11:40:25 -08:00
Yann Collet	cafdd31a38	fixed MSAN warnings in legacy decoders In some extraordinary circumstances, *Length field can be generated from reading a partially uninitialized memory segment. Data is correctly identified as corrupted later on, but the read taints some later pointer arithmetic operation.	2017-01-27 10:44:03 -08:00
Nick Terrell	9c018cc140	Add BUCK files for Nuclide support	2017-01-27 10:43:12 -08:00
Przemyslaw Skibinski	29157320fb	improved ZSTD_compressBlock_opt_extDict_generic	2017-01-27 10:43:02 -08:00
Nick Terrell	e628eaf87a	Fix pool.c threading.h import	2017-01-26 15:29:10 -08:00
Yann Collet	717c65d690	Merge pull request #519 from inikep/dev11 Dev11	2017-01-26 14:23:44 -08:00
Yann Collet	ef33d00532	fixed : ZSTD_setCCtxParameter() properly exposed in DLL	2017-01-26 12:24:21 -08:00
Yann Collet	4a62f79ec9	fixed clang documentation warning	2017-01-26 09:16:56 -08:00
Yann Collet	8dafb1acf5	CLI : automatically set overlap size to max (windowSize) for max compression level	2017-01-25 17:01:13 -08:00
Yann Collet	06e7697f96	added test of new parameter ZSTD_p_forceWindow	2017-01-25 16:39:03 -08:00
Yann Collet	bb0027405a	fixed zstdmt corruption issue when enabling overlapped sections see Asana board for detailed explanation on why and how to fix it	2017-01-25 16:25:38 -08:00
Yann Collet	943cff9c37	fixed zstdmt cli freeze issue with large nb of threads fileio.c was continually pushing more content without giving a chance to flush compressed one. It would block the job queue when input data was accumulated too fast (requiring to define many threads). Fixed : fileio flushes whatever it can after each input attempt.	2017-01-25 12:35:19 -08:00
Yann Collet	dc8dae596a	overlapped section, for improved compression Sections 2+ read a bit of data from previous section in order to improve compression ratio. This also costs some CPU, to reference read data. Read data is currently fixed to window>>3 size	2017-01-24 22:32:12 -08:00
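
The arithmetic behind overlapped sections can be sketched by combining the `window>>3` overlap from commit dc8dae596a with the default section size of 4x the window size stated in commit 512cbe8c10. The names `toy_section` and `toy_sectionBounds` are hypothetical, and real zstdmt job creation involves more than this offset computation.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets into the source for one section (all names hypothetical). */
typedef struct {
    size_t prefixStart;  /* where the overlap read from the previous section begins */
    size_t start;        /* first byte this section compresses */
    size_t end;          /* one past the last byte this section compresses */
} toy_section;

/* Compute section `index` for an input of srcSize bytes, assuming the
 * defaults quoted in the commit messages: section = 4 * window,
 * overlap = window >> 3. */
toy_section toy_sectionBounds(unsigned windowLog, size_t srcSize, size_t index)
{
    toy_section s;
    size_t windowSize, sectionSize, overlap;
    assert(windowLog < 30);  /* toy guard against shift overflow */
    windowSize = (size_t)1 << windowLog;
    sectionSize = 4 * windowSize;
    overlap = windowSize >> 3;

    s.start = index * sectionSize;
    if (s.start > srcSize) s.start = srcSize;
    s.end = (srcSize - s.start < sectionSize) ? srcSize : s.start + sectionSize;
    /* The first section has nothing before it, so its prefix is empty. */
    s.prefixStart = (s.start >= overlap) ? s.start - overlap : 0;
    return s;
}
```

For `windowLog = 20` this gives 4 MiB sections that each reference the last 128 KiB of the preceding section, which is the trade of extra CPU for a better ratio that the commit describes.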
Yann Collet	f14a669054	refactor job creation code shared across ZSTDMT_{compress,flush,end}Stream(), for easier maintenance	2017-01-24 17:41:49 -08:00
Yann Collet	512cbe8c10	zstdmt cli and API allow selection of section sizes By default, section sizes are 4x window size. This new setting allows manual selection of section sizes. The larger they are, the (slightly) better the compression ratio, but also the higher the memory allocation cost, and eventually the lower the number of possible threads, since each section is compressed by a single thread. It also introduces a prototype to set generic parameters, ZSTDMT_setMTCtxParameter() The idea is that it's possible to add enums to extend the list of parameters that can be set this way. This is more long-term oriented than a fixed-size struct. Consider it as a test.	2017-01-24 17:08:53 -08:00
Yann Collet	3488a4a473	ZSTDMT now supports frame checksum	2017-01-24 11:48:40 -08:00
Przemyslaw Skibinski	96f152f708	improved ZSTD_compressBlock_opt_extDict_generic	2017-01-24 13:18:50 +01:00
Yann Collet	94364bf87a	refactor ZSTDMT streaming flush code now shared by both ZSTDMT_compressStream() and ZSTDMT_flushStream()	2017-01-23 11:50:44 -08:00
Yann Collet	1cbf251e43	ZSTDMT streaming : fall back to (regular) single thread mode when nbThreads==1	2017-01-23 01:43:58 -08:00
Yann Collet	84581ff8d7	ZSTDMT_compressCCtx : fallback to single-thread mode when nbChunks==1	2017-01-23 01:20:27 -08:00


1393 Commits