AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Nick Terrell	280a236e9e	Add ZSTD_CCtx(Param)?_getParameter() function Closes #1096.	2018-04-12 11:50:12 -07:00
Yann Collet	04212178b5	doc : clarified advanced API usage sticky parameters only work with `ZSTD_compress_generic()`	2018-04-10 11:40:36 -07:00
Yann Collet	ad5ba6cdcf	updated comment on parameters that can be changed during compression	2018-04-09 17:39:07 -07:00
Yann Collet	1da629f2ad	Merge pull request #1104 from terrelln/fast-train Allow negative compression levels in training	2018-04-09 14:16:20 -07:00
Nick Terrell	569e2abccd	Allow negative compression levels in training * Set `dictCLevel` in `zstdcli.c`. * Only set to default level if the compression level `== 0`, not `<= 0`.	2018-04-09 12:12:03 -07:00
Yann Collet	4195b36dd7	Merge pull request #1100 from bket/stable_sort zstd requires a stable sort.	2018-04-05 11:39:27 -07:00
Yann Collet	f35b8ba9da	updated ZSTD_p_chainLog description	2018-04-05 11:05:11 -07:00
Björn Ketelaars	462aed6811	zstd requires a stable sort. On OpenBSD qsort() is not guaranteed to be stable, their mergesort() is. This fixes issue #1088. All the hard work has been done by @terrelln.	2018-04-05 07:59:16 +02:00
Yann Collet	55f67502f4	Merge pull request #1098 from terrelln/nd-mt Only load extra table positions for CDicts	2018-04-02 15:38:20 -07:00
Nick Terrell	295ab0dbfa	Only load extra table positions for CDicts Zstdmt uses prefixes to load the overlap between segments. Loading extra positions makes compression non-deterministic, depending on the previous job the context was used for. Since loading extra position takes extra time as well, only do it when creating a `ZSTD_CDict`. Fixes #1077.	2018-04-02 14:41:30 -07:00
Yann Collet	5b616fa269	Merge pull request #1090 from bket/openbsd Fix building zstd on OpenBSD.	2018-04-02 14:15:26 -07:00
Björn Ketelaars	9d3048346d	Fix building zstd on OpenBSD.	2018-03-31 10:46:20 +02:00
Yann Collet	8be984ec45	fixed comments as suggested by @terrelln	2018-03-30 20:09:27 -07:00
Yann Collet	e6e848bfe9	added ZSTD_getFrameHeader_advanced() makes it possible to request frame header from a magicless frame	2018-03-29 17:51:08 -06:00
Yann Collet	a6694838e1	added more code documentation for ZSTD_getFrameHeader()	2018-03-29 15:24:17 -06:00
René Rebe	21eb26d664	fixed legacy/zstd_v* with older gcc version, by guarding builtin_* like in other files	2018-03-25 20:35:15 +02:00
Yann Collet	ad15c1b724	added __has_attribute() define for non-clang compilers	2018-03-23 19:04:48 -07:00
Yann Collet	52ca7c6c56	make DYNAMIC_BMI2 support of clang conditional to __has_attribute() to support older clang versions such as 3.4	2018-03-23 18:45:42 -07:00
Yann Collet	29b021f9a0	Merge pull request #1067 from facebook/targetLength removed limit ZSTD_TARGETLENGTH_MAX	2018-03-22 10:38:33 -07:00
Nick Terrell	ad344033df	Fix broken assertion The `avgJobSize` must not be lower than 256 KB for single-pass mode. In `zstd.h` we say the minimum value for `ZSTD_p_jobSize` is 1 MB, so ensure that we always pick a size >= 1 MB. Found by libFuzzer fuzzer tests with large input limits.	2018-03-21 16:20:30 -07:00
Yann Collet	153bc1c004	removed limit ZSTD_TARGETLENGTH_MAX this makes it possible to specify extremely large negative compression levels, achieving, as a side effect, "no compression". It will also be possible to define larger targetlength for ultra compression mode. There is no adverse side effect due to removing this limit.	2018-03-21 15:50:05 -07:00
Yann Collet	a99c4a3621	Merge branch 'dev' into advancedDecompress	2018-03-21 06:08:28 -07:00
Yann Collet	87b0cf05bd	Merge pull request #1057 from facebook/lrmSettings LRM parameters	2018-03-21 05:59:39 -07:00
Yann Collet	d1bf609abf	Merge pull request #1059 from terrelln/mt-ldm Integrate ldm with zstdmt	2018-03-20 17:50:20 -07:00
Yann Collet	e0cb8d19c6	fixed legacy test case	2018-03-20 17:48:22 -07:00
Yann Collet	878728dc26	fixed several comments by @terrelln	2018-03-20 16:35:14 -07:00
Yann Collet	e1c52faace	Merge pull request #1060 from facebook/compressImpl merge bmi2 implementation of encodeSequence into zstd_compress.c	2018-03-20 16:19:42 -07:00
Yann Collet	6cda8c932c	added test with ZSTD_decompress_generic() + ZSTD_DCtx_refPrefix() also : clarified stage condition to accept new parameters, fixed initializers correspondingly.	2018-03-20 16:16:13 -07:00
Yann Collet	0dadb6b70d	implemented ZSTD_DCtx_refPrefix*()	2018-03-20 15:45:56 -07:00
Yann Collet	569b8ba4d9	implemented ZSTD_DCtx_refDDict()	2018-03-20 15:43:49 -07:00
Nick Terrell	a3b76a77ef	Quiet appveyor warnings	2018-03-20 15:34:40 -07:00
Yann Collet	6873fec658	changed dictMore for dictContentType which seems clearer to describe what the variable/argument is about.	2018-03-20 15:13:14 -07:00
Yann Collet	31b54b6eea	updated ZSTD_initStaticDDict() prototype can also specify dictContentType.	2018-03-20 14:52:02 -07:00
Nick Terrell	136b9e2392	Fix external sequence corner cases * Clear external sequences when we reset the `ZSTD_CCtx`. * Skip external sequences when a block is too small to compress.	2018-03-20 14:50:28 -07:00
Yann Collet	353117c5d7	implemented ZSTD_DCtx_loadDictionary*() this required updating ZSTD_createDDict_advanced() to accept a dictContentType parameter (raw, full, auto).	2018-03-20 13:40:29 -07:00
Yann Collet	451357f37f	Merge pull request #1058 from facebook/cctxParams updated CCtxParams API	2018-03-20 12:36:12 -07:00
Yann Collet	2ed5af0766	merge bmi2 implementation of encodeSequence into zstd_compress.c	2018-03-19 19:10:31 -07:00
Nick Terrell	d19f803a3b	Fix window size for 1 worker + flushing	2018-03-19 18:56:39 -07:00
Nick Terrell	24d9edbdd8	Set ldmParams to 0 when disabled	2018-03-19 18:23:54 -07:00
Nick Terrell	4b92574feb	Fix corner cases exposed by zstreamtest	2018-03-19 17:54:04 -07:00
Nick Terrell	94c77710a9	Integrate ldm with zstdmt Integrate ldm into zstdmt by running it in serial and in order in the first step of each job, in the same place as the hash gets updated. The input buffer is sized to fit the whole LDM window and 2 full buffers of slack. Input buffers cannot be reused until the LDM step is done with them. After the LDM step is finished, the jobs don't actually have access to the full window, only the overlap. Tested on a few different multi-GB files with and without sanitizers, and with different numbers of threads.	2018-03-19 16:29:03 -07:00
Nick Terrell	aa4dbd09a1	Pull job/overlap log logic into common function (#1055 ) Prepares for LDM integration by separating the job size and overlap logic into helper functions.	2018-03-19 15:56:36 -07:00
Yann Collet	c8b3d389fd	updated CCtxParams API to respect naming convention : ZSTD_CCtxParams_*()	2018-03-19 15:07:26 -07:00
Yann Collet	6f4d0778a5	make it possible to express compression parameters in any order	2018-03-19 14:41:23 -07:00
Nick Terrell	2253d01b27	Move XXH64_update() into worker threads * Computes the XXH hash in the worker threads. * Workers get a sequence number and wait until their number shows up. On error, ensures that its sequence is finished, so future threads don't get blocked. * Sets up for ldm integration, which will go in the same spot.	2018-03-19 11:08:27 -07:00
Yann Collet	9618c0c804	make it possible to specify LDM parameters in any order	2018-03-19 11:07:04 -07:00
Yann Collet	ec0959e701	Merge branch 'dev' into mt-single	2018-03-18 01:06:31 -07:00
Nick Terrell	4af1fafeb8	Restore setting loadedDictEnd Setting `loadedDictEnd` was accidentally removed from `ZSTD_loadDictionaryContent()`, which means that dictionary compression will only be able to reference the parts of the dictionary within the window. The spec allows us to reference the entire dictionary so long as even one byte is in the window. `ZSTD_enforceMaxDist()` incorrectly always allowed offsets up to `loadedDictEnd` beyond the window, even once the dictionary was out of range. When overflow protection kicked in, the check `current > loadedDictEnd + maxDist` is incorrect if `loadedDictEnd` isn't reset back to zero. `current` could be reset below the value, which would incorrectly allow references beyond the window. This bug is present in `master`, but is very hard to trigger, since it requires both dictionaries and data which triggers overflow correction.	2018-03-16 14:54:06 -07:00
Yann Collet	cbc71e40f6	moving LRM parameters out of experimental section into "normal" range, start pinned at 160.	2018-03-15 17:22:40 -07:00
Nick Terrell	f15a17e19f	Use a single buffer in zstdmt Summary: Allocate a single input buffer large enough to house each job, as well as enough space for the IO thread to write 2 extra buffers. One goes in the `POOL` queue, and one to fill, and then block on a full `POOL` queue. Since we can't overlap with the prefix, we allocate space for 3 extra input buffers. Test Plan: * CI * With and without ASAN/UBSAN run zstdmt with different number of threads on two large binaries, and verify that their checksums match. * Test on the tip of the zstdmt ldm integration. Reviewers: cyan Differential Revision: https://phabricator.intern.facebook.com/D7284007 Tasks: T25664120	2018-03-15 16:21:33 -07:00
Yann Collet	192542b63c	Merge pull request #1047 from facebook/hufCompress removed huf_compress_impl.h	2018-03-15 14:14:03 -07:00
Nick Terrell	a271399c97	Expose reference external sequence API Summary: * Expose the reference external sequences API for zstdmt. Allows external sequences of any length, which get split when necessary. * Reset the LDM window when the context is reset. * Store the maximum number of LDM sequences. * Sequence generation now returns the number of last literals. * Fix sequence generation to not throw out the last literals when blocks of more than 1 MB are encountered. Test Plan: * CI * Test the zstdmt ldm integration stacked on top of this diff Reviewers: cyan Differential Revision: https://phabricator.intern.facebook.com/D7283968 Tasks: T25664120	2018-03-14 18:07:53 -07:00
Nick Terrell	1908c92c46	Merge remote-tracking branch 'upstream/dev' into extern-seq * upstream/dev: Fix overflow protection with wlog=31	2018-03-14 17:26:31 -07:00
Yann Collet	a909c293c6	Merge branch 'dev' into hufCompress	2018-03-14 16:11:25 -07:00
Nick Terrell	a9a6dcba63	Expose reference external sequence API * Expose the reference external sequences API for zstdmt. Allows external sequences of any length, which get split when necessary. * Reset the LDM window when the context is reset. * Store the maximum number of LDM sequences. * Sequence generation now returns the number of last literals. * Fix sequence generation to not throw out the last literals when blocks of more than 1 MB are encountered.	2018-03-14 12:29:31 -07:00
Nick Terrell	33fb966e56	Fix overflow protection with wlog=31 The overflow protection is broken when the window log is `> (3U << 29)`, so 31. It doesn't work when `current` isn't around `1U << windowLog` ahead of `lowLimit`, and the assertion `current > newCurrent` fails. This happens when the same context is used many times over, but with a large window log, like in zstdmt. Fix it by triggering correction based on `nextSrc - base` instead of `lowLimit`. The added test fails before the patch, and passes after.	2018-03-14 11:45:44 -07:00
Yann Collet	4c5cbac179	Merge pull request #1041 from facebook/fasterFast Negative compression levels	2018-03-13 21:32:46 -07:00
Yann Collet	50f763ec44	fixed several comments as underlined by @terrelln	2018-03-13 14:23:14 -07:00
Yann Collet	a95a88af57	removed huf_compress_impl.h re-imported all functions inside huf_compress.c for easier source editing. Also updated a bunch of code comments for clarification.	2018-03-13 14:14:05 -07:00
Yann Collet	bd7bb94361	Merge pull request #1044 from baldurk/remove-utf8-characters Remove non-ASCII characters in header file comments	2018-03-13 13:22:07 -07:00
Baldur Karlsson	430a2fec19	Remove non-ASCII characters in header file comments * Replaced a non-breaking space and an en dash with a plain space and a hyphen. * This means the files are simple ASCII and less likely to run into codepage issues.	2018-03-13 20:05:53 +00:00
Yann Collet	530eeb41a7	Merge pull request #1039 from facebook/zstd_decompress Removed zstd_decompress_impl.h	2018-03-12 18:21:46 -07:00
Yann Collet	2291b85a1e	changed ZSTD_p_literalCompression into ZSTD_p_compressLiterals prefer verb+object construction	2018-03-12 11:44:10 -07:00
Yann Collet	a57d43d4d4	updated documentation of targetLength	2018-03-12 11:35:01 -07:00
Yann Collet	6a9b41b731	create command --fast[=#] access negative compression levels from command line for both compression and benchmark modes. also : ensure proper propagation of parameters through ZSTD_compress_generic() interface. added relevant cli tests.	2018-03-11 20:01:23 -07:00
Yann Collet	a146ee04ae	added negative compression levels negative compression levels trade compression ratio for more compression speed. They turn off huffman compression of literals, and use row 0 as baseline with a stepSize = -cLevel. added associated test in fuzzer also added : new advanced parameter ZSTD_p_literalCompression	2018-03-11 05:21:53 -07:00
Yann Collet	facc09aa03	minor compression level adaptation level 12 compresses slightly more and faster due to better btlazy2 mode	2018-03-11 03:06:52 -07:00
Yann Collet	fe321f9e2a	re-integrate ZSTD_decompressSequencesLong() into zstd_decompress.c removed zstd_decompress_impl.h	2018-03-09 19:48:06 -08:00
Yann Collet	89a2ebb971	incorporated ZSTD_decompressSequences() into zstd_decompress()	2018-03-09 19:35:57 -08:00
Yann Collet	cdb1f1433e	incorporated ZSTD_initFseState() inside zstd_decompress.c	2018-03-09 18:16:10 -08:00
Yann Collet	a166eae1ba	incorporate ZSTD_decodeSequenceLong() within zstd_decompress.c	2018-03-09 18:11:14 -08:00
Yann Collet	17626ba56e	restored ZSTD_decodeSequence() into zstd_decompress.c	2018-03-09 18:03:25 -08:00
Yann Collet	51169575a8	Merge pull request #1036 from terrelln/thread-void [threading] Cast unused arguments to void	2018-03-07 12:14:05 -08:00
Nick Terrell	7e103cdaf5	[threading] Cast unused arguments to void	2018-03-06 18:36:40 -08:00
Yann Collet	db147ea620	improved comments following @terrelln suggestions	2018-03-06 18:15:26 -08:00
Yann Collet	06ca9c7d7c	fixed 0-seq blocks in block-decompression mode	2018-03-06 01:50:19 -08:00
Yann Collet	9a91afe6ef	long offset mode : new default threshold for 32-bit	2018-03-05 16:41:08 -08:00
Yann Collet	7bd7a3ad43	long offset mode : new default threshold for 64-bits mode	2018-03-05 16:16:49 -08:00
Yann Collet	c0393a538f	fixed counting long distance weights	2018-03-05 15:12:10 -08:00
Yann Collet	41bd10446e	Merge branch 'dev' into longOffsetMode	2018-03-05 13:10:10 -08:00
Yann Collet	cb789d2df8	re-inserted offset evaluation	2018-03-05 13:08:59 -08:00
Yann Collet	b91ddf0ae6	Merge branch 'dev' into longOffsetMode	2018-03-05 11:59:54 -08:00
Yann Collet	d02b44cf55	DYNAMIC_BMI2 enabled for clang clang only claims compatibility with gcc 4.2. Consequently, recent patch which reserved DYNAMIC_BMI2 for gcc >= 4.8 also disabled it for clang. fix : __clang__ is now enough to enable DYNAMIC_BMI2 (associated with other existing conditions : x64/x64, !bmi2)	2018-03-04 16:05:59 -08:00
Yann Collet	45b09e7625	limit DYNAMIC_BMI2 to gcc >= 4.8 attribute bmi2 not supported by gcc 4.4	2018-03-01 15:02:18 -08:00
Yann Collet	b01552a07a	force inlining of HUF_decodeSymbol*() functions which was not done properly by gcc 4.8 resulting in major performance difference. ex : zstd -b1 silesia.tar before : dec 680 MB/s after : dec 710 MB/s (without bmi2) after : dec 770 MB/s (with DYNAMIC_BMI2)	2018-03-01 11:31:45 -08:00
Yann Collet	ccb7184a76	Merge pull request #1026 from terrelln/lrm-window LDM manages its own window round buffer	2018-02-27 17:09:10 -08:00
Nick Terrell	0a0e64c641	LDM manages its own window round buffer	2018-02-27 12:13:23 -08:00
Yann Collet	2c4d3f339a	Merge pull request #1025 from facebook/huf Huf	2018-02-27 09:57:01 -08:00
Yann Collet	33a3f18848	fixed wrong size test	2018-02-26 18:27:51 -08:00
Yann Collet	89741653ab	added error code workSpace_tooSmall	2018-02-26 15:11:50 -08:00
Yann Collet	6cdf690441	minor cleaning of huff0 Update code documentation, and properly names a few "magic constants". Also, HUF_compress_internal() gets a cleaner way to determine size of tables inside workspace.	2018-02-26 14:52:23 -08:00
Nick Terrell	6b88d592fd	Reduce ZSTD_CHAINLOG_MAX to 29 in 32-bit mode	2018-02-26 13:30:24 -08:00
Nick Terrell	7e5e226cbf	Split the window state into substructure	2018-02-26 13:29:57 -08:00
Yann Collet	50bc2ce95e	Merge pull request #1021 from terrelln/lrm-split Split block compressor out of long range matcher	2018-02-23 17:36:51 -08:00
Yann Collet	653383f74a	minor nit from Mac XCode	2018-02-22 15:44:26 -08:00
Nick Terrell	7e2bf4ebad	Remove long range matcher immediate repcode check The compression ratio gets about 0.01% worse on the files I tested, but the code is much simpler.	2018-02-22 15:18:47 -08:00
Nick Terrell	af866b3a58	Split block compressor out of long range matcher * `ZSTD_ldm_generateSequences()` generates the LDM sequences and stores them in a table. It should work with any chunk size, but is currently only called one block at a time. * `ZSTD_ldm_blockCompress()` emits the pre-defined sequences, and instead of encoding the literals directly, it passes them to a secondary block compressor. The code to handle chunk sizes greater than the block size is currently commented out, since it is unused. The next PR will uncomment and exercise this code. * During optimal parsing, ensure LDM `minMatchLength` is at least `targetLength`. Also don't emit repcode matches in the LDM block compressor. Enabling the LDM with the optimal parser now actually improves the compression ratio. * The compression ratio is very similar to before. It is very slightly different, because the repcode handling is slightly different. If I remove immediate repcode checking in both branches the compressed size is exactly the same. * The speed looks to be the same or better than before. Up Next (in a separate PR) -------------------------- Allow sequence generation to happen prior to compression, and produce more than a block worth of sequences. Expose some API for zstdmt to consume. This will test out some currently untested code in `ZSTD_ldm_blockCompress()`.	2018-02-22 15:18:41 -08:00
Yann Collet	0fd4df6ed3	Implemented BMI2 functions directly within huf_decompress.c This makes it easier to edit for maintenance and evolutions (I plan to experiment modifications in huffman decompression functions). The methodology followed seems broadly applicable to other BMI2 modules. Performance was tracked rigorously at each step, there is no noticeable loss (nor win) of performance compared to `#include` version. Note however that 4X decoder variants tend to be extremely sensitive to code alignment. This source code resulted in pretty good performance for gcc 7.2 and 7.3, but future changes (even in other parts of the code) might trigger the issue again.	2018-02-22 10:51:47 -08:00
Yann Collet	9c5a8040a9	fixed huf_compress workspace size	2018-02-21 11:34:49 -08:00
Yann Collet	010ba5f71f	Merge pull request #1017 from terrelln/c-bmi2 [compress] Support BMI2	2018-02-20 15:34:59 -08:00


2242 Commits