AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Yann Collet	71f012e5bf	zstdcli: fixed minor warning when bench module not enabled one variable defined but not used	2017-12-01 17:42:46 -08:00
Yann Collet	a1b24e6262	Merge pull request #938 from terrelln/time Use util.h for timing	2017-12-01 16:40:38 -08:00
Nick Terrell	dab8cfa3c7	Combine definitions of SEC_TO_MICRO	2017-11-30 19:40:53 -08:00
Nick Terrell	9a2f6f477b	Use util.h for timing	2017-11-30 14:57:25 -08:00
Yann Collet	2f22a6ec50	Merge branch 'dev' into opt3	2017-11-28 15:03:58 -08:00
Yann Collet	0a0a212934	zstd_opt: changed cost formula There was a flaw in the formula which compared literal cost with match cost : at a given position, a non-null literal suite is going to be part of next sequence, while if position ends a previous match, to immediately start another match, next sequence will have a litlength of zero. A litlength of zero has a non-null cost. It follows that literals cost should be compared to match cost + litlength==0. Not doing so gave a structural advantage to matches, which would be selected more often. I believe that's what led to the creation of the strange heuristic which added a complex cost to matches. The heuristic was actually compensating. It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar). The problem with this heuristic is that it's hard to understand, and unfortunately, any future change in the parser would impact the way it should be calculated and its effects. The "proper" formula makes it possible to remove this heuristic. Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse. Note that all differences are small (< 0.01 ratio). In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7). I suspect that's because starting statistics are pretty poor (another area of improvement). However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...). It's a pity that zstd -22 gets worse on silesia.tar. That being said, I like that the new code gets rid of strange variables, which were introducing complexity for any future evolution (faster variants being in mind). Therefore, in spite of this detrimental side effect, I tend to be in favor of it.	2017-11-28 14:07:03 -08:00
W. Felix Handte	baff9dd15e	Fix LZ4 Compression Buffer Overflow Fixes issue where, when `zstd --format=lz4` is fed an input larger than 128KB, the read overruns the input buffer. This changes Zstd to use LZ4 with chained 64KB blocks. This is technically a breaking change in that some third party LZ4 implementations may not support linked blocks. However, progress should not be allowed to be stopped by such petty concerns as backwards compatibility!	2017-11-28 12:07:26 -05:00
Yann Collet	743b23878e	install: changed variable MANDIR into MAN1DIR MANDIR still exists, and is now the parent of MAN1DIR	2017-11-27 13:47:35 -08:00
Yann Collet	2fd765498a	updated man page following patch #931 by @scottchiefbaker	2017-11-24 17:20:54 -08:00
Yann Collet	c857ee850a	minor update	2017-11-24 16:44:28 -08:00
Scott Baker	31a191b178	Include information about the benchmark output/methodology Addresses #930	2017-11-22 20:34:25 -08:00
Yann Collet	daebc7fe26	bench: slightly adjusted display format adapt accuracy depending on value. makes it possible to have higher accuracy for small value, notably small compression speed. This capability is expected to be useful while modifying optimal parser.	2017-11-18 15:54:32 -08:00
Nick Terrell	a6052af0e8	[zstd] Fix rare bug with signal handler	2017-11-17 16:38:56 -08:00
Yann Collet	5b957ba899	minor interface adjustments	2017-11-17 01:21:40 -08:00
Yann Collet	d898fb7ba6	bench: added cli command `-S` to benchmark multiple files separately Currently, all files are joined by default, they are compressed separately but benchmarked together, providing a single final result. Benchmarking files separately makes it possible to accurately measure difference for each file. This is expected to be useful while tuning optimal parser.	2017-11-17 00:22:55 -08:00
Yann Collet	8accfa7fcc	bench: realTime is a global parameter like most parameters not directly related to compression	2017-11-17 00:02:37 -08:00
Yann Collet	9a11f70dc3	merged repcode search into BT match search this version has same speed as branch `opt` which is itself 5-10% slower than branch `dev` (no identified reason) It does not compress exactly the same as `opt` or `dev`, maybe because it doesn't stop search after repcodes, leading to sometimes better compression, sometimes worse (by a small margin). warning : _extDict path does not work for the time being This means that benchmark module works, but file module will fail with large files (and high compression level). Objective is to fuse _extDict path into current one, in order to have a single parser to maintain.	2017-11-13 02:23:48 -08:00
Yann Collet	6f1dfa8adf	removed line with `//` comment this is for a different topic (better parameter adaptation for small files + dictionary and/or custom parameters)	2017-11-01 17:01:45 -07:00
Yann Collet	428e8b3bf4	fix : ZSTD_compress_generic(,,,ZSTD_e_end) automatically sets pledgedSrcSize as per documentation, on ZSTD_setPledgedSrcSize() : > If all data is provided and consumed in a single round, > this value (pledgedSrcSize) is overriden by srcSize instead. This wasn't applied before compression level is transformed into compression parameters. As a consequence, small input missed compression parameters adaptation. It seems to work fine now : compression was compared with ZSTD_compress_advanced(), results were the same.	2017-11-01 13:15:23 -07:00
Nick Terrell	b495140f67	Update BUCK files * Correct XXH namespace (Fixes #901) * Multithreading always enabled * GZIP/LZ4/LZMA always enabled * Legacy support always fully enabled	2017-10-25 12:47:57 -07:00
Yann Collet	91535d71ec	fixed missing zstdmt_compress.h dependency we lose a warning message : when a job size is chosen < minimum job size for multithreading, it is automatically resized to minimum size. If this information is really useful, it should be present in zstd.h now.	2017-10-19 12:09:34 -07:00
Yann Collet	eac42534fe	bench: fixed Visual warning regarding struct initialization also : removed dependency on zstdmt_compress.h removed several unused macros fileio : small code refactoring to reduce some variable scope	2017-10-19 11:56:14 -07:00
Yann Collet	d3b9547aa4	IO and bench : ZSTD_NEWAPI is the only remaining code path removed the other 2 code paths (single thread, and ZSTDMT ones) keeping only the new advanced API, for easier code coverage. It shall also fix identified issue with Visual Studio which doesn't have ZSTD_NEWAPI defined.	2017-10-18 17:01:53 -07:00
Yann Collet	300e1df0a3	fixed wrong test to display compression status	2017-10-18 11:41:52 -07:00
Yann Collet	18b795374a	UTIL_getFileSize() returns UTIL_FILESIZE_UNKNOWN on failure UTIL_getFileSize() used to return zero on failure. This made it impossible to distinguish a failure from a genuine empty file. Both cases were coalesced. Adding UTIL_FILESIZE_UNKNOWN constant has many consequences on user code, since in many places, the `0` was assumed to mean "error". This is no longer the case, and the error code must be actively checked.	2017-10-17 16:14:25 -07:00
Yann Collet	32c9f715ae	fixed : Visual build compressing stdin with multi-threading enabled fails There were multiple reasons stacked : - Visual uses a different code path, because ZSTD_NEWAPI is not defined - fileio.c sends `0` as `pledgedSrcSize` to mean `ZSTD_CONTENTSIZE_UNKNOWN` (fixed) - ZSTDMT_resetCCtx() interpreted `0` as "empty" instead of "unknown" (fixed)	2017-10-17 14:07:43 -07:00
Yann Collet	fc8d293460	dictionary compression use correct file size estimation when determining compression parameters to compress one file only. For multiple files, it still "bets" that files are going to be small. There was also a bug recently added in ZSTD_CCtx_loadDictionary_advanced() making it incapable to use pledgedSrcSize to determine compression parameters.	2017-10-14 01:21:43 -07:00
Yann Collet	9ef32b3cf1	minor : zstd -l -v display each file name	2017-10-14 00:02:32 -07:00
Yann Collet	dd18d73e7e	fileio: content size is enabled by default	2017-10-13 16:32:18 -07:00
Nick Terrell	6dd958eea2	[zstdcli] Add window size to verbose list ``` > zstd --list -v file1 file2 file3 * zstd command line interface 64-bits v1.3.2, by Yann Collet * Window Size: 512.00 KB (524288 B) Compressed Size: 0.02 KB (19 B) Check: XXH64 Window Size: 8192.00 KB (8388608 B) Compressed Size: 0.02 KB (19 B) Check: XXH64 Window Size: 512.00 KB (524288 B) Compressed Size: 0.01 KB (15 B) Check: None ```	2017-10-04 12:26:28 -07:00
Yann Collet	3b27ed41fd	Merge branch 'srcSize' into dev	2017-10-02 16:34:14 -07:00
Yann Collet	4946993f87	removed isRegularFile parameter no longer useful : size of src is determined for each file.	2017-10-02 12:29:25 -07:00
Yann Collet	7f580f9ee8	interruption handler and variable are static	2017-10-02 11:39:05 -07:00
Yann Collet	fe5444bc66	removed the statement for all versions of Visual Studio	2017-10-02 02:02:16 -07:00
Yann Collet	51d82d5516	same error in Visual Studio 2012 ...	2017-10-02 01:12:40 -07:00
Yann Collet	ed7ae4c9bd	The issue also impacts Visual Studio 2010	2017-10-02 00:45:28 -07:00
Yann Collet	6e7ba3df2f	added (void)sig to avoid compilers complaining that sig is not used.	2017-10-02 00:19:47 -07:00
Yann Collet	82bc200f82	conditionally removed invocation that generates a buggy warning with Visual Studio 2008	2017-10-02 00:02:24 -07:00
Yann Collet	bd18095edc	blindfix for Visual : minor casting issue should not happen since SIG_IGN is provided by <signal.h>, so it should work "out of the box"	2017-10-01 15:32:48 -07:00
Yann Collet	00fc1ba8dd	cli: add Ctrl-C support, requested by @mike155 in #854 Now, pressing Ctrl-C during compression or decompression will erase operation artefact (unfinished destination file) before leaving execution.	2017-10-01 12:10:26 -07:00
Yann Collet	e580dc6a4a	Merge pull request #860 from felixhandte/zstd-lz4-support-tests Add Default LZ4 Support When Available	2017-09-29 22:32:54 -07:00
Yann Collet	8afb151c9b	cli: fixed wrong initialization in MT mode It's not good to mix old and new API ZSTD_resetCStream() doesn't just set pledgedSrcSize : it also sets the CCtx for a single thread compression. Problem is, when 2+ threads are defined in cctx->requestedParams, ZSTD_compress_generic() will want to start MT compression, since initialization is supposed to have already happened (thanks to ZSTD_resetCStream()) except that the underlying ZSTDMT_CCtx* object is not created, resulting in a segfault. This is an invalid construction (correct one is to use ZSTD_CCtx_setPledgedSrcSize()). I haven't found a nice way to mitigate this impact if someone makes the same mistake. At some point, removing the old API to keep only the new API within fileio.c will limit these risks.	2017-09-29 22:14:37 -07:00
Yann Collet	fbd5ab7027	minor fix : no longer use fake srcSize during resource creation srcSize is read and provided at each file, not at resource creation. This used to be useful with older API, because it could not re-adapt parameters between sessions. At some point, it will be better to remove the old code, and only keep the new_api. It works fine by now.	2017-09-29 19:40:27 -07:00
Yann Collet	db1668a43b	fix : srcSize written in frame header when multiple files compressed This information used to be disabled when nbFiles>1. It was badly initialized later in the code, resulting in an error.	2017-09-29 18:05:18 -07:00
Yann Collet	8afcc80e07	decode more data before triggering error fixes #874 : when a frame is not properly terminated by a "last block" signal, zstd -d used to detect it immediately and error out. This version will decode and flush the last block, and only then issue an error.	2017-09-29 15:54:09 -07:00
W. Felix Handte	dc27c36495	Update documentation to reflect other format support	2017-09-28 19:43:12 -04:00
W. Felix Handte	d0519d4b0c	Add CLI Program Name Detection for LZ4	2017-09-28 19:18:15 -04:00
Yann Collet	54a827fff0	Merge branch 'dev' into newFormats Fixed conflicts in zstdmt_compress.c	2017-09-27 16:39:40 -07:00
Yann Collet	3182ea2e64	Merge pull request #866 from facebook/list improved --list display	2017-09-27 16:34:29 -07:00
Nick Terrell	c233bdbaee	Increase maximum window size * Maximum window size in 32-bit mode is 1GB, since allocations for 2GB fail on my Mac. * Maximum window size in 64-bit mode is 2GB, since that is the largest power of 2 that works with the overflow prevention. * Allow `--long=windowLog` to set the window log, along with `--zstd=wlog=#`. These options also set the window size during decompression, but don't override `--memory=#` if it is set. * Present a helpful error message when the window size is too large during decompression. * The long range matcher defaults to a hash log 7 less than the window log, which keeps it at 20 for window log 27. * Keep the default long range matcher window size and the default maximum window size at 27 for the API and CLI. * Add tests that use the maximum window size and hash size for compression and decompression.	2017-09-26 14:00:01 -07:00
Yann Collet	3095ca8c56	fixed minor conversion warnings for g++ on Linux U64 is not considered equivalent to unsigned long long	2017-09-26 13:53:50 -07:00
Yann Collet	56f1f0e3dd	write summary for --list on multiple files	2017-09-26 11:21:36 -07:00
Yann Collet	62568c9a42	added capability to generate magic-less frames decoder not implemented yet	2017-09-25 14:26:26 -07:00
W. Felix Handte	360238733a	Adds LZ4 support by default if LZ4 is available Simple makefile change + quick typename change Test: make clean make # successfully produces binary without lz4 support make clean # with flags to pick up my lz4 build make MOREFLAGS="-L/home/felixh/prog/lz4/lib -I/home/felixh/prog/lz4/lib" # successfully produces binary with lz4 support echo "TEST TEST TEST THIS IS A TEST STRING PLEASE TEST THIS PLEASE OK THANK YOU" | ./lz4/lz4 | LD_LIBRARY_PATH=/home/felixh/prog/lz4/lib ./zstd/zstd -d # successfully prints TEST TEST TEST THIS IS A TEST STRING PLEASE TEST THIS PLEASE OK THANK YOU	2017-09-22 13:28:56 -07:00
Yann Collet	b0c0e3a3fb	Merge pull request #853 from terrelln/blog [zstdcli] Fix LDM advanced options parsing	2017-09-18 15:21:23 -07:00
Nick Terrell	1fe762e236	[zstdcli] Fix LDM advanced options parsing	2017-09-18 14:49:35 -07:00
Yann Collet	92889709f9	fix #851 : sudo zstd -t file.zst changes /dev/null permissions reported by @mike155	2017-09-18 13:41:54 -07:00
Yann Collet	1722055799	add comment on using -B# to split input file for dictionary training	2017-09-15 16:23:50 -07:00
Yann Collet	c68d17f2da	ensures that sampleSizes table is large enough as recommended by @terrelln	2017-09-15 15:31:31 -07:00
Yann Collet	25a60488dd	fixed 64-to-32 conversion warnings	2017-09-15 11:55:13 -07:00
Yann Collet	a9694231ca	fixed minor conversion warning	2017-09-15 10:16:26 -07:00
Yann Collet	086b9597d9	added ability to split input files for dictionary training using command -B# This is the same behavior as benchmark module, which can also split input into arbitrary size blocks, using -B#.	2017-09-14 16:45:10 -07:00
Yann Collet	77c137b3ae	minor comment refactor	2017-09-14 15:12:57 -07:00
Yann Collet	f1571dad8f	Merge pull request #838 from stellamplau/ldm-mergeDev Add long distance matcher	2017-09-13 13:24:08 -07:00
Yann Collet	8f26dc3f9c	blindfix for Visual LARGE_INTEGER is not an integer : https://msdn.microsoft.com/en-us/library/windows/desktop/aa383713(v=vs.85).aspx Do not take any risk with the structure definition : use int init = 0; like Mac code	2017-09-12 21:21:17 -07:00
Yann Collet	bc41c7f0eb	fixed minor prototype warning	2017-09-12 19:32:26 -07:00
Yann Collet	c95c0c9725	modified util::time API for easier invocation. - no longer expose frequency timer : it's either useless, or stored internally in a static variable (init is only necessary once). - UTIL_getTime() provides result by function return.	2017-09-12 18:12:46 -07:00
Nick Terrell	6ab4d5e904	[bench] Use higher resolution timer on POSIX The timer used was only accurate up to 0.01 seconds. This timer is accurate up to 1 ns. It is a monotonic timer that measures the real time difference, not on CPU time.	2017-09-12 16:46:44 -07:00
Stella Lau	eb3327c10a	Merge branch 'dev' of https://github.com/facebook/zstd into ldm-mergeDev	2017-09-11 15:00:01 -07:00
Yann Collet	3128e03be6	updated license header to clarify dual-license meaning as "or"	2017-09-08 00:09:23 -07:00
Yann Collet	baa37c3362	programs/Makefile : better support for GNU conventions see https://www.gnu.org/prep/standards/html_node/Command-Variables.html	2017-09-06 16:53:59 -07:00
Stella Lau	eeff55dfa8	Merge remote-tracking branch 'upstream/dev' into ldm-mergeDev	2017-09-06 15:56:32 -07:00
Stella Lau	8c33cfe0bc	Add ldm documentation in README	2017-09-06 15:21:01 -07:00
Stella Lau	c706de5395	Rename and add short ldm parameters in cli	2017-09-05 21:11:18 -07:00
Stella Lau	643d28c701	Add ldm options to 'man zstd'	2017-09-05 11:27:15 -07:00
Stella Lau	67d4a6161c	Add ldmBucketSizeLog param	2017-09-02 21:55:29 -07:00
Stella Lau	a1f04d518d	Move hashEveryLog to cctxParams and update cli	2017-09-01 15:05:47 -07:00
Yann Collet	0558850735	bench stops immediately on decoding error	2017-09-01 11:46:15 -07:00
Stella Lau	17d8e0bdcc	Merge remote-tracking branch 'upstream/longRangeMatcher' into ldm-integrate	2017-09-01 10:19:38 -07:00
Stella Lau	8081becadc	Add long distance matching as a CCtxParam	2017-09-01 09:18:58 -07:00
Yann Collet	d7ad99b2ab	Merge branch 'longRangeMatcher' into dev	2017-08-31 18:08:37 -07:00
Yann Collet	c7818fc676	Merge branch 'modTests' into dev fixed conflict	2017-08-31 17:00:16 -07:00
Yann Collet	4299c27132	improved console log of utils.h removed a warning when compiling on Windows	2017-08-31 16:58:47 -07:00
Yann Collet	8e298382a8	changed target allarch into allzstd allzstd contains only zstd-related tests. allmost = allzstd + zwrapper tests (which require zlib)	2017-08-31 14:30:52 -07:00
Stella Lau	6a546efb8c	Add long distance matcher Move last literals section to ZSTD_block_internal	2017-08-31 12:53:19 -07:00
Yann Collet	b0cb081dc8	last batch of header files changed to reflect new license (#825 ) only remains to update contrib/linux-kernel (@terrelln)	2017-08-31 12:20:50 -07:00
Stella Lau	c88fb9267f	Replace 'byReference' with enum	2017-08-29 11:55:02 -07:00
Bernhard M. Wiedemann	cf689b84f9	Sort input file list in order to make builds reproducible in spite of nondeterministic filesystem readdir order. See https://reproducible-builds.org/ for why this is good.	2017-08-26 17:08:00 +02:00
Stella Lau	6f1a21c7e9	Remove formatting-only changes	2017-08-23 10:24:19 -07:00
Yann Collet	232d62b637	fixed a few headers that were too hastily copy/pasted during last license change	2017-08-21 11:24:32 -07:00
Stella Lau	91b30dbe84	Remove test parameter	2017-08-21 10:09:06 -07:00
Stella Lau	f181f33bdf	Disable tests and refactor	2017-08-21 01:59:08 -07:00
Stella Lau	023b24e6d4	Add cctx param tests	2017-08-20 22:55:07 -07:00
Yann Collet	1c108c811e	cli : Display supported formats on -vV command Requested and inspired by patch from @ib (#771)	2017-08-19 13:33:50 -07:00
Yann Collet	2ecd34ee5e	fixed unused variables warnings	2017-08-19 01:23:49 -07:00
Yann Collet	23706fb743	updated doc on compilation variables	2017-08-19 01:14:36 -07:00
Yann Collet	9203003d5f	fixed zstd-nolegacy and added it to allVariants for CI testing	2017-08-19 01:01:53 -07:00
Yann Collet	4b387729b6	fixed zstd-small and added it to shortest for CI tests	2017-08-19 00:48:29 -07:00
Yann Collet	7729ab83bb	Merge branch 'dev' into variants	2017-08-19 00:37:06 -07:00
Yann Collet	32e943b3ef	Merge branch 'dev' of github.com:facebook/zstd into dev	2017-08-19 00:36:37 -07:00


1180 Commits