AuroraMiddleware/zstd - zstd

Author	SHA1	Message	Date
Haydn Trigg	c6351021e4	Visual Studio 2017 build scripts	2018-03-11 00:15:31 +10:30
Yann Collet	8c6dbf490b	updated NEWS in preparation for v1.3.4	2018-03-09 16:22:34 -08:00
Yann Collet	5414b8ea01	Merge branch 'dev' of github.com:facebook/zstd into dev	2018-03-09 11:53:24 -08:00
Yann Collet	e916b9090e	gen_html: changed CFLAGS for CXXFLAGS since it's associated with $(CXX)	2018-03-09 11:52:14 -08:00
Yann Collet	51169575a8	Merge pull request #1036 from terrelln/thread-void: [threading] Cast unused arguments to void	2018-03-07 12:14:05 -08:00
Yann Collet	0379d83951	Merge pull request #1034 from facebook/longOffsetMode: Dynamic selection of long offset mode	2018-03-07 10:26:35 -08:00
Nick Terrell	7e103cdaf5	[threading] Cast unused arguments to void	2018-03-06 18:36:40 -08:00
Yann Collet	db147ea620	improved comments following @terrelln suggestions	2018-03-06 18:15:26 -08:00
Yann Collet	51262bd832	Merge pull request #1033 from facebook/benchDecode: fix benchmark issue when measuring decoding speed only	2018-03-06 17:55:23 -08:00
Yann Collet	06ca9c7d7c	fixed 0-seq blocks in block-decompression mode	2018-03-06 01:50:19 -08:00
Yann Collet	9a91afe6ef	long offset mode : new default threshold for 32-bit	2018-03-05 16:41:08 -08:00
Yann Collet	7bd7a3ad43	long offset mode : new default threshold for 64-bits mode	2018-03-05 16:16:49 -08:00
Yann Collet	c0393a538f	fixed counting long distance weights	2018-03-05 15:12:10 -08:00
Yann Collet	a70f7e10fa	Merge branch 'benchDecode' into longOffsetMode	2018-03-05 14:09:00 -08:00
Yann Collet	03e7e14192	fix benchmark issue when measuring only decoding speed. The zstd bench module can focus on decompression speed _only_. This is useful when measuring performance on large input data compressed at a high level, where compression time becomes problematic (too long). This mode is triggered by the command: zstd -b -d. Problem: in this mode, measured decoding speed was > 10% slower than in nominal mode (compression + decompression), making decompression benchmark mode much less useful. This patch fixes the issue. It's not completely clear why, but moving the `memcpy()` operation sooner in the pipeline fixed it. I can still measure some difference, but it is in the < 2% range, so it's much more tolerable. Also: the order in which the `-b` and `-d` commands are given no longer matters. The combination always triggers bench_decodeOnly mode.	2018-03-05 13:57:41 -08:00
Yann Collet	41bd10446e	Merge branch 'dev' into longOffsetMode	2018-03-05 13:10:10 -08:00
Yann Collet	cb789d2df8	re-inserted offset evaluation	2018-03-05 13:08:59 -08:00
Yann Collet	99afe72576	Merge pull request #1032 from facebook/bmi2: Enable DYNAMIC_BMI2 for clang	2018-03-05 13:03:24 -08:00
Yann Collet	b91ddf0ae6	Merge branch 'dev' into longOffsetMode	2018-03-05 11:59:54 -08:00
Yann Collet	403741130d	Merge pull request #1029 from cemeyer/dev: FIO_addFInfo: Fully initialize output 'total' struct	2018-03-05 11:49:48 -08:00
Yann Collet	d02b44cf55	DYNAMIC_BMI2 enabled for clang. clang only claims compatibility with gcc 4.2. Consequently, the recent patch which reserved DYNAMIC_BMI2 for gcc >= 4.8 also disabled it for clang. Fix: __clang__ is now enough to enable DYNAMIC_BMI2 (combined with the other existing conditions: x64/x64, !bmi2)	2018-03-04 16:05:59 -08:00
Yann Collet	3ba307b240	Merge pull request #1031 from facebook/inline48: force_inline HUF_decodeSymbol*()	2018-03-01 17:52:15 -08:00
Yann Collet	45b09e7625	limit DYNAMIC_BMI2 to gcc >= 4.8: attribute bmi2 not supported by gcc 4.4	2018-03-01 15:02:18 -08:00
Yann Collet	b01552a07a	force inlining of HUF_decodeSymbol*() functions, which gcc 4.8 did not inline properly, resulting in a major performance difference. Example: zstd -b1 silesia.tar; before: dec 680 MB/s; after: dec 710 MB/s (without bmi2); after: dec 770 MB/s (with DYNAMIC_BMI2)	2018-03-01 11:31:45 -08:00
Conrad Meyer	606374269c	FIO_addFInfo: Fully initialize output 'total' struct. Silence a Coverity warning about 'windowSize' being uninitialized. (Yes, nothing that calls this routine actually uses the windowSize value. Still, appeasing Coverity is pretty harmless in this case.)	2018-02-28 15:23:05 -08:00
Yann Collet	564cb1b640	update doc/README.md on provided tools to test 3rd party implementations	2018-02-27 17:37:05 -08:00
Yann Collet	ccb7184a76	Merge pull request #1026 from terrelln/lrm-window: LDM manages its own window round buffer	2018-02-27 17:09:10 -08:00
Nick Terrell	0a0e64c641	LDM manages its own window round buffer	2018-02-27 12:13:23 -08:00
Yann Collet	2c4d3f339a	Merge pull request #1025 from facebook/huf: Huf	2018-02-27 09:57:01 -08:00
Yann Collet	33a3f18848	fixed wrong size test	2018-02-26 18:27:51 -08:00
Yann Collet	d18d43aaf9	Merge pull request #1024 from terrelln/window-split: Split the window state into substructure	2018-02-26 17:18:33 -08:00
Yann Collet	89741653ab	added error code workSpace_tooSmall	2018-02-26 15:11:50 -08:00
Yann Collet	6cdf690441	minor cleaning of huff0. Update code documentation and properly name a few "magic constants". Also, HUF_compress_internal() gets a cleaner way to determine the size of tables inside the workspace.	2018-02-26 14:52:23 -08:00
Nick Terrell	6b88d592fd	Reduce ZSTD_CHAINLOG_MAX to 29 in 32-bit mode	2018-02-26 13:30:24 -08:00
Nick Terrell	7e5e226cbf	Split the window state into substructure	2018-02-26 13:29:57 -08:00
Yann Collet	50bc2ce95e	Merge pull request #1021 from terrelln/lrm-split: Split block compresser out of long range matcher	2018-02-23 17:36:51 -08:00
Yann Collet	653383f74a	minor nit from Mac XCode	2018-02-22 15:44:26 -08:00
Nick Terrell	7e2bf4ebad	Remove long range matcher immediate repcode check. The compression ratio gets about 0.01% worse on the files I tested, but the code is much simpler.	2018-02-22 15:18:47 -08:00
Nick Terrell	af866b3a58	Split block compresser out of long range matcher. * `ZSTD_ldm_generateSequences()` generates the LDM sequences and stores them in a table. It should work with any chunk size, but is currently only called one block at a time. * `ZSTD_ldm_blockCompress()` emits the pre-defined sequences, and instead of encoding the literals directly, it passes them to a secondary block compressor. The code to handle chunk sizes greater than the block size is currently commented out, since it is unused. The next PR will uncomment and exercise this code. * During optimal parsing, ensure LDM `minMatchLength` is at least `targetLength`. Also don't emit repcode matches in the LDM block compressor. Enabling the LDM with the optimal parser now actually improves the compression ratio. * The compression ratio is very similar to before. It is very slightly different, because the repcode handling is slightly different. If I remove immediate repcode checking in both branches the compressed size is exactly the same. * The speed looks to be the same or better than before. Up next (in a separate PR): allow sequence generation to happen prior to compression, and produce more than a block worth of sequences. Expose some API for zstdmt to consume. This will test out some currently untested code in `ZSTD_ldm_blockCompress()`.	2018-02-22 15:18:41 -08:00
Yann Collet	4fb071ec3c	Merge pull request #1022 from facebook/bmi2IntoC: Implemented BMI2 functions directly within huf_decompress.c	2018-02-22 14:30:43 -08:00
Yann Collet	0fd4df6ed3	Implemented BMI2 functions directly within huf_decompress.c. This makes it easier to edit for maintenance and evolutions (I plan to experiment with modifications in huffman decompression functions). The methodology followed seems broadly applicable to other BMI2 modules. Performance was tracked rigorously at each step; there is no noticeable loss (nor win) of performance compared to the `#include` version. Note however that 4X decoder variants tend to be extremely sensitive to code alignment. This source code resulted in pretty good performance for gcc 7.2 and 7.3, but future changes (even in other parts of the code) might trigger the issue again.	2018-02-22 10:51:47 -08:00
Yann Collet	4d6632c8f3	Merge pull request #1020 from facebook/betterBench: updated fullbench measurement methodology	2018-02-21 14:51:39 -08:00
Yann Collet	6e481504ee	fullbench includes assert.h as it is missing for Windows	2018-02-21 11:42:23 -08:00
Yann Collet	9c5a8040a9	fixed huf_compress workspace size	2018-02-21 11:34:49 -08:00
Yann Collet	364ce19463	update fullbench measurement methodology to use fewer calls to time(), like bench.c. Also upgraded accuracy to nanoseconds.	2018-02-21 09:43:32 -08:00
Yann Collet	993ffffba3	Merge pull request #1019 from facebook/betterBench: improve benchmark measurement for small inputs	2018-02-21 05:47:08 -08:00
Yann Collet	25d00d10fc	fixed minor conversion warning	2018-02-20 16:52:28 -08:00
Yann Collet	010ba5f71f	Merge pull request #1017 from terrelln/c-bmi2: [compress] Support BMI2	2018-02-20 15:34:59 -08:00
Yann Collet	3538a535bf	use TIMELOOP_NANOSEC as suggested by @terrelln	2018-02-20 15:33:56 -08:00
Yann Collet	d3364aa39e	improve benchmark measurement for small inputs by invoking time() once per batch, instead of once per compression / decompression. The batch is dynamically resized so that each round lasts approximately 1 second. Also: increases time accuracy to nanoseconds.	2018-02-20 14:58:40 -08:00
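
The batch-timing idea described in commit d3364aa39e (read the clock once per batch, then resize the batch so a round lasts about one second) can be sketched roughly as follows. This is an illustrative Python sketch, not zstd's C code: the names `next_batch_size` and `bench`, the injectable `clock` parameter, and the 10x growth step for unmeasurably fast rounds are all assumptions made for the example.

```python
import time


def next_batch_size(current, elapsed_ns, target_ns=1_000_000_000):
    """Rescale the batch so the next round lasts roughly target_ns."""
    if elapsed_ns <= 0:
        # Round was too fast to measure: grow the batch and try again.
        # (The factor 10 is an arbitrary choice for this sketch.)
        return current * 10
    return max(1, current * target_ns // elapsed_ns)


def bench(fn, rounds=3, target_ns=1_000_000_000, clock=time.perf_counter_ns):
    """Return the best observed time per call of fn, in nanoseconds.

    The clock is read once before and once after each batch, never
    once per call, so clock overhead is amortized over the whole batch.
    """
    batch = 1
    best = None
    for _round in range(rounds):
        start = clock()
        for _call in range(batch):
            fn()
        elapsed = clock() - start
        if elapsed > 0:
            per_call = elapsed / batch
            best = per_call if best is None else min(best, per_call)
        batch = next_batch_size(batch, elapsed, target_ns)
    return best
```

A call such as `bench(lambda: sum(range(1000)))` starts with a batch of one call and rescales after every round. Passing a fake `clock` makes the resizing logic testable without real timing.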

4836 Commits