AuroraMiddleware/zstd - zstd - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Nick Terrell	48a6427d22	[libzstd] Fix ZSTD_compress2() for multithreaded compression `ZSTD_compress2()` wouldn't wait for multithreaded compression to finish. We didn't find this because ZSTDMT will block when it can compress all in one go, but it can't do that if it doesn't have enough output space, or if `ZSTD_c_rsyncable` is enabled. Since we will already sometimes block when using `ZSTD_e_end`, I've changed `ZSTD_e_end` and `ZSTD_e_flush` to guarantee maximum forward progress. This simplifies the API, and helps users avoid the easy bug that was made in `ZSTD_compress2()` * Found by the libfuzzer fuzzers. * Added a test case that catches the problem. * I will make the fuzzers sometimes allocate less than `ZSTD_compressBound()` output space.	2019-04-09 16:24:17 -07:00
Nick Terrell	7a1fde2957	[fuzzer] Add dictionary fuzzers	2019-04-08 21:07:28 -07:00
Nick Terrell	462918560c	[fuzzer] Fix stream_round_trip for the new options	2019-04-08 21:06:19 -07:00
Nick Terrell	f871b5144e	[fuzz] Use the new advanced API	2019-04-08 20:01:38 -07:00
Nick Terrell	e649fad7aa	[dictBuilder] Fix displayLevel for corpus warning Pass the displaylevel into the corpus warning, because it is used in fast cover and cover, so it needs to respect the local level.	2019-04-08 20:00:18 -07:00
Nick Terrell	bfcd5b81d7	[libzstd] Don't check the dictID in fuzzing mode When `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is defined, don't check the dictID. This check makes the fuzzer's job harder, and it is at the very beginning.	2019-04-08 19:57:41 -07:00
Nick Terrell	1a90133b15	Merge pull request #1575 from terrelln/zstdmt [libzstd] Remove ZSTDMT from the shared library	2019-04-08 16:51:40 -07:00
Nick Terrell	947548c24f	Remove double the from README	2019-04-08 16:50:18 -07:00
Nick Terrell	641e594309	[libzstd] Remove ZSTDMT from the shared object * Remove ZSTDMT from the shared object by default. * Provide a macro `ZSTD_LEGACY_MULTITHREADED_API` to override it. * Document it in `lib/README.md`.	2019-04-07 18:47:52 -07:00
Nick Terrell	d5910a5d94	Merge pull request #1574 from terrelln/examples Stabilize ZSTD_getDictID_*() functions and clean up examples	2019-04-05 23:29:32 -07:00
Nick Terrell	1d0c1707d1	[examples] Clean up and comment the examples	2019-04-05 21:02:07 -07:00
Nick Terrell	1dfe37fea9	[libzstd] Stabilize ZSTD_getDictID_*() functions	2019-04-05 18:59:30 -07:00
Nick Terrell	ce388fe4d2	[libzstd] Fix return value docs for ZSTD_compressStream2()	2019-04-05 17:44:07 -07:00
Nick Terrell	a63aaaa2cc	Merge pull request #1573 from terrelln/regression [regression] Update results.csv for level 1 change	2019-04-05 11:06:35 -07:00
Nick Terrell	dbc8a59a0a	Merge pull request #1569 from terrelln/stable Stabilize the advanced API	2019-04-05 10:47:48 -07:00
Nick Terrell	50c634b86e	[regression] Update results.csv for level 1 change	2019-04-05 10:46:22 -07:00
Nick Terrell	7231ea72a8	[libzstd] Reword the streaming docs for the new API	2019-04-03 19:21:05 -07:00
Nick Terrell	cf7d601bf5	Move the dictionary API and mark the legacy API * Move the dictionary API below the streaming API * Mark the legacy streaming API as redundant	2019-04-03 19:16:40 -07:00
Nick Terrell	d7d89513d6	Stabilize advanced API This commit moves the candidate advanced API to the stable section. It makes some minor whitespace changes, but it doesn't change any of the wording of the documentation. I'll put up a separate PR that tweaks some of the documentation once this lands, so that it is easier to review. NOTE: Even though these functions are now in stable, they aren't stable until the next release (in under 1 month). It is possible that they change until then.	2019-04-03 18:43:20 -07:00
Nick Terrell	0827edeace	[libzstd] Bump the library version to 1.4.0 Bumps the library version to 1.4.0 in preparation to stabilize the advanced API.	2019-04-03 18:43:20 -07:00
Nick Terrell	72a3fbc0e4	Merge pull request #1562 from terrelln/2fast [libzstd] Speed up single segment zstd_fast by 5%	2019-04-03 18:08:15 -07:00
Nick Terrell	56261001ea	Merge pull request #1567 from terrelln/examples2 [examples] Update streaming_decompression.c	2019-04-03 11:27:49 -07:00
Yann Collet	816a3f47c7	Merge pull request #1568 from terrelln/examples3 Update streaming_memory_usage.c and fix ZSTD_estimateCStreamSize_usingCCtxParams()	2019-04-03 09:07:13 -07:00
Nick Terrell	cdc8ae2e9b	[examples] Update streaming_memory_usage.c Update to use the new streaming API. Making progress on Issue #1548. Tested that the checks don't fail. Tested with window log 9-32. The lowest and highest fail as expected.	2019-04-02 19:20:57 -07:00
Nick Terrell	00679da22b	[libzstd] Setting ZSTD_d_maxWindowLog to 0 means default	2019-04-02 19:20:52 -07:00
Nick Terrell	95624b77e4	[libzstd] Speed up single segment zstd_fast by 5% This PR is based on top of PR #1563. The optimization is to process two input pointers per loop. It is based on ideas from [igzip] level 1, and talking to @gbtucker.

| Platform                | Silesia | Enwik8 |
|-------------------------|---------|--------|
| OSX clang-10            | +5.3%   | +5.4%  |
| i9 5 GHz gcc-8          | +6.6%   | +6.6%  |
| i9 5 GHz clang-7        | +8.0%   | +8.0%  |
| Skylake 2.4 GHz gcc-4.8 | +6.3%   | +7.9%  |
| Skylake 2.4 GHz clang-7 | +6.2%   | +7.5%  |

Testing on all Silesia files on my Intel i9-9900k with gcc-8

| Silesia File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| silesia.tar  | +0.17%       | +6.6%        |
| dickens      | +0.25%       | +7.0%        |
| mozilla      | +0.02%       | +6.8%        |
| mr           | -0.30%       | +10.9%       |
| nci          | +1.28%       | +4.5%        |
| ooffice      | -0.35%       | +10.7%       |
| osdb         | +0.75%       | +9.8%        |
| reymont      | +0.65%       | +4.6%        |
| samba        | +0.70%       | +5.9%        |
| sao          | -0.01%       | +14.0%       |
| webster      | +0.30%       | +5.5%        |
| xml          | +0.92%       | +5.3%        |
| x-ray        | -0.00%       | +1.4%        |

Same tests on Calgary. For brevity, I've only included files where compression ratio regressed or was much better.

| Calgary File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| calgary.tar  | +0.30%       | +7.1%        |
| geo          | -0.14%       | +25.0%       |
| obj1         | -0.46%       | +15.2%       |
| obj2         | -0.18%       | +6.0%        |
| pic          | +1.80%       | +9.3%        |
| trans        | -0.35%       | +5.5%        |

We gain 0.1% of compression ratio on Silesia. We gain 0.3% of compression ratio on enwik8. I also tested on the GitHub and hg-commands datasets without a dictionary, and we gain a small amount of compression ratio on each, as well as speed.

I tested the negative compression levels on Silesia on my Intel i9-9900k with gcc-8:

| Level | Ratio Change | Speed Change |
|-------|--------------|--------------|
| -1    | +0.13%       | +6.4%        |
| -2    | +4.6%        | -1.5%        |
| -3    | +7.5%        | -4.8%        |
| -4    | +8.5%        | -6.9%        |
| -5    | +9.1%        | -9.1%        |

Roughly, the negative levels now scale half as quickly. E.g. the new level 16 is roughly equivalent to the old level 8, but a bit quicker and smaller. If you don't think this is the right trade off, we can change it to multiply the step size by 2, instead of adding 1. I think this makes sense, because it gives a bit slower ratio decay.

[igzip]: https://github.com/01org/isa-l/tree/master/igzip	2019-04-02 19:02:50 -07:00
Nick Terrell	de58910b5a	[examples] Update streaming_decompression.c Update to use the new streaming API. Making progress on Issue #1548. Tested that it can decompress files produced by `streaming_compression`. Tested that it can decompress two frames concatenated together. Tested that it fails on corrupted data.	2019-04-02 18:52:59 -07:00
Nick Terrell	882ceb86bc	Merge pull request #1566 from terrelln/examples [examples] Update multiple_streaming_compression.c	2019-04-02 17:13:10 -07:00
Nick Terrell	56682a7709	Fix ZSTD_estimateCStreamSize_usingCCtxParams() It wasn't using the ZSTD_CCtx_params correctly. It must actualize the compression parameters by calling ZSTD_getCParamsFromCCtxParams() to get the real window log. Tested by updating the streaming memory usage example in the next commit. The CHECK() failed before this patch, and passes after. I also added a unit test to zstreamtest.c that failed before this patch, and passes after.	2019-04-01 18:02:52 -07:00
Nick Terrell	04325cbc2f	Fix indentation	2019-04-01 17:33:49 -07:00
Nick Terrell	fb13d757af	[examples] Update multiple_streaming_compression.c Update to use the new streaming API. Making progress on Issue #1548. Tested that multiple files could be compressed, and that the output is the same as calling `streaming_compression` multiple times with the same compression level, and that it can be decompressed.	2019-04-01 16:41:06 -07:00
Nick Terrell	425ce5547c	Merge pull request #1563 from terrelln/dms-sep [libzstd] Split out zstd_fast dict match state function	2019-03-29 16:19:21 -06:00
Nick Terrell	f00407b640	Split out zstd_fast dict match state function	2019-03-29 10:39:16 -06:00
Nick Terrell	6625f3b390	Merge pull request #1561 from shakeelrao/fix-typo Update comments in zstd.h and fileio.c	2019-03-28 23:42:16 -06:00
shakeelrao	dca73db30c	fix srcSize typo and add new UTIL func to comment	2019-03-28 17:50:34 -07:00
Nick Terrell	dcc6c7e9ae	Merge pull request #1556 from terrelln/dictbuilder [cover] Improvements for small or homogeneous data	2019-03-25 15:08:32 -07:00
Nick Terrell	440f390cba	Merge pull request #1557 from terrelln/examples [examples] Update streaming_compression to the new API	2019-03-25 15:07:35 -07:00
Nick Terrell	7186a50775	Merge pull request #1559 from shakeelrao/reject-dict [CLI] ensure dictionary and input file are different	2019-03-25 15:06:58 -07:00
shakeelrao	44f77b5c71	Add whitespace to test case	2019-03-24 03:42:11 -07:00
shakeelrao	b25d7eacf2	Rename test	2019-03-24 03:40:03 -07:00
shakeelrao	2b4491d81a	Add CLI test to validate error	2019-03-24 00:47:13 -07:00
shakeelrao	5333e41ab3	Add NULL check for dict	2019-03-24 00:23:50 -07:00
shakeelrao	8ea219d8c6	Modify error msg	2019-03-23 21:59:30 -07:00
shakeelrao	1290933d19	Implement file check	2019-03-23 21:53:13 -07:00
shakeelrao	e5811e5520	Extract file comparison into utility func	2019-03-23 19:04:56 -07:00
Nick Terrell	f5cbee988b	[examples] Update streaming_compression to the new API	2019-03-23 15:59:26 -07:00
Nick Terrell	d97605ad85	Merge pull request #1558 from nehaljwani/fix-version-soversion-libzstd [libzstd] Specify soversion and version correctly for CMake build	2019-03-23 13:32:39 -07:00
Nehal J Wani	7ac2052dbc	[libzstd] Specify soversion and version correctly for CMake build Fixes #1512	2019-03-23 17:37:37 +05:30
Nick Terrell	d0f5ba36fb	[cover] Improvements for small or homogeneous data * The algorithm would bail as soon as it found one epoch that contained no new segments. Change it so it now has to fail >= 10 times in a row (10 for fastcover, 10-100 for cover). * The algorithm uses the `maxDict` size to decide the epoch size. When this size is absurdly large, it causes tiny epochs. Lower bound the epoch size at 10x the segment size, and warn the user that their training set is too small. Fixes #1554	2019-03-22 14:14:46 -07:00
Nick Terrell	0c7668cd06	Merge pull request #1555 from terrelln/load-dict [lib] Allow ZSTD_CCtx_loadDictionary() to be called before parameters are set	2019-03-21 17:52:57 -07:00
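
The epoch-size rule described in the [cover] commit d0f5ba36fb above (derive the epoch size from the maximum dictionary size, but never let it drop below 10x the segment size) can be illustrated with a small C sketch. This is not the libzstd implementation: the type `epoch_plan`, the function `plan_epochs`, and all parameter names are hypothetical, and the exact way the number of epochs is derived from the dictionary size is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: how many epochs to run and how many
 * corpus positions each epoch covers. Not a libzstd type. */
typedef struct {
    size_t num;   /* number of epochs */
    size_t size;  /* positions per epoch */
} epoch_plan;

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }
static size_t max_size(size_t a, size_t b) { return a > b ? a : b; }

/*
 * Assumed inputs (illustrative names, not libzstd's):
 *   max_dict_size - requested dictionary size in bytes
 *   nb_positions  - number of candidate positions in the training corpus
 *   segment_size  - segment length in bytes
 *   passes        - how many sweeps over the corpus are wanted
 */
static epoch_plan plan_epochs(size_t max_dict_size, size_t nb_positions,
                              size_t segment_size, size_t passes)
{
    assert(segment_size > 0 && passes > 0 && nb_positions > 0);

    /* Lower bound from the commit message: 10x the segment size. */
    const size_t min_epoch_size = 10 * segment_size;
    epoch_plan plan;

    /* Assumption: one epoch per segment the dictionary can hold,
     * spread over the requested number of passes. */
    plan.num = max_size(1, max_dict_size / segment_size / passes);
    plan.size = nb_positions / plan.num;
    if (plan.size >= min_epoch_size) {
        return plan;
    }

    /* A very large max_dict_size yields tiny epochs: clamp the epoch
     * size (without exceeding the corpus) and recompute the count. */
    plan.size = min_size(min_epoch_size, nb_positions);
    plan.num = nb_positions / plan.size;
    return plan;
}
```

Under these assumptions, `plan_epochs(100000000, 10000, 100, 4)` would otherwise produce epochs of size 0; the clamp turns that into 10 epochs of 1000 positions each. A caller hitting the clamp is the case where the commit message says the user should be warned that the training set is too small.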


6565 Commits