639 Commits

Author SHA1 Message Date
Yann Collet
31a0abbfda updated pzstd and largeNbDicts to use the new FileNamesTable* abstraction 2019-11-06 09:10:05 -08:00
Yann Collet
09b1844d9b
Merge pull request #1784 from bimbashrestha/fse_block_bound_err
Rearranging assert and allowing 4 extra for FSE_BLOCKBOUND()
2019-09-12 19:09:27 -07:00
Bimba Shrestha
43da5bf27e Rearranging assert and allowing 4 extra for FSE_BLOCKBOUND() 2019-09-12 14:43:50 -07:00
Carl Woffenden
88975e8c25 Minor: documented sizes smaller 2019-09-02 18:15:31 +02:00
Carl Woffenden
8ac29cc825 Correctness and tidy
Test compilation performed with warnings. Author and license added. Test for failing grep on ancient OSX versions. Replaced the test image with something less noisy (which compresses better).
2019-09-02 18:02:50 +02:00
Yann Collet
64102f08da Merge branch 'dev' into decTest 2019-08-29 09:48:12 -07:00
Carl Woffenden
72e51ac246 C99 and older GCC fixes 2019-08-29 11:16:57 +02:00
Yann Collet
4b3a8fe1c4 fix create_ script for sh 2019-08-28 13:23:48 -07:00
Yann Collet
9589e8e4bb
Merge pull request #1749 from facebook/rmadapt
removed adaptive-compression
2019-08-28 12:26:29 -07:00
Yann Collet
8af941d2d7 Merge branch 'dev' into decTest 2019-08-28 12:17:29 -07:00
Carl Woffenden
cdf73e915e Rewrote the scripts to sh instead of bash 2019-08-28 19:20:42 +02:00
Yann Collet
f61e8a231f minor script renaming, for clarity 2019-08-27 16:01:39 -07:00
Yann Collet
517aeb89dc changed contrib project name for clarity 2019-08-27 15:50:47 -07:00
Yann Collet
5ed1b1e11d removed adaptive-compression
the functionality is already integrated into `zstd` through `--adapt` command
2019-08-27 14:47:40 -07:00
Carl Woffenden
51868964ef Fixed test failure when Emscripten not present 2019-08-27 17:12:57 +02:00
Carl Woffenden
6213b7b3b4 Minor repetition 2019-08-27 16:57:23 +02:00
Carl Woffenden
59052d5fd8 Typo 2019-08-27 16:55:03 +02:00
Carl Woffenden
ec12721538 Added clarification 2019-08-27 15:53:26 +02:00
Carl Woffenden
6712a644fa Added reasoning 2019-08-27 15:51:14 +02:00
Carl Woffenden
4f2a8b752a Typo 2019-08-27 15:38:34 +02:00
Carl Woffenden
a57de4ac89 Added test script; tidied and documented
The test script combines the sources then builds and runs an example. A further example is built if the Emscripten compiler is available on the system. Documentation covers building.
2019-08-27 15:36:06 +02:00
Carl Woffenden
7c6fa81579 Added Emscripten example, removed Buck, minor tidy
Work-in-progress. Added simple Emscripten WebGL example that adds 25kB when built with Zstd. Removed Buck (will replace). Minor correctness.
2019-08-26 21:28:19 +02:00
Carl Woffenden
ea8f6d2a07 Able to test combine script; minor tidy 2019-08-26 07:48:57 +02:00
Carl Woffenden
d760e35ebc Preparing to run tests
Combine script more robust and can output to a specified file. Initial buck files added (work in progress).
2019-08-25 22:49:01 +02:00
Carl Woffenden
36a59336da Minor fix for files with spaces. Typo. 2019-08-23 23:09:13 +02:00
Carl Woffenden
0a49353a46 Added generator script and simple test
The script will combine decompressor sources into a single file. The example shows this in use.
2019-08-23 18:43:29 +02:00
Felix Handte
2314906b68
Merge pull request #1699 from felixhandte/seekable-gitignore
Add New Seekable Compression Example to .gitignore
2019-07-24 19:07:55 -04:00
Yann Collet
0d38ee3c30
Merge pull request #1690 from piguin/dev
fix compiling errors with clang-8
2019-07-24 15:37:05 -07:00
W. Felix Handte
15da57820d Add New Seekable Compression Example to .gitignore 2019-07-24 18:22:20 -04:00
Sean Purcell
671d533ea7 Fix seekable decompression in-memory api 2019-07-21 23:22:25 -04:00
Qin Li
04a9d6b828 fix compiling errors with clang-8
Compiling with clang-8 fails with the following errors:

largeNbDicts.c:562:37: error: implicit conversion turns floating-point
number into integer: 'const double' to 'U64' (aka 'unsigned long')
[-Werror,-Wfloat-conversion]
        U64 const dTime_ns = result.nanoSecPerRun;
                  ~~~~~~~~   ~~~~~~~^~~~~~~~~~~~~

zstdcli.c:300:5: error: '@return' command used in a comment that is
not attached to a function or method declaration
[-Werror,-Wdocumentation]
 * @return 1 means that cover parameters were correct
   ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zstdcli.c:301:5: error: '@return' command used in a comment that is
not attached to a function or method declaration
[-Werror,-Wdocumentation]
 * @return 0 in case of malformed parameters
   ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2019-07-18 19:41:00 -07:00
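The first error above comes from assigning a `double` to an integer type without saying that truncation is intended. A minimal sketch of one common remedy follows, assuming an explicit cast is acceptable; `toNanoseconds` is a hypothetical helper, the `U64` typedef is a stand-in for the one in zstd, and this is not necessarily the change made in the commit.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t U64;   /* stand-in for zstd's own U64 typedef (assumption) */

/* Hypothetical helper: convert a floating-point duration to whole nanoseconds.
 * The explicit cast documents that truncation is intended, which is what
 * -Wfloat-conversion asks the programmer to state. */
static U64 toNanoseconds(double nanoSecPerRun)
{
    assert(nanoSecPerRun >= 0.0);   /* a negative duration has no U64 equivalent */
    return (U64)nanoSecPerRun;      /* truncates toward zero */
}
```

With this helper, `U64 const dTime_ns = toNanoseconds(result.nanoSecPerRun);` would compile without the warning, since the conversion is no longer implicit.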
Josh Soref
a880ca239b Spelling (#1582)
* spelling: accidentally

* spelling: across

* spelling: additionally

* spelling: addresses

* spelling: appropriate

* spelling: assumed

* spelling: available

* spelling: builder

* spelling: capacity

* spelling: compiler

* spelling: compressibility

* spelling: compressor

* spelling: compression

* spelling: contract

* spelling: convenience

* spelling: decompress

* spelling: description

* spelling: deflate

* spelling: deterministically

* spelling: dictionary

* spelling: display

* spelling: eliminate

* spelling: preemptively

* spelling: exclude

* spelling: failure

* spelling: independence

* spelling: independent

* spelling: intentionally

* spelling: matching

* spelling: maximum

* spelling: meaning

* spelling: mishandled

* spelling: memory

* spelling: occasionally

* spelling: occurrence

* spelling: official

* spelling: offsets

* spelling: original

* spelling: output

* spelling: overflow

* spelling: overridden

* spelling: parameter

* spelling: performance

* spelling: probability

* spelling: receives

* spelling: redundant

* spelling: recompression

* spelling: resources

* spelling: sanity

* spelling: segment

* spelling: series

* spelling: specified

* spelling: specify

* spelling: subtracted

* spelling: successful

* spelling: return

* spelling: translation

* spelling: update

* spelling: unrelated

* spelling: useless

* spelling: variables

* spelling: variety

* spelling: verbatim

* spelling: verification

* spelling: visited

* spelling: warming

* spelling: workers

* spelling: with
2019-04-12 11:18:11 -07:00
Yann Collet
59a7116cc2 benchfn dependencies reduced to only timefn
benchfn used to rely on mem.h and util,
which in turn relied on platform.h.
Using benchfn outside of zstd required bringing in all these dependencies.

Now, the dependency is reduced to timefn only.
This required creating a separate timefn from util,
and rewriting benchfn and timefn to no longer need mem.h.

Separating timefn from util has a wide effect across the code base,
as usage of time functions is widespread.
A lot of build scripts had to be updated to also include timefn.
2019-04-10 12:37:03 -07:00
Peter (Stig) Edwards
4a9e0502e6
-Wformat-security not needed with -Wformat=2 2019-02-01 09:29:08 +00:00
Peter (Stig) Edwards
2b7120ec71
-Wformat-security not needed with -Wformat=2 2019-02-01 09:28:41 +00:00
Dmitry V. Levin
8b2210411a contrib/pzstd/Makefile: fix build of tests
Apparently, Options.o cannot be linked in without $(PROGDIR)/util.o
2018-12-28 19:02:22 +00:00
Yann Collet
ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

The vast majority of transformations consist of changing U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among the issues this patch solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller uses a pointer to U32 instead.

These fixes are useful.

However, the bulk of changes is about %u formatting,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributor will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
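The `%u` point in the message above can be illustrated with a short sketch. It assumes `U32` is a typedef of `uint32_t` and that `unsigned` is at least 32 bits wide; `formatCount` is a hypothetical helper written for this illustration, not a zstd function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t U32;   /* stand-in for zstd's own U32 typedef (assumption) */

/* Hypothetical helper: print a U32 counter with %u.
 * %u expects an `unsigned` argument. On a target where uint32_t is
 * `unsigned long`, passing a U32 directly is a type mismatch, so the
 * value is cast to `unsigned` at the call site. */
static int formatCount(char* dst, size_t dstCapacity, U32 count)
{
    return snprintf(dst, dstCapacity, "%u", (unsigned)count);
}
```

The alternative described in the message is to declare the variable as `unsigned` in the first place, which removes the need for the cast.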
Yann Collet
34f01e600f fixed multiple conversions
from 64-bit to 32-bit
2018-12-13 14:02:22 -08:00
Yann Collet
9c3265a53f
Merge pull request #1417 from facebook/advancedAPI
Advanced API
2018-12-10 18:48:15 -08:00
Yann Collet
3583d19c4e changed parameter names from ZSTD_p_* to ZSTD_c_*
for naming consistency
2018-12-05 17:26:02 -08:00
Lzu Tao
beb13bd87e Move contrib/meson to build/meson 2018-12-01 23:18:59 +07:00
Lzu Tao
c0e71cae55 Add enable_lz4 build option and fix lzma dependency 2018-12-01 23:18:59 +07:00
Lzu Tao
5c4965c351 Add pedantic flag 2018-12-01 23:18:59 +07:00
Lzu Tao
6f3f1a8d3a No install zstd_manual.html 2018-12-01 23:18:59 +07:00
Lzu Tao
f660825d9f Install missed zstdgrep and zstdless 2018-12-01 23:18:59 +07:00
Lzu Tao
3f27e2a072 Install zstdmt.1 manpage [skip ci] 2018-12-01 23:18:59 +07:00
Lzu Tao
d3134a3ed3 Rename meson variables 2018-12-01 23:18:59 +07:00
Lzu Tao
1985e427c7 Add manpage install warning [skip ci]
We link new manpages with gz compressed format of the target manpage.
I have not tested it on Windows. So just place a warning here.
2018-12-01 23:18:59 +07:00
Lzu Tao
9c862c6a53 Fix manpage symlinks [skip ci] 2018-12-01 23:18:59 +07:00
Lzu Tao
d79df2a370 Apply new InstallSymlink script 2018-12-01 23:18:59 +07:00
Lzu Tao
ef2e761937 Helper script to install symlink in meson 2018-12-01 23:18:59 +07:00
Lzu Tao
3175188407 No need these helpers 2018-12-01 23:18:59 +07:00
Lzu Tao
337f914dc8 Fix lib soversion and no install cover.h header 2018-12-01 23:18:59 +07:00
Lzu Tao
c9f0144302 Fix meson tests build 2018-12-01 23:18:59 +07:00
Lzu Tao
5a36a57cf5 Bump to 1.3.8 and fix run_command function
The run_command is run from an unspecified directory. Therefore we cannot assume
which directory our command runs in.
2018-12-01 23:18:59 +07:00
Lzu Tao
8a160680d1 Update legacy support to 5 2018-12-01 23:18:59 +07:00
Lzu Tao
f727808731 Minor fix for meson build
Use files function instead of constructing path with meson.current_source_dir()
2018-12-01 23:18:59 +07:00
Lzu Tao
9a721e5216 Update meson build system
NOTE: This commit was only tested on Linux (Ubuntu 18.04). Windows
build may not work as expected.

* Use meson >= 0.47.0 because we use install_man function
* Add three helper Python scripts:
  * CopyFile.py: To copy file
  * CreateSymlink.py: To make symlink (both Windows and Unix)
  * GetZstdLibraryVersion.py: Parse lib/zstd.h to get zstd version
  These help emulate equivalent functions in CMake and Makefile.
* Use subdir from meson to split meson.build
  * Add contrib build
  * Fix other build
* Add new build options
  * build_programs: Enable programs build
  * build_contrib: Enable contrib build
  * build_tests: Enable tests build
  * use_static_runtime: Link to static run-time libraries on MSVC
  * zlib_support: Enable zlib support
  * lzma_support: Enable lzma support
2018-11-28 01:08:34 +07:00
Lzu Tao
9bd8f6a00c Rename and update build instruction in README file to README.md 2018-11-28 01:08:34 +07:00
Lzu Tao
2abd5139a5 Add meson build guide 2018-11-28 01:08:34 +07:00
Yann Collet
5adbad4059 Merge branch 'dev' into advancedAPI 2018-11-14 13:00:37 -08:00
Yann Collet
b83d1e7714 removed some static const variables
and replaced by traditional macro constants.

Unfortunately, C doesn't consider `static const` to mean "constant"
2018-11-13 16:56:32 -08:00
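The remark about `static const` can be shown with a small sketch. All names below are hypothetical and chosen for illustration; they are not the variables touched by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* A read-only object. Under the C standard this is not an integer constant
 * expression, so it cannot size a file-scope array or label a `case`. */
static const size_t kBlockSizeVar = 4096;

/* A macro expands to a literal, which is a true constant expression. */
#define BLOCK_SIZE 4096

/* static char g_bad[kBlockSizeVar];   a conforming C compiler may reject this */
static char g_block[BLOCK_SIZE];       /* accepted: the size is a constant */

/* Hypothetical accessor returning the capacity of the static buffer. */
static size_t blockCapacity(void)
{
    assert(kBlockSizeVar == BLOCK_SIZE);   /* same value at run time */
    return sizeof(g_block);
}
```

In C++ the commented-out declaration would be valid, which is one reason the distinction is easy to overlook.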
Yann Collet
b830ccca5c changed benchfn api
to use structure for function parameters
as it is much clearer than a long list of parameters,
since each parameter can now be named.
2018-11-13 13:12:50 -08:00
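A minimal sketch of the idea, passing parameters as a structure so that each one is named at the call site. `RunParams`, `runLoops`, `countBytes`, and `demoRun` are hypothetical names invented for this illustration and do not reproduce the actual benchfn interface.

```c
#include <assert.h>
#include <stddef.h>

/* Signature of the function under test (hypothetical). */
typedef size_t (*work_fn)(const void* src, size_t srcSize, void* payload);

/* Hypothetical parameter structure: every argument has a name. */
typedef struct {
    work_fn     fn;        /* function to run */
    void*       payload;   /* opaque state passed through to fn */
    const void* src;       /* input buffer */
    size_t      srcSize;   /* input size in bytes */
    unsigned    nbLoops;   /* number of repetitions */
} RunParams;

/* Runs p.fn nbLoops times and returns the last result. */
static size_t runLoops(RunParams p)
{
    size_t last = 0;
    unsigned n;
    assert(p.fn != NULL);
    for (n = 0; n < p.nbLoops; n++)
        last = p.fn(p.src, p.srcSize, p.payload);
    return last;
}

/* Trivial workload: reports the input size. */
static size_t countBytes(const void* src, size_t srcSize, void* payload)
{
    (void)src; (void)payload;
    return srcSize;
}

/* Call site: C99 designated initializers name each parameter;
 * omitted members (payload here) are zero-initialized. */
static size_t demoRun(void)
{
    RunParams p = { .fn = countBytes, .src = "abc", .srcSize = 3, .nbLoops = 2 };
    return runLoops(p);
}
```

Compared with a positional call such as `runLoops(countBytes, NULL, "abc", 3, 2)`, the structure form makes it harder to swap two arguments of compatible type by accident.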
Yann Collet
d38063f8ae separated bench module into benchfn and benchzstd
it shall be possible to use benchfn
without any dependency on zstd.
2018-11-13 11:01:59 -08:00
Yann Collet
483759a3de Improves decompression speed when using cold dictionary
by triggering the prefetching decoder path
(which used to be dedicated to long-range offsets only).

Figures on my laptop :
no content prefetch : ~300 MB/s (for reference)
full content prefetch : ~325 MB/s (before this patch)
new prefetch path : ~375 MB/s (after this patch)

The benchmark speed is already significant,
but another side-effect is that this version
prefetches less data into memory,
since it only prefetches what's needed, instead of the full dictionary.

This is supposed to help highly active environments
such as active databases,
that can't be properly measured in benchmark environment (too clean).

Also :
fixed the largeNbDict test program
which was working improperly when setting nbBlocks > nbFiles.
2018-11-08 17:00:23 -08:00
Rohit Jain
705e0b18ab Making changes to make it compile on my laptop 2018-10-11 15:51:57 -07:00
Yann Collet
123fac6b6d fix pzstd compatibility with mingw
some details changed with introduction of gcc7
2018-09-21 17:36:00 -07:00
Yann Collet
00ce26725b
Merge pull request #1324 from ko-zu/fixclangcode
Fix largeNbDicts bench for clangbuild
2018-09-17 14:10:17 -07:00
Nick Terrell
8f27e8cf3d
Merge pull request #1322 from azat-archive/seekable-fixes-pull
Fixes read write past end of input buffer.
2018-09-17 11:04:51 -07:00
ko-zu
b053bec2f4 Fix largeNbDicts bench for clangbuild
Remove unsigned to size_t promotion to fix implicit down conversion errors in clangbuild target.
2018-09-17 13:09:08 +09:00
Azat Khuzhin
d707692e05
seekable_decompression: support offset greater than UINT_MAX 2018-09-16 18:05:32 +03:00
Azat Khuzhin
b52867a97f
zstdseek_decompress: fix decompression with data left in input buffer 2018-09-16 18:05:32 +03:00
Yann Collet
c49ccbc8e7 largeNbDicts : can select a nb of blocks
will automatically truncate or repeat input as needed,
to create the requested nb of blocks.
default: nb of files, possibly increased appropriately if blockSize is set
2018-09-12 11:31:28 -07:00
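One simple way to "truncate or repeat" an input set to reach a requested block count is modular indexing. This is a sketch of that general technique, not a description of how largeNbDicts implements it; `sourceIndexForBlock` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mapping from a requested block index to an available source.
 * If fewer blocks than sources are requested, only the first ones are used
 * (truncation). If more are requested, the index wraps around and sources
 * are reused (repetition). */
static size_t sourceIndexForBlock(size_t blockIdx, size_t nbSources)
{
    assert(nbSources > 0);          /* at least one source is required */
    return blockIdx % nbSources;
}
```

With 3 sources and 5 requested blocks, the blocks would draw from sources 0, 1, 2, 0, 1 in that order.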
Yann Collet
50b216146f
Merge pull request #1304 from facebook/largeNbDicts
contrib/largeNbDicts
2018-09-06 09:50:56 -07:00
Yann Collet
c57a856d64 fixed minor static analyzer warning 2018-09-05 14:33:51 -07:00
Yann Collet
1d487d587f updated documentation 2018-09-04 14:57:45 -07:00
Yann Collet
11b8b8c100 silenced false-positive scan-build warning 2018-08-31 10:01:06 -07:00
Yann Collet
0ff67511e6 fixed link order for old compilers 2018-08-30 16:43:28 -07:00
Yann Collet
f76253bb70 minor : createDictionaryBuffer() can create dictionaries of different sizes 2018-08-30 16:24:44 -07:00
Yann Collet
39c55a118f fixed minor compatibility issues with older compilers 2018-08-30 16:00:57 -07:00
Yann Collet
39ef91a599 -std=c99 for largeNbDicts 2018-08-30 14:59:23 -07:00
Yann Collet
4086b2871b largeNbDicts compatible with multiple source files
splitting is disabled by default, but can be re-enabled using usual command -B#
update commands to look like zstd ones
2018-08-30 14:38:49 -07:00
Yann Collet
a5a77965d3 make all includes contrib/largeNbDicts 2018-08-29 16:17:22 -07:00
Yann Collet
d89fa814c1 added a README
for documentation
2018-08-28 18:19:19 -07:00
Yann Collet
6444c50035 increases randomness of ddict ptrs 2018-08-28 18:13:46 -07:00
Yann Collet
6c398df241 level, block size and nb dicts can be set on command line 2018-08-28 18:05:31 -07:00
Yann Collet
0c66a44d1b first working test program
measures :
- compression ratio with / without dictionary
- create one dictionary per block
- memory budget for dictionaries
- decompression speed, using one different dictionary per block

current limitations :
- only one file
- 4K blocks only
- automatic dictionary built with 4K size

dictionary can be selected on command line, with -D
2018-08-28 15:47:07 -07:00
Yann Collet
274b60e6e6 largeNbDicts can compress and compare dict vs noDict 2018-08-27 17:08:44 -07:00
Yann Collet
6782725155 first sketch for largeNbDicts test program 2018-08-26 19:29:12 -07:00
Jennifer Liu
9d6ed9def3 Merge fastCover into DictBuilder (#1274)
* Minor fix

* Run non-optimize FASTCOVER 5 times in benchmark

* Merge fastCover into dictBuilder

* Fix mixed declaration issue

* Add fastcover to symbol.c

* Add fastCover.c and cover.h to build

* Change fastCover.c to fastcover.c

* Update benchmark to run FASTCOVER in dictBuilder

* Undo splitting fastcover_param into cover_param and f

* Remove convert param functions

* Assign f to parameter

* Add zdict.h to Makefile in lib

* Add cover.h to BUCK

* Cast 1 to U64 before shifting

* Remove trimming of zero freq head and tail in selectSegment and rebenchmark

* Remove f as a separate parameter of tryParam

* Read 8 bytes when d is 6

* Add trimming off zero frequency head and tail

* Use best functions from COVER and remove trimming part (which leads to worse compression ratio after previous bugs were fixed)

* Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary

* Change nbDmer to always read 8 bytes even when d=6

* Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER

* Update comments and benchmarking result

* Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover

* Add dictType enum and fix bug about passing zParam when converting to coverParam

* Combine finalize and skip into a single parameter

* Update acceleration parameters and benchmark on 3 sample sets

* Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets

* Initialize variables outside of for loop in benchmark.c

* Update benchmark result for hg-manifest

* Remove cover.h from install-includes

* Add explanation of f

* Set default compression level for trainFromBuffer to 3

* Add assertion of fastCoverParams in DiB_trainFromFiles

* Add checkTotalCompressedSize function + some minor fixes

* Add test for multithreading fastCover

* Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unlock to end of COVER_best_finish

* Free segmentFreqs

* Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment

* Add FASTCOVER_MEMMULT

* Minor fix

* Update benchmarking result
2018-08-23 12:06:20 -07:00
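The item "Cast 1 to U64 before shifting" in the list above refers to a common C pitfall: the literal `1` has type `int`, so `1 << n` is computed in `int` width and is undefined once `n` reaches that width. A minimal sketch, with `tableSize` as a hypothetical helper and `U64` as a stand-in typedef:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t U64;   /* stand-in for zstd's own U64 typedef (assumption) */

/* Hypothetical helper: number of entries in a table of 2^log2 slots.
 * Casting the 1 first makes the shift happen in 64-bit arithmetic. */
static U64 tableSize(unsigned log2)
{
    assert(log2 < 64);          /* shifting by >= 64 is undefined for U64 */
    return (U64)1 << log2;
}
```

Writing `(U64)(1 << log2)` instead would not help, because the cast would then apply after the shift has already been evaluated as an `int`.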
Yann Collet
36d6165a2d Makefile: added variable SCANBUILD
so that a different version of scan-build can be selected
2018-08-16 16:44:13 -07:00
Yann Collet
42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
Jennifer Liu
0acb0abd1e Add non-optimize FASTCOVER (#1260)
* Add non-optimize FASTCOVER

* Minor fix

* Pass param as value instead of pointer
2018-08-01 11:06:16 -07:00
Jennifer Liu
4e29bc2469 Use CDict instead of CCtx in analyzeEntropy 2018-07-31 10:36:45 -07:00
Jennifer Liu
31229e527b Increment frequency for every dmer occurrence within same sample instead of at most once per sample 2018-07-30 12:54:22 -07:00
Jennifer Liu
51b109c1b5 Delete old benchmarking result 2018-07-27 17:31:33 -07:00
Jennifer Liu
53ef22a4bc Undo deleting clean in make 2018-07-27 16:56:50 -07:00
Jennifer Liu
96d84ee235 Revert test.sh 2018-07-27 16:54:05 -07:00
Jennifer Liu
61262f6c0d Save segmentFreqs in ctx instead of malloc and memset in SelectSegment 2018-07-27 16:51:38 -07:00
Jennifer Liu
49b398e93f Use same param after optimizing cover and fastCover and record k and d for benchmarking 2018-07-27 13:39:19 -07:00