Commit Graph

1302 Commits

Author SHA1 Message Date
Yann Collet
50b216146f
Merge pull request #1304 from facebook/largeNbDicts
contrib/largeNbDicts
2018-09-06 09:50:56 -07:00
Jennifer Liu
21721b75a3 Change default f to 20 2018-09-04 17:15:14 -07:00
Yann Collet
39c55a118f fixed minor compatibility issues with older compilers 2018-08-30 16:00:57 -07:00
Jennifer Liu
f87383507d Update comment about default dictionary builder 2018-08-30 15:46:39 -07:00
Yann Collet
4086b2871b largeNbDicts compatible with multiple source files
splitting is disabled by default, but can be re-enabled using usual command -B#
update commands to look like zstd ones
2018-08-30 14:38:49 -07:00
Yann Collet
0c66a44d1b first working test program
measures :
- compression ratio with / without dictionary
- create one dictionary per block
- memory budget for dictionaries
- decompression speed, using one different dictionary per block

current limitations :
- only one file
- 4K blocks only
- automatic dictionary built with 4K size

dictionary can be selected on command line, with -D
2018-08-28 15:47:07 -07:00
Yann Collet
b37a0a6bde
Merge pull request #1298 from facebook/bench
Refactored bench.c
2018-08-28 12:25:02 -07:00
Yann Collet
0491037db9 Merge branch 'bench' into largeNbDicts 2018-08-28 11:26:46 -07:00
Yann Collet
55affc09de timedFn : measurement delay is programmable
instead of hard-coded 1 second per measurement
2018-08-28 11:26:27 -07:00
Yann Collet
d97e92dfad Merge branch 'bench' into largeNbDicts 2018-08-27 12:12:51 -07:00
Yann Collet
01dcd0fd17 bench: minor api update, for consistency
BMK_benchTimedFn()
BMK_isCompleted_TimedFn() uses TimedFnState
2018-08-26 21:30:18 -07:00
Yann Collet
6782725155 first sketch for largeNbDicts test program 2018-08-26 19:29:12 -07:00
Yann Collet
c3a4baaf6e fixed minor warnings
valgrind: memory leak of a few bytes in fullbench
static analyzer: uninitialized data passed as result
2018-08-24 23:25:35 -07:00
Yann Collet
2279f3d127 bench: reduce nb of return type
runOutcome is enough
removed timedFnOutcome
2018-08-24 17:28:38 -07:00
Yann Collet
6ce7b08f17 fix minor warnings
gcc : prototype with 0 parameter must be labelled (void)
visual : const property must be identical in both declaration and implementation
2018-08-24 15:59:57 -07:00
Yann Collet
4da5bdf482 fixed zstd -b speed result
the benchmark was displaying the speed of last run
instead of the best of all previous runs.
2018-08-23 18:13:49 -07:00
Yann Collet
1f9ec13621 introduced MB_UNIT
so that all benchmarking programs use the same speed scale
2018-08-23 16:03:30 -07:00
Yann Collet
d39a25c5ed update fullbench.c to work with new bench.h 2018-08-23 15:00:09 -07:00
Yann Collet
2e45badff4 refactored bench.c
for clarity and safety, especially at interface level
2018-08-23 14:21:18 -07:00
Jennifer Liu
9d6ed9def3 Merge fastCover into DictBuilder (#1274)
* Minor fix

* Run non-optimize FASTCOVER 5 times in benchmark

* Merge fastCover into dictBuilder

* Fix mixed declaration issue

* Add fastcover to symbol.c

* Add fastCover.c and cover.h to build

* Change fastCover.c to fastcover.c

* Update benchmark to run FASTCOVER in dictBuilder

* Undo spliting fastcover_param into cover_param and f

* Remove convert param functions

* Assign f to parameter

* Add zdict.h to Makefile in lib

* Add cover.h to BUCK

* Cast 1 to U64 before shifting

* Remove trimming of zero freq head and tail in selectSegment and rebenchmark

* Remove f as a separate parameter of tryParam

* Read 8 bytes when d is 6

* Add trimming off zero frequency head and tail

* Use best functions from COVER and remove trimming part(which leads to worse compression ratio after previous bugs were fixed)

* Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary

* Change nbDmer to always read 8 bytes even when d=6

* Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER

* Update comments and benchmarking result

* Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover

* Add dictType enum and fix bug about passing zParam when converting to coverParam

* Combine finalize and skip into a single parameter

* Update acceleration parameters and benchmark on 3 sample sets

* Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets

* Initialize variables outside of for loop in benchmark.c

* Update benchmark result for hg-manifest

* Remove cover.h from install-includes

* Add explanation of f

* Set default compression level for trainFromBuffer to 3

* Add assertion of fastCoverParams in DiB_trainFromFiles

* Add checkTotalCompressedSize function + some minor fixes

* Add test for multithreading fastCovr

* Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unnlock to end of COVER_best_finish

* Free segmentFreqs

* Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment

* Add FASTCOVER_MEMMULT

* Minor fix

* Update benchmarking result
2018-08-23 12:06:20 -07:00
Yann Collet
77e805e3db bench: changed creation/reset function to timedFnState
for consistency
2018-08-21 18:19:27 -07:00
Yann Collet
801e3bcd97
Merge pull request #1290 from edenzik/ezik/1119-safe-strcpy-in-fileio
Fixed unsafe string copy and concat in `fileio.c`.
2018-08-21 13:18:44 -07:00
Eden Zik
78af534f82 Fixed unsafe string copy and concat in fileio.c.
Per warnings from flawfinder: "Does not check for buffer overflows when
copying to destination [MS-banned] (CWE-120). Consider using snprintf,
strcpy_s, or strlcpy (warning: strncpy easily misused).".

Replaced called to strcpy and strcat in `fileio.c` to calls with a
specified size (`strncpy` and `strncat`).

Tested the changes on OSX, Linux, Windows.
On OSX + Linux, changes were tested with ASAN. The following flags were
used: 'check_initialization_order=1:strict_init_order=1:detect_odr_violation=1:detect_stack_use_after_return=1'

To reproduce warning:
./flawfinder.py ./programs/fileio.c
2018-08-20 22:15:24 -04:00
Yann Collet
42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
George Lu
e89f1fb45c Fix scan-build warnings in bench.c 2018-08-14 14:44:47 -07:00
Yann Collet
973a8d42c7
Merge pull request #1236 from GeorgeLu97/paramgrillconstraints
ParamgrillConstraints
2018-08-13 15:44:50 -07:00
Yann Collet
754942cb79 fixed assert() condition 2018-08-09 15:57:19 -07:00
Yann Collet
79a35ac20d minor code comments improvements 2018-08-09 15:16:31 -07:00
Yann Collet
51e71a5ec7 added zstdgrep documentation
presenting `zstdgrep` limit regarding dictionary compression
with workaround recommended by @tobwen (#1268)
2018-08-09 12:28:25 -07:00
George Lu
bfe8392e23 Remove ctx from benchMem 2018-08-09 12:07:57 -07:00
George Lu
8278a49cb6 const srcPtrs 2018-08-09 10:42:58 -07:00
George Lu
3d230db853 Change speed representation from floating point to integral 2018-08-09 10:42:58 -07:00
George Lu
dd270b2f75 Renaming / Style fixes 2018-08-09 10:42:58 -07:00
George Lu
e148db366e Separate capacity vs size
Also:
Make suggested fixes
-varInds_t
-reorder some arguments
-remove code duplication
-update README / -h
-Fix memory leaks
2018-08-09 10:42:58 -07:00
George Lu
df026e159f Fix windows implicit casting bugs 2018-08-09 10:42:58 -07:00
George Lu
7b5b3d7ae3 BenchMem with block compressed sizes passed back up 2018-08-09 10:42:58 -07:00
George Lu
3adc217ea4 Total Changes:
Add different constraint types (decompression speed, compression memory, parameter constraints)
Separate search space by strategy + strategy selection
Memoize results
Real random restarts
Support multiple files
Support Dictionary inputs
Debug Macro for extra printing
2018-08-09 10:42:58 -07:00
George Lu
eb21b7f482 Not crashing 2018-08-09 10:42:58 -07:00
George Lu
5f49034520 Working V1 2018-08-09 10:42:58 -07:00
George Lu
cffb6da339 Parses additional parameters
Additional constraint checking

Minor fixes

more param parsing

Add Memory

Change paramVariation

work on feasibility

reformat bench

Changed Paramgrill to use bench.c benchmarking

customlevel macro

Printing Flag

Minor changes

Explicit casting

Makefile fix

casting, type fix

Printing Flag

Minor Changes

comments, helper fn's
2018-08-09 10:42:58 -07:00
Yann Collet
5808027abf Merge branch 'dev' into fix1241 2018-08-03 16:08:33 -07:00
Yann Collet
2fdab1629b fix unused variable warning 2018-08-03 08:30:01 -07:00
Yann Collet
5203f01774 fix : zstd cli can be built with build macro ZSTD_NOBENCH
which disables bench.c module
2018-08-03 07:54:29 -07:00
cyan4973
3f535007e4 fix %zu support under minGW
and relevant test on Appveyor
2018-07-30 16:56:18 +02:00
George Lu
09ccd977c3 no zero 2018-07-26 15:17:58 -07:00
Yann Collet
effa84c8d1
Merge pull request #1230 from terrelln/train-out
zstdcli: Allow -o before --train
2018-07-18 16:34:10 +02:00
Nick Terrell
4e706d7f2c fileio: Error in compression on read errors
We can write a corrupted file if the input file errors during a read.
We should return a non-zero error code in this case.
2018-07-17 15:26:30 -07:00
Nick Terrell
58b8219475 zstdcli: Allow -o before --train
Only set the default value if `outFileName` is unset.

Fixes #1227.
2018-07-16 12:45:34 -07:00
Nick Terrell
45821fac0c
Merge pull request #1225 from jennifermliu/dev
Split samples when building dictionary for COVER
2018-07-13 13:26:15 -07:00
Jennifer Liu
612b346ed5 Add explanation for split=100 2018-07-11 15:50:28 -07:00