Measures:
- compression ratio with / without dictionary
- creation of one dictionary per block
- memory budget for dictionaries
- decompression speed, using a different dictionary per block

Current limitations:
- only one file
- 4K blocks only
- automatic dictionary built with 4K size

The dictionary can be selected on the command line, with -D.
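For illustration only, the intended usage might look like the following (the binary and file names are placeholders, not taken from the repository; only the -D flag comes from the notes above):

    ./benchTool -D dictionary.zdict inputFile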
Add different constraint types (decompression speed, compression memory, parameter constraints)
Separate search space by strategy + strategy selection
Memoize results
Real random restarts
Support multiple files
Support Dictionary inputs
Debug Macro for extra printing
Additional constraint checking
Minor fixes
more param parsing
Add Memory
Change paramVariation
work on feasibility
reformat bench
Changed Paramgrill to use bench.c benchmarking
customlevel macro
Printing Flag
Minor changes
Explicit casting
Makefile fix
casting, type fix
Printing Flag
Minor Changes
Comments, helper functions
Separate syntheticTest and fileTableTest (now renamed benchFiles)
Add incremental display to benchMem
Change benchFunction to use only iterMode
Make the synthetic test's compressibility configurable from the CLI (using -P#)
-Remove global variables
-Remove gv setting functions
-Add advancedParams struct
-Add defaultAdvancedParams();
-Change return type of benchFiles
-Change cli to use new interface
-Change error returns to use their own struct value
-Change default compression benchmark to use decompress_generic
-Add CustomBench function
-Add Documentation for new functions
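As a rough sketch of the shape described above (all struct fields and signatures here are hypothetical illustrations, not the actual bench.h declarations):

    #include <stddef.h>

    /* Hypothetical sketch of an advanced-parameters benchmark interface.
     * Names and fields are illustrative only, not the real bench.h API. */
    typedef struct {
        unsigned nbSeconds;       /* duration of each measurement round */
        size_t   blockSize;       /* cut input into independent blocks of this size */
        int      decodeOnlyMode;  /* benchmark decompression speed only */
    } advancedParams_t;

    typedef struct {
        int    errorCode;  /* errors reported in the result struct, not via globals */
        double cSpeed;     /* compression speed, MB/s */
        double dSpeed;     /* decompression speed, MB/s */
        double ratio;      /* compression ratio */
    } benchResult_t;

    /* returns a parameter set filled with default values */
    advancedParams_t defaultAdvancedParams(void);

    /* benchmarks a list of files with explicit parameters (no global state) */
    benchResult_t benchFiles(const char* const* fileNames, unsigned nbFiles,
                             int cLevel, const advancedParams_t* params);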
- do not test level 0, as it is converted into level 3,
which feels strange when compressing at multiple levels
- Use direct synchronous mode when a single worker is requested.
Access negative compression levels from the command line,
for both compression and benchmark modes.
Also: ensure proper propagation of parameters
through the ZSTD_compress_generic() interface.
Added relevant CLI tests.
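As a minimal sketch of setting a negative level through the advanced parameter API (shown with the current stable names ZSTD_CCtx_setParameter() / ZSTD_compress2(), which postdate the ZSTD_compress_generic() naming mentioned above; error handling omitted):

    #include <zstd.h>

    /* Sketch: compress src at negative level -5 through the advanced API. */
    size_t compressFast(void* dst, size_t dstCapacity,
                        const void* src, size_t srcSize)
    {
        ZSTD_CCtx* const cctx = ZSTD_createCCtx();
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, -5);  /* negative = faster, lower ratio */
        size_t const written = ZSTD_compress2(cctx, dst, dstCapacity, src, srcSize);
        ZSTD_freeCCtx(cctx);
        return written;
    }

On the command line, current zstd releases expose negative levels through the --fast=# switch (for example, zstd --fast=5 file).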
The zstd bench module can now focus on decompression speed _only_.
This is useful when measuring performance
on large input data compressed at a high level,
as compression time becomes problematic (too long).
This mode is triggered by the command: zstd -b -d
The problem was: in this mode,
the measured decoding speed was > 10% slower
than in nominal mode (compression + decompression),
making the decompression benchmark mode much less useful.
This patch fixes the issue.
It's not completely clear why, but
moving the `memcpy()` operation earlier in the pipeline fixed it.
I can still measure some difference, but it is in the < 2% range,
so it's much more tolerable.
Also: it no longer matters in which order
the `-b` and `-d` commands are given.
The combination always triggers bench_decodeOnly mode.
Reduce timing overhead by invoking time() once per batch,
instead of once per compression / decompression run.
The batch is dynamically resized so that each round lasts approximately 1 second.
Also: increase timer resolution to the nanosecond.
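A minimal sketch of the batch-timing idea, in generic C using POSIX clock_gettime() (not the actual bench.c code; the function and variable names are made up for illustration):

    #include <time.h>

    /* Calls `work` nbLoops times per clock read, and doubles nbLoops until a
     * whole batch lasts about 1 second; returns average nanoseconds per call. */
    static double benchLoop(void (*work)(void* payload), void* payload)
    {
        unsigned nbLoops = 1;
        for (;;) {
            struct timespec begin, end;
            clock_gettime(CLOCK_MONOTONIC, &begin);
            for (unsigned i = 0; i < nbLoops; i++) work(payload);  /* one batch, one timing window */
            clock_gettime(CLOCK_MONOTONIC, &end);
            long long const ns = (long long)(end.tv_sec - begin.tv_sec) * 1000000000LL
                               + (long long)(end.tv_nsec - begin.tv_nsec);
            if (ns >= 1000000000LL) return (double)ns / nbLoops;   /* round lasted ~1 second */
            nbLoops *= 2;                                          /* otherwise grow the batch */
        }
    }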
This makes it easier to explain that nbWorkers=0 --> single-threaded mode,
while nbWorkers=1 --> asynchronous mode (one worker thread on top of the "main" caller thread).
No need for an additional asynchronous mode flag.
nbWorkers>=2 works the same as nbThreads>=2 previously.
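For reference, a minimal sketch of how the three cases look through the public parameter API (assuming the ZSTD_c_nbWorkers name from current headers; values >= 1 require a library built with multithreading support):

    #include <zstd.h>

    void configureWorkers(ZSTD_CCtx* cctx)
    {
        /* nbWorkers = 0 : single-threaded, compression runs in the calling thread */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 0);

        /* nbWorkers = 1 : asynchronous, one worker thread on top of the caller */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 1);

        /* nbWorkers >= 2 : multi-threaded, same as nbThreads >= 2 previously */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 2);
    }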
There was a flaw in the formula
which compared literal cost with match cost:
at a given position,
a non-empty run of literals is going to be part of the next sequence,
while if the position ends a previous match and immediately starts another match,
the next sequence will have a litLength of zero.
A litLength of zero has a non-zero cost.
It follows that the literal cost should be compared to the match cost + the cost of litLength==0.
Not doing so gave a structural advantage to matches, which would be selected more often.
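In pseudocode form, the corrected comparison looks like this (the price* values are hypothetical stand-ins for the parser's internal price estimation, not actual zstd identifiers):

    /* Illustrative only: decide between extending the literal run or
     * starting a match at the current position. The match path must also
     * pay the cost of encoding litLength == 0 for its sequence. */
    typedef unsigned cost_t;

    static int preferLiterals(cost_t priceLiterals,    /* cost of the literal run */
                              cost_t priceMatch,       /* cost of the candidate match */
                              cost_t priceLitLength0)  /* cost of encoding litLength == 0 */
    {
        return priceLiterals < priceMatch + priceLitLength0;
    }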
I believe that's what led to the creation of the strange heuristic which added a complex cost to matches.
The heuristic was actually compensating for this bias.
It was probably created through multiple trials, settling on the best outcome for a given scenario (I suspect silesia.tar).
The problem with this heuristic is that it's hard to understand,
and unfortunately, any future change in the parser would impact the way it should be calculated and its effects.
The "proper" formula makes it possible to remove this heuristic.
Now, the problem is: in a head-to-head comparison, it's sometimes better, sometimes worse.
Note that all differences are small (< 0.01 ratio).
In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7).
I suspect that's because starting statistics are pretty poor (another area of improvement).
However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even the compression level has an impact ...).
It's a pity that zstd -22 gets worse on silesia.tar.
That being said, I like that the new code gets rid of strange variables,
which were introducing complexity for any future evolution (with faster variants in mind).
Therefore, in spite of this detrimental side effect, I tend to be in favor of it.