for --patch-from origin
and --filelist list
Also : removed some constrained syntax tests,
as the new argument parsing syntax is more permissive.
For example :
zstd file -of dest
used to be disallowed.
It's now allowed, and understood as:
zstd file -o dest -f
Allow compression to use dictionaries with missing symbols in their
entropy tables. We set the FSE repeat mode to check when there are
missing symbols, and set the FSE repeat mode to valid when all symbols
are present.
Note that when not all symbols are present, the heuristics which favor
dictionary tables for lower compression levels won't activate.
Tested by manually creating a dictionary with missing symbols of every
type, and validing that the compressor rejects it before this change,
and accepts it after this change. Also, I ran the `dictionary_loader`
fuzzer for >1 hour of CPU time without running into cases where
compression succeeds, but decompression fails.
Fixes#2174.
* adding long support for patch-from
* adding refPrefix to dictionary_decompress
* adding refPrefix to dictionary_loader
* conversion nit
* triggering log mode on chainLog < fileLog and removing old threshold
* adding refPrefix to dictionary_round_trip
* adding docs
* adding enableldm + forceWindow test for dict
* separate patch-from logic into FIO_adjustParamsForPatchFromMode
* moving memLimit adjustment to outside ifdefs (need for decomp)
* removing refPrefix gate on dictionary_round_trip
* rebase on top of dev refPrefix change
* making sure refPrefx + ldm is < 1% of srcSize
* combining notes for patch-from
* moving memlimit logic inside fileio.c
* adding display for optimal parser and long mode trigger
* conversion nit
* fuzzer found heap-overflow fix
* another conversion nit
* moving FIO_adjustMemLimitForPatchFromMode outside ifndef
* making params immutable
* moving memLimit update before createDictBuffer call
* making maxSrcSize unsigned long long
* making dictSize and maxSrcSize params unsigned long long
* error on files larger than 4gb
* extend refPrefix test to include round trip
* conversion to size_t
* making sure ldm is at least 10x better
* removing break
* including zstd_compress_internal and removing redundant macros
* exposing ZSTD_cycleLog()
* using cycleLog instead of chainLog
* add some more docs about user optimizations
* formatting
playTests.sh didn't work when `ZSTD_BIN` or `DATAGEN_BIN` had a space in
the path name. This happens for me because I split the cmake build
directory by compiler name, like "Clang 9.0.0".
The fix is to replace all instances of `$ZSTD` with the `zstd()`
function, and the replace `$DATAGEN` with `datagen()`. This will allow
us to change how we call zstd/datagen in the future without having to
change every callsite.
* Adding new cli endpoint --diff-from=
* Appveyor conversion nit
* Using bool set trick instead of direct set
* Removing --diff-from and only leaving --diff-from=#
* Throwing error when both dictFileName vars are set
* Clean up syntax
* Renaming diff-from to patch-from
* Revering comma separated syntax clean up
* Updating playtests with patch-from
* Uncommenting accidentally commented
* Updating remaining docs and var names to be patch-from instead of diff-from
* Constifying
* Using existing log2 function and removing newly created one
* Argument order (moving prefs to end)
* Using comma separated syntax
* Moving to outside #ifndef