The problem was already masked,
due to no longer accepting tiny blocks for statistics.
But in case it could still happen with not-so-tiny blocks,
there is a stricter control which ensures that
nothing was already loaded prior to statistics collection.
depending on initialization,
the first byte of a new frame was invalidated or not.
As a consequence, one match opportunity was available or not,
resulting in slightly different compressed sizes
(on average, 1 or 2 bytes once every 20 frames).
It impacted ratio comparison between one-shot and streaming modes.
This fix makes the first byte of a new frame always a valid match.
Now compressed size is always the same.
It also improves compressed size by a negligible amount.
Compare the input and output files by their inode number and
refuse to open the output file if the input file is the same.
This doesn't work when (de)compressing multiple files to a single
file, but that is a very uncommon use case, mostly used for
benchmarking by me.
Fixes#1422.
* Fix `ZSTD_estimateCCtxSize()` with negative levels.
* Fix `ZSTD_estimateCStreamSize()` with negative levels.
* Add a unit test to test for this error.
When srcSize is small,
the nb of symbols produced is likely too small to warrant dedicated probability tables.
In which case, predefined distribution tables will be used instead.
There is a cheap algorithm in btultra initialization :
it presumes default distribution will be used if srcSize <= 1024.
btultra2 now uses the same threshold to shut down probability estimation,
since measured frequencies won't be used at entropy stage,
and therefore relying on them to determine sequence cost is misleading,
resulting in worse compression ratios.
This fixes btultra2 performance issue on very small input.
Note that, a proper way should be
to determine which symbol is going to use predefined probaility
and which symbol is going to use dynamic ones.
But the current algorithm is unable to make a "per-symbol" decision.
So this will require significant modifications.
The `--no-progress` flag disables zstd's progress bars, but leaves
the summary.
I've added simple tests to `playTests.sh` to make sure the parsing
works.
When we switched `ZSTD_SKIPPABLEHEADERSIZE` to a macro, the places where we do:
MEM_readLE32(ptr) + ZSTD_SKIPPABLEHEADERSIZE
can now overflow `(unsigned)-8` to `0` and we infinite loop. We now check
the frame size and reject sizes that overflow a U32.
Note that this bug never made it into a release, and was only in the dev branch
for a few days.
Credit to OSS-Fuzz