When ZSTD_decompressStream() detects
that there is enough space in dst
to complete decompression in a single pass,
delegates to ZSTD_decompress(),
for an extra ~5% speed boost
There used to be a (very small) chance that
loading prefix from previous segment
would be confused with a real zstd dictionary.
For that to happen, the prefix needs to start
with the same value as dictionary magic.
That's 1 chance in 4 billions if all values have equal probability.
But in fact, since some values are more common (0x00000000 for example)
others are less common, and dictionary magic was selected to be one of them,
so probabilities are likely even lower.
Anyway, this risk is no down to zero
by adding a new CCtx parameter : ZSTD_p_forceRawDict
Current parameter policy : the parameter "stick" to its CCtx,
so any dictionary loading after ZSTD_p_forceRawDict is set
will be loaded in "raw" ("content only") mode,
even if CCtx is re-used multiple times with multiple different dictionary.
It's up to the user to reset this value differently if it needs so.
Reproduction steps:
```
make zstreamtest CC=clang CFLAGS="-O3 -g -fsanitize=memory -fsanitize-memory-track-origins"
./zstreamtest -vv -t4178 -i4178 -s4531
```
How to get to the error in gdb (may be a more efficient way):
* 2 breaks at zstd_compress.c:2418 -- in ZSTD_compressContinue_internal()
* 2 breaks at zstd_compress.c:2276 -- in ZSTD_compressBlock_internal()
* 1 break at zstd_compress.c:1547
Why the error occurred:
When `zc->forceWindow == 1`, after calling `ZSTD_loadDictionaryContent()` we
have `zc->loadedDictEnd == zc->nextToUpdate == 0`. But, we've really loaded up
to `iend` into the dictionary. Then in `ZSTD_compressBlock_internal()` we see
that `current > zc->nextToUpdate + 384`, so we load the last 192 bytes a second
time. In this case the bytes we are loading are a block of all 0s, starting in
the previous block. So when we are loading the last 192 bytes, we find a `match`
in the future, 183 bytes beyond `ip`. Since the block is all 0s, the match
extends to the end of the block. But in `ZSTD_count()` we only check that
`pIn < pInLoopLimit`, but since `pMatch > pIn`, `pMatch` eventually points past
the end of the buffer, causing the MSAN failure.
The fix:
The line changed sets sets `zc->nextToUpdate` to the end of the dictionary.
This is the behavior that existed before `ZSTD_p_forceWindow` was introduced.
This fixes the exposing test case. Since the code doesn't fail without
`zc->forceWindow`, it makes sense that this works. I've run the command
`./zstreamtest -T2mn` 64 times without failures. CI should also verify nothing
obvious broke.
- Add ZSTD_findDecompressedSize
- Traverses multiple frames to find total output size
- Add ZSTD_getFrameContentSize
- Gets the decompressed size of a single frame by reading header
- Deprecate ZSTD_getDecompressedSize