ZSTD_decompress() can decompress multiple frames sent as a single input.
But the input size must be the exact sum of all compressed frames, no more.
In the case of a mistake on srcSize, being larger than required,
ZSTD_decompress() will try to decompress a new frame after current one, and fail.
As a consequence, it will issue an error code, ERROR(prefix_unknown).
While the error is technically correct
(the decoder could not recognise the header of _next_ frame),
it's confusing, as users will believe that the first header of the first frame is wrong,
which is not the case (it's correct).
It makes it more difficult to understand that the error is in the source size, which is too large.
This patch changes the error code provided in such a scenario.
If (at least) a first frame was successfully decoded,
and then following bytes are garbage values,
the decoder assumes the provided input size is wrong (too large),
and issue the error code ERROR(srcSize_wrong).
On my laptop:
Before:
./zstd32 -b --zstd=wlog=27 silesia.tar enwik8 -S
3#silesia.tar : 211984896 -> 66683478 (3.179), 97.6 MB/s , 400.7 MB/s
3#enwik8 : 100000000 -> 35643153 (2.806), 76.5 MB/s , 303.2 MB/s
After:
./zstd32 -b --zstd=wlog=27 silesia.tar enwik8 -S
3#silesia.tar : 211984896 -> 66683478 (3.179), 97.4 MB/s , 435.0 MB/s
3#enwik8 : 100000000 -> 35643153 (2.806), 76.2 MB/s , 338.1 MB/s
Mileage vary, depending on file, and cpu type.
But a generic rule is : x86 benefits less from "long-offset mode" than x64,
maybe due to register pressure.
On "entropy", long-mode is _never_ a win for x86.
On my laptop though, it may, depending on file and compression level
(enwik8 benefits more from "long-mode" than silesia).
ZSTD_create?Dict() is required to produce a ?Dict* return type
because `free()` does not accept a `const type*` argument.
If it wasn't for this restriction, I would have preferred to create a `const ?Dict*` object
to emphasize the fact that, once created, a dictionary never changes
(hence can be shared concurrently until the end of its lifetime).
There is no such limitation with initStatic?Dict() :
as stated in the doc, there is no corresponding free() function,
since `workspace` is provided, hence allocated, externally,
it can only be free() externally.
Which means, ZSTD_initStatic?Dict() can return a `const ZSTD_?Dict*` pointer.
Tested with `make all`, to catch initStatic's users,
which, incidentally, also updated zstd.h documentation.
it still fallbacks to single-thread blocking invocation
when input is small (<1job)
or when invoking ZSTDMT_compress(), which is blocking.
Also : fixed a bug in new block-granular compression routine.
It used to stop on reaching extDict, for simplification.
As a consequence, there was a small loss of performance each time the round buffer would restart from beginning.
It's not a large difference though, just several hundreds of bytes on silesia.
This patch fixes it.
It does not feel "right" from a dependency perspective.
ZSTD_initDCtx_internal() is triggered once, on DCtx creation,
while ZSTD_decompressBegin() is invoked at the beginning of each new frame,
and is also a user-facing prototype.
Downside : a DCtx must be init before first usage !
This was always the intention by the way, and is documented as such.
This stage is automatically done within ZSTD_decompress() and variants,
and also within ZSTD_decompressStream().
Only ZSTD_decompressContinue() is impacted,
it must be preceded by a ZSTD_decompressBegin(), as detailed in doc.
A test has been fixed, to no longer rely on undocumented assumption that ZSTD_decompressBegin() is invoked during init.