fileio.c was continually pushing more content without giving a chance to flush compressed one.
It would block the job queue when input data was accumulated too fast (requiring to define many threads).
Fixed : fileio flushes whatever it can after each input attempt.
Sections 2+ read a bit of data from previous section
in order to improve compression ratio.
This also costs some CPU, to reference read data.
Read data is currently fixed to window>>3 size
By default, section sizes are 4x window size.
This new setting allow manual selection of section sizes.
The larger they are, the (slightly) better the compression ratio,
but also the higher the memory allocation cost,
and eventually the lesser the nb of possible threads,
since each section is compressed by a single thread.
It also introduces a prototype to set generic parameters,
ZSTDMT_setMTCtxParameter()
The idea is that it's possible to add enums
to extend the list of parameters that can be set this way.
This is more long-term oriented than a fixed-size struct.
Consider it as a test.
In some (rare) cases, job list could be blocked by a first job still being processed,
while all following ones are completed, waiting to be flushed.
In such case, the current job-table implementation is unable to accept new job.
As a consequence, a call to ZSTDMT_compressStream() can be useless (nothing read, nothing flushed),
with the risk to trigger a busy-wait on the caller side
(needlessly loop over ZSTDMT_compressStream() ).
In such a case, ZSTDMT_compressStream() will block until the first job is completed and ready to flush.
It ensures some forward progress by guaranteeing it will flush at least a part of the completed job.
Energy-wasting busy-wait is avoided.
Like ZSTD_initCStream_usingDict(),
ZSTDMT_initCStream_usingDict() now keep a copy of dict internally.
This way, dict can be released :
it does not longer have to outlive all future compression sessions.
Correctly compress with custom params and dictionary
Added relevant fuzzer test in zstreamtest
Also :
new macro ZSTDMT_SECTION_LOGSIZE_MIN, which sets a minimum size for a full job
(note : a flush() command can still generate a partial job anytime)
Also : fixed corner case, where nb of jobs completed becomes > jobQueueSize
which is possible when many flushes are issued
while there is not enough dst buffer to flush completed ones.
MT compression generates a single frame.
Multi-threading operates by breaking the frames into independent sections.
But from a decoder perspective, there is no difference :
it's just a suite of blocks.
Problem is, decoder preserves repCodes from previous block to start decoding next block.
This is also valid between sections, since they are no different than changing block.
Previous version would incorrectly initialize repcodes to their default value at the beginning of each section.
When using them, there was a mismatch between encoder (default values) and decoder (values from previous block).
This change ensures that repcodes won't be used at the beginning of a new section.
It works by setting them to 0.
This only works with regular (single segment) variants : extDict variants will fail !
Fortunately, sections beyond the 1st one belong to this category.
To be checked : btopt strategy.
This change was only validated from fast to btlazy2 strategies.