keep one in compress_frameChunk(),
so that it's tested at every loop,
in case a user simply passes a large multi-GB input in a single invocation.
Add one in ZSTD_compressBlock(),
since compressBlock() explicitly skips frameChunk().
The experimental function ZSTD_compressBlock() is designed with very small data in mind,
for situations where saving the ~12 bytes of frame header can actually make a difference.
Some systems, though, may have to deal with small and large data entangled.
If the input is larger than a block (> 128 KB), compressBlock() cannot compress it in one round.
That's why it's possible to compress in multiple rounds,
producing a chain of compressed blocks.
Some users push this capability to the limit, encoding gigantic chains of blocks.
On crossing the 4 GB limit, an internal overflow occurs.
This fix moves the overflow correction mechanism higher in the call chain,
so that it's also applied to gigantic chains of blocks.
Added a test case in fuzzer.c, which crashes before the fix and passes now.
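For context, a minimal sketch of the multi-round pattern described above, using the experimental block-level API. Error handling and the raw-block bookkeeping (needed when ZSTD_compressBlock() returns 0 for an incompressible block) are simplified assumptions, not the recommended production pattern:

```c
#define ZSTD_STATIC_LINKING_ONLY   /* block-level API is experimental */
#include "zstd.h"

/* Compress a large input as a chain of <= 128 KB blocks.
 * Real callers must record each block's compressed size out of band,
 * and store a block raw when ZSTD_compressBlock() returns 0. */
static size_t compressChain(ZSTD_CCtx* cctx,
                            void* dst, size_t dstCapacity,
                            const void* src, size_t srcSize)
{
    const char* ip = (const char*)src;
    char* op = (char*)dst;
    size_t total = 0;
    if (ZSTD_isError(ZSTD_compressBegin(cctx, 3))) return 0;  /* level 3, illustrative */
    while (srcSize > 0) {
        size_t const blockSize = (srcSize < ZSTD_BLOCKSIZE_MAX) ? srcSize : ZSTD_BLOCKSIZE_MAX;
        size_t const cSize = ZSTD_compressBlock(cctx, op, dstCapacity - total, ip, blockSize);
        if (ZSTD_isError(cSize) || cSize == 0) return 0;  /* 0 => incompressible : store raw (omitted) */
        ip += blockSize; srcSize -= blockSize;
        op += cSize; total += cSize;
    }
    return total;
}
```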
which can be probed using the new function ZSTD_minCLevel().
Also : redefined ZSTD_TARGETLENGTH_MIN/MAX for consistency.
Used the opportunity to bump the version number to v1.3.6.
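A quick usage sketch of the new probe:

```c
#include <stdio.h>
#include "zstd.h"

int main(void)
{
    /* ZSTD_minCLevel() returns the lowest (most negative, fastest) level;
     * ZSTD_maxCLevel() returns the highest (strongest) one. */
    printf("supported compression levels : %d .. %d\n",
           ZSTD_minCLevel(), ZSTD_maxCLevel());
    return 0;
}
```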
We could undersize the literals buffer by up to 11 bytes,
due to a combination of 2 bugs:
* The literals buffer didn't have the `WILDCOPY_OVERLENGTH` extra
space it is supposed to have.
* We didn't check the literals buffer size in `ZSTD_sufficientBuff()`.
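A hedged illustration of the first fix; the constant's value here is an assumption, only the extra slack matters:

```c
#include <stddef.h>

#ifndef WILDCOPY_OVERLENGTH
#  define WILDCOPY_OVERLENGTH 8   /* assumed value; zstd defines its own */
#endif

/* The literals buffer needs WILDCOPY_OVERLENGTH bytes of slack, because
 * wildcopy-style loops may write past the requested length in word-sized steps. */
static size_t literalsBufferSize(size_t blockSize)
{
    return blockSize + WILDCOPY_OVERLENGTH;
}
```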
Tells, in a non-blocking way, if there is something ready to flush right now.
Only works with multi-threading for the time being.
Useful to know if flush speed will be limited by lack of production.
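Judging from this description, the function in question is presumably ZSTD_toFlushNow(); a minimal polling sketch under that assumption:

```c
#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"

/* Assumption : the function described above is ZSTD_toFlushNow(), which
 * reports how many bytes are ready to flush immediately, without blocking. */
static int hasDataReadyToFlush(ZSTD_CCtx* cctx)
{
    return ZSTD_toFlushNow(cctx) > 0;   /* > 0 : a flush call will make progress */
}
```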
CDicts were previously guaranteed to be generated with `lowLimit=dictLimit=0`.
This is no longer true, and so the old length and index calculations are no
longer valid. This diff fixes them to handle non-zero start indices in CDicts.
When the primary normalization method fails, and
`(1 << tableLog) == (maxSymbolValue + 1)`, and every symbol gets assigned
normalized weight 1 or -1 in the first loop, then the next division can
raise `SIGFPE`, since its divisor (the mass still to distribute) is then zero.
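A hedged reduction of the failure mode; identifiers are illustrative, not the actual FSE source:

```c
/* After the first pass, every symbol got weight 1 or -1, so the mass left
 * to distribute can be exactly 0; an unguarded integer division by it
 * raises SIGFPE. The fix handles this case before dividing. */
static unsigned distributeShare(unsigned remainingTotal, unsigned stillToDistribute)
{
    if (stillToDistribute == 0) return 0;   /* nothing left : skip the division */
    return remainingTotal / stillToDistribute;
}
```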
The correct parameters are used once, but once `ZSTD_resetCStream()` is
called the default parameters (level 3) are used. Fix this by setting
`requestedParams` in the `ZSTD_initCStream*()` functions.
The added tests both fail before this patch and pass after.
[zstdmt] Fix jobsize bugs
* `ZSTDMT_serialState_reset()` should use `targetSectionSize`, not `jobSize` when sizing the seqstore.
Add an assert that checks that we sized the seqstore using the right job size.
* `ZSTDMT_compressionJob()` should check if `rawSeqStore.seq == NULL`.
* `ZSTDMT_initCStream_internal()` should not adjust `mtctx->params.jobSize` (clamping to MIN/MAX is okay).
It's not necessary to ensure that no job is ongoing :
the pool is only expanded, and existing threads are preserved.
In case of error, the only option is to return NULL and terminate the thread pool anyway.
In the new advanced API, adjust the parameters even if they are explicitly
set. This mainly applies to the `windowLog`, and accordingly the `hashLog`
and `chainLog`, when the source size is known.
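A usage sketch relying on the existing experimental helper ZSTD_adjustCParams(), which performs this kind of adjustment when the source size is known; the chosen values are illustrative:

```c
#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"

/* An explicitly requested windowLog gets lowered when the source is small;
 * hashLog and chainLog follow accordingly. */
static ZSTD_compressionParameters pickParams(unsigned long long srcSize)
{
    ZSTD_compressionParameters cParams = ZSTD_getCParams(19, srcSize, 0);
    cParams.windowLog = 27;                          /* explicit user request */
    return ZSTD_adjustCParams(cParams, srcSize, 0);  /* may shrink windowLog to fit srcSize */
}
```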
There were 2 competing sets of debug functions
within zstd_internal.h and bitstream.h.
They were mostly duplicates, and required care to avoid messing with each other.
There is now a single implementation, shared by both.
Significant change :
the macro variable ZSTD_DEBUG no longer exists;
it has been replaced by DEBUGLEVEL,
which required modifying several source files.
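A simplified stand-in for the shared pattern (not the actual zstd source):

```c
#include <stdio.h>

#ifndef DEBUGLEVEL
#  define DEBUGLEVEL 0   /* 0 in release builds : all debug logging compiles out */
#endif

/* A message is emitted only when its level fits the compile-time budget. */
#if (DEBUGLEVEL >= 1)
#  define DEBUGLOG(l, ...)                                        \
    do { if ((l) <= DEBUGLEVEL) {                                 \
        fprintf(stderr, __VA_ARGS__); fprintf(stderr, "\n"); } } while (0)
#else
#  define DEBUGLOG(l, ...)  do { } while (0)   /* no-op */
#endif
```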
In this version, literal compression is always disabled for the ZSTD_fast strategy.
Performance parity between ZSTD_compress_advanced() and ZSTD_compress_generic() :
the result of ZSTD_compress_advanced()
was different from that of ZSTD_compress_generic()
when using negative compression levels,
because the disabling of huffman compression was not passed in the parameters.
The (pretty old) code inside ZSTD_compress()
was making some pretty bold assumptions
about what's inside a CCtx and how to init it.
This is fragile by design :
CCtx content evolves,
and knowledge of how to handle it should be concentrated in one place.
A side effect of this strategy
is that ZSTD_compress() wouldn't check for BMI2 capability,
and was therefore missing out on a potential speed opportunity.
This patch makes ZSTD_compress() use
the same initialization and release functions
as the normal creator / destructor ones.
Measured on my laptop, with a custom version of bench
manually modified to use ZSTD_compress() (instead of the advanced API) :
This patch :
1#silesia.tar : 211984896 -> 73651053 (2.878), 312.2 MB/s , 723.8 MB/s
2#silesia.tar : 211984896 -> 70163650 (3.021), 226.2 MB/s , 649.8 MB/s
3#silesia.tar : 211984896 -> 66996749 (3.164), 169.4 MB/s , 636.7 MB/s
4#silesia.tar : 211984896 -> 65998319 (3.212), 136.7 MB/s , 619.2 MB/s
dev branch :
1#silesia.tar : 211984896 -> 73651053 (2.878), 291.7 MB/s , 727.5 MB/s
2#silesia.tar : 211984896 -> 70163650 (3.021), 216.2 MB/s , 655.7 MB/s
3#silesia.tar : 211984896 -> 66996749 (3.164), 162.2 MB/s , 633.1 MB/s
4#silesia.tar : 211984896 -> 65998319 (3.212), 130.6 MB/s , 618.6 MB/s
When parameters are "equivalent",
the context is re-used in continue mode,
hence the needed workspace size is not recalculated.
This incidentally also evades the size-down check and action.
This patch intercepts "continue mode"
so that the size-down check and action are actually triggered.
recently introduced into the new dictionary mode.
The bug could be reproduced with this command :
./zstreamtest -v --opaqueapi --no-big-tests -s4092 -t639
The error was in the function ZSTD_count_2segments() :
the beginning of the 2nd segment corresponds to prefixStart,
and not to the beginning of the current block (istart == src).
This would result in comparing the wrong byte.
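A hedged sketch of the corrected logic (simplified, not the actual zstd source):

```c
#include <stddef.h>

/* Count how many bytes of `ip` match a reference that starts in a first
 * segment (e.g. the dictionary, ending at mEnd) and, when fully matched,
 * continues in a second segment. The fix : the continuation is compared
 * starting from prefixStart, not from the beginning of the current block. */
static size_t count2segments(const unsigned char* ip, const unsigned char* match,
                             const unsigned char* iEnd, const unsigned char* mEnd,
                             const unsigned char* prefixStart)
{
    size_t matched = 0;
    while (match < mEnd && ip < iEnd && *ip == *match) { ip++; match++; matched++; }
    if (match < mEnd) return matched;     /* mismatch within the 1st segment */
    match = prefixStart;                  /* 2nd segment begins at prefixStart */
    while (ip < iEnd && *ip == *match) { ip++; match++; matched++; }
    return matched;
}
```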
removed "cached" structure.
prices are now saved in the optimal table.
Primarily done for simplification.
Might improve speed by a little.
But actually, and surprisingly, also improves ratio in some circumstances.
Recent experience showed that
the default distribution table for offsets
can get it wrong pretty quickly as the number of symbols grows,
while it remains a reasonable choice much longer for length symbols.
Changed the formula,
so that the dynamic threshold is now 32 symbols for offsets.
It remains at 64 symbols for lengths.
Detection is based on defaultNormLog.
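The derivation, as described; the constants come from zstd's predefined tables, while the helper name is illustrative:

```c
/* zstd's predefined normalization logs : OF_DEFAULTNORMLOG = 5 for offsets,
 * LL_DEFAULTNORMLOG = ML_DEFAULTNORMLOG = 6 for lengths. The dynamic
 * threshold follows : 1<<5 = 32 symbols for offsets, 1<<6 = 64 for lengths. */
static unsigned dynamicThreshold(unsigned defaultNormLog)
{
    return 1u << defaultNormLog;
}
```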
zstd rejects blocks which do not compress by at least a certain amount.
In that case, the block is simply emitted uncompressed (even if a little bit of compression could be achieved).
This is better for decompression speed, hence for energy.
The logic is controlled by ZSTD_minGain().
The rule is applied uniformly, at all compression levels.
This change makes btultra accept blocks with poor compression ratios.
We presume that users of btultra mode prefer compression ratio over some decompression speed gains.
The threshold for minimum gain is lowered for btultra
from s>>6 (~1.5% minimum gain)
to s>>7 (~0.8% minimum gain).
This is a prudent change;
not sure if it's large enough.
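A sketch of the rule; the actual logic lives in ZSTD_minGain(), and the small `+ 2` floor here is an assumption:

```c
#include <stddef.h>

/* A block is kept compressed only if it saves at least this many bytes.
 * btultra tolerates a smaller gain (>>7, ~0.8%) than other strategies (>>6, ~1.5%). */
static size_t minGain(size_t srcSize, int isBtUltra)
{
    unsigned const minlog = isBtUltra ? 7 : 6;
    return (srcSize >> minlog) + 2;   /* +2 : assumed small constant floor */
}
```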
Ensure that, when frequency[symbol]==0,
the result is (tableLog + 1) bits,
with both upper-bit and fractional-bit estimates.
Also : enable BIT_DEBUG in /tests.
Work around bug in zstd decoder
Pull request #1144 exercised a new path in the zstd decoder that proved to
be buggy. Avoid the extremely rare bug by emitting an uncompressed block.
Estimate the cost for using FSE modes `set_basic`, `set_compressed`, and
`set_repeat`, and select the one with the lowest cost.
* The cost of `set_basic` is computed using the cross-entropy cost
function `ZSTD_crossEntropyCost()`, using the normalized default count
and the count.
* The cost of `set_repeat` is computed using `FSE_bitCost()`. We check the
previous table to see if it is able to represent the distribution.
* The cost of `set_compressed` is computed with the entropy cost function
`ZSTD_entropyCost()`, together with the cost of writing the normalized
count `ZSTD_NCountCost()`.
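Putting the three estimates together, a simplified selection sketch; the names and cost plumbing are assumptions, and the real decision also has to handle `set_rle` and repeat-mode validity:

```c
#include <stddef.h>

typedef enum { set_basic, set_compressed, set_repeat } encodingMode_e;

/* Pick whichever mode promises the fewest total bits. Each cost is assumed
 * to be a (fractional) bit count that already includes headers : e.g. the
 * set_compressed cost must add the NCount header cost (ZSTD_NCountCost()). */
static encodingMode_e selectMode(size_t basicCost, size_t repeatCost,
                                 size_t compressedCost)
{
    if (basicCost <= repeatCost && basicCost <= compressedCost) return set_basic;
    if (repeatCost <= compressedCost) return set_repeat;
    return set_compressed;
}
```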