Building the zstd CLI costs time.
Some part of it is incompressible, leading to substantial iteration delay when testing code modifications.
That's mainly because all library source files must be recompiled every time.
The main reason we don't build the CLI from library object files
is that we can't just build the objects directly in the lib/ directory
(where they would go by default),
since the CLI and the library use different compilation flags.
Specifically, the CLI enables multithreading, while the library doesn't (by default).
This is solved in this commit, by generating the object files locally.
Now, the CLI and the library can employ different sets of flags, without tripping over each other.
All library object files are generated directly into programs/ dir.
This works because no 2 source files have the same name.
Now, modifying a file no longer requires recompiling the entire lib, just the modified files.
The recipe is also compatible with `-j` parallel build, leading to large build time reductions on multi-core systems.
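
The flag conflict described above can be illustrated with a minimal sketch. `ZSTD_MULTITHREAD` is the build macro that enables multithreading; the function `sketch_max_workers` is hypothetical and exists only to show that one source file produces different object code depending on the flags, which is why the CLI and the library cannot share object files.

```c
#include <assert.h>

/* Hypothetical function, not part of zstd: the same source compiles to
 * different code depending on whether ZSTD_MULTITHREAD is defined. */
int sketch_max_workers(int requested)
{
    assert(requested >= 0);
#ifdef ZSTD_MULTITHREAD
    return requested;   /* build with -DZSTD_MULTITHREAD: workers are honored */
#else
    (void)requested;
    return 0;           /* build without the flag: always single-threaded */
#endif
}
```

An object file built one way silently gives the wrong behavior if linked into the other target, so each target needs its own copy.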
The previous recipe would build object files directly within programs/,
which could compete with other local builds happening in programs/ at the same time.
Fixed by generating the relevant object file locally.
There are compilation environments in aarch64 where NEON isn't
available. While these environments could define ZSTD_NO_INTRINSICS,
it's more fail-safe to use the more specific symbol to know if NEON
extensions are available.
__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.
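
A minimal sketch of guarding NEON code on `__ARM_NEON` rather than on the architecture alone. The function name `sketch_copy16` is hypothetical; `vld1q_u8` and `vst1q_u8` are standard intrinsics from `<arm_neon.h>`, and the fallback is a plain `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#  include <arm_neon.h>   /* only included when NEON is really available */
#endif

/* Hypothetical helper: copy exactly 16 bytes from src to dst. */
void sketch_copy16(void* dst, const void* src)
{
    assert(dst != NULL && src != NULL);
#if defined(__ARM_NEON)
    /* NEON path: one 128-bit load followed by one 128-bit store. */
    vst1q_u8((uint8_t*)dst, vld1q_u8((const uint8_t*)src));
#else
    /* Portable path, also taken on aarch64 builds without NEON. */
    memcpy(dst, src, 16);
#endif
}
```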
Signed-off-by: Warner Losh <imp@bsdimp.com>
The problem occurs in this scenario:
1. We find a synchronization point.
2. We attempt to create the job.
3. We fail because the job table is full: `mtctx->nextJobID > mtctx->doneJobID + mtctx->jobIDMask`.
4. We call `ZSTDMT_compressStream_generic` again.
5. We forget that we're at a sync point already, and we continue looking
for the next sync point.
The fix is to detect if we're currently paused at a sync point and, if
we are, to not load any more input.
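
The scenario and the fix can be sketched as a small state machine. This is an illustration under assumptions, not the zstdmt code: only the field names `nextJobID`, `doneJobID` and `jobIDMask` come from the condition quoted above, while `SketchCtx`, `pausedAtSyncPoint` and `sketch_step` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned nextJobID;
    unsigned doneJobID;
    unsigned jobIDMask;
    int pausedAtSyncPoint;   /* hypothetical flag: sync point found, job not yet created */
} SketchCtx;

/* Same condition as in step 3 of the scenario. */
int sketch_jobTableFull(const SketchCtx* c)
{
    return c->nextJobID > c->doneJobID + c->jobIDMask;
}

/* One call of the (simplified) streaming loop.
 * Returns how many input bytes were loaded during this call. */
size_t sketch_step(SketchCtx* c, size_t available, int syncPointInInput)
{
    size_t loaded = 0;
    assert(c != NULL);
    if (!c->pausedAtSyncPoint) {
        loaded = available;                      /* normal case: load input */
        if (!syncPointInInput) return loaded;    /* no sync point yet, keep going */
    }
    /* Here we are at a sync point, either new or remembered. */
    if (sketch_jobTableFull(c)) {
        c->pausedAtSyncPoint = 1;                /* remember it for the next call */
        return loaded;
    }
    c->nextJobID++;                              /* job created */
    c->pausedAtSyncPoint = 0;
    return loaded;
}
```

While the flag is set, repeated calls load nothing and only retry job creation, so the sync point already found is never skipped.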
Caught by zstreamtest. I modified it to make the bug occur more often
(~1/100K -> ~1/200) and verified that it is fixed after. I then ran a
few hundred thousand unmodified zstreamtest iterations to verify.
When zstdmt cannot get a buffer and `ZSTD_e_end` is passed, an empty
compression job can be created. Additionally, `mtctx->frameEnded` can be
set to 1, which could potentially cause problems like unterminated blocks.
The fix is to adjust to `ZSTD_e_flush` even when we can't get a buffer.
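
A minimal sketch of the adjustment, assuming the decision depends only on whether unconsumed input remains. The enum and function below are hypothetical stand-ins for the real end directive and are not the zstdmt implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the end directive (continue / flush / end). */
typedef enum { sketch_e_continue, sketch_e_flush, sketch_e_end } SketchEndDirective;

/* If the caller asked to end the frame but input is still pending
 * (for example because no buffer was available to load it), downgrade
 * the request to a flush so the frame is not ended prematurely. */
SketchEndDirective sketch_adjustEndOp(SketchEndDirective endOp,
                                      size_t inputPos, size_t inputSize)
{
    assert(inputPos <= inputSize);
    if (endOp == sketch_e_end && inputPos < inputSize)
        return sketch_e_flush;
    return endOp;
}
```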
* Run compression twice and check the compressed data is byte-identical.
  The compression loop had to be rewritten to ensure determinism. It is
guaranteed by always making maximal forward progress.
* When nbWorkers > 0, change the number of workers 1/8 of the time.
* Run in single-pass mode 1/4 of the time.
I've run a few hundred thousand iterations of zstreamtest and have seen
no determinism issues so far. Before the zstdmt fix that skips the
single-pass shortcut, non-determinism showed up within a few hundred
iterations.
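
The "compress twice and compare" check can be sketched as follows. To stay self-contained, a toy run-length encoder (`sketch_rle`, hypothetical) stands in for the real compressor; only the shape of the determinism check is the point here, not the zstreamtest code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy run-length encoder standing in for a real compressor (hypothetical).
 * Emits (count, byte) pairs. Returns the compressed size,
 * or 0 if dst is too small. */
size_t sketch_rle(unsigned char* dst, size_t dstCap,
                  const unsigned char* src, size_t srcSize)
{
    size_t ip = 0, op = 0;
    while (ip < srcSize) {
        unsigned char const b = src[ip];
        size_t run = 1;
        while (ip + run < srcSize && src[ip + run] == b && run < 255) run++;
        if (op + 2 > dstCap) return 0;
        dst[op++] = (unsigned char)run;
        dst[op++] = b;
        ip += run;
    }
    return op;
}

/* Returns 1 if two independent compressions of src are byte-identical. */
int sketch_isDeterministic(const unsigned char* src, size_t srcSize)
{
    unsigned char out1[512], out2[512];
    size_t n1, n2;
    assert(srcSize <= 256);   /* worst case is 2 output bytes per input byte */
    n1 = sketch_rle(out1, sizeof out1, src, srcSize);
    n2 = sketch_rle(out2, sizeof out2, src, srcSize);
    return n1 == n2 && memcmp(out1, out2, n1) == 0;
}
```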
This commit leaves only the functions used by zstd_compress.c. All other
functions have been removed from the API. The ZSTDMT unit tests in
fuzzer.c and zstreamtest.c have been rewritten to use the ZSTD API. And
the --mt zstreamtest tests have been ripped out.
Simplifies the code and removes blocking from zstdmt.
At this point we could completely delete
`ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because
I think we want to do that in the zstd-1.5.0 release, in case anyone is
still using the ZSTDMT API, even though it is not installed by default.
Fixes #2327.
Pass in the `ZSTD_cParamMode_e` to select how we define our cparams.
Based on the mode we either take the `dictSize` into account or we set
it to `0`. See the documentation for `ZSTD_cParamMode_e`.
Some of the modes currently share the same behavior. But they have
distinct modes because they are drastically different cases. E.g.
compression + reprocessing the dictionary and creating a cdict.
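
A minimal sketch of mode-dependent handling of `dictSize`. The enum values and the function are hypothetical, and which mode zeroes the dictionary size is an assumption made for illustration; the documentation for `ZSTD_cParamMode_e` is the authority on the actual behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical modes, loosely modeled on the cases described above. */
typedef enum {
    sketch_mode_reprocessDict,   /* compression that reprocesses the dictionary */
    sketch_mode_attachDict,      /* dictionary tables are attached, not rebuilt */
    sketch_mode_createCDict,     /* building a cdict */
    sketch_mode_unknown
} SketchCParamMode;

/* Returns the dictionary size that parameter selection should consider. */
size_t sketch_effectiveDictSize(SketchCParamMode mode, size_t dictSize)
{
    assert(mode <= sketch_mode_unknown);
    switch (mode) {
    case sketch_mode_attachDict:
        return 0;            /* assumption: an attached dict has its own parameters */
    case sketch_mode_reprocessDict:
    case sketch_mode_createCDict:
    case sketch_mode_unknown:
    default:
        return dictSize;     /* same behavior today, kept as distinct cases */
    }
}
```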
Additionally, when downsizing the hashLog and chainLog take the
(adjusted) dictionary size into account, since the size of the
dictionary gets added onto the window size.
Adds a simple test to ensure that we aren't downsizing too far.
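
The downsizing rule can be sketched like this, assuming the limit is derived from the log2 of window size plus dictionary size. All names are hypothetical and the `+ 1` headroom is an assumption; this is not the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit; v must be nonzero. */
unsigned sketch_highbit64(uint64_t v)
{
    unsigned r = 0;
    assert(v != 0);
    while (v >>= 1) r++;
    return r;
}

/* Smallest log such that (1 << log) >= windowSize + dictSize. */
unsigned sketch_dictAndWindowLog(unsigned windowLog, uint64_t dictSize)
{
    assert(windowLog < 32);
    if (dictSize == 0) return windowLog;
    {   uint64_t const total = ((uint64_t)1 << windowLog) + dictSize;
        return sketch_highbit64(total - 1) + 1;
    }
}

/* Clamp hashLog so the table is not larger than the data it can index. */
unsigned sketch_downsizeHashLog(unsigned hashLog, unsigned windowLog, uint64_t dictSize)
{
    unsigned const limit = sketch_dictAndWindowLog(windowLog, dictSize) + 1;
    return hashLog > limit ? limit : hashLog;
}
```

With a 1 KiB window and a 1 KiB dictionary, the limit grows by one compared to the no-dictionary case, so the hash table is not downsized too far.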
The DDS structure can't be copied into the working tables like the DMS.
So it doesn't need to account for the source size when sizing its
parameters, just the dictionary size.
Conditions to trigger:
* CDict is loaded as raw content.
* CDict starts with the zstd dictionary magic number.
* The CDict is reprocessed (not attached or copied).
* The new API is used (streaming or `ZSTD_compress2()`).
Bug: The dictionary is loaded as a zstd dictionary, not a raw content
dictionary, because the dict content type is set to `ZSTD_dct_auto`.
Fix: Pass in the dictionary content type from cdict creation to the call
to `ZSTD_compress_insertDictionary()`.
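
The bug and the fix can be sketched with a simplified loader. The types and functions below are hypothetical; only the magic number value (`0xEC30A437`, zstd's dictionary magic) is taken from zstd. In auto mode the loader sniffs the magic number, so a raw content dictionary that happens to start with it is misclassified unless the explicit content type is passed through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAGIC_DICTIONARY 0xEC30A437u   /* zstd dictionary magic number */

/* Hypothetical stand-ins for the content type and the load outcome. */
typedef enum { sketch_dct_auto, sketch_dct_rawContent } SketchDictContentType;
typedef enum { sketch_loadedAsRaw, sketch_loadedAsZstdDict } SketchLoadResult;

/* Read a 32-bit little-endian value. */
uint32_t sketch_readLE32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Simplified loader: honors an explicit raw content request,
 * otherwise decides based on the magic number. */
SketchLoadResult sketch_insertDictionary(const unsigned char* dict, size_t dictSize,
                                         SketchDictContentType type)
{
    assert(dict != NULL || dictSize == 0);
    if (type == sketch_dct_rawContent) return sketch_loadedAsRaw;
    if (dictSize >= 8 && sketch_readLE32(dict) == SKETCH_MAGIC_DICTIONARY)
        return sketch_loadedAsZstdDict;       /* auto mode: magic detected */
    return sketch_loadedAsRaw;
}
```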
Test: Added a test case that exposes the bug, and fixed the raw
content tests so they no longer modify the `dictBuffer`. Modifying it turned
all later tests that use the `dictBuffer` into raw content tests, which
doesn't seem intentional.