Commit Graph

1027 Commits

Author SHA1 Message Date
Yann Collet
ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

The vast majority of changes consist of turning U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among the few issues this patch solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller uses a pointer to U32 instead.

These fixes are useful.

However, the bulk of changes is about %u formatting,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributors will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
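The `%u` mismatch described in the message above can be illustrated with a minimal sketch. It assumes `U32` is a typedef of `uint32_t` (a stand-in for the definition in `mem.h`), and `format_count()` is a hypothetical helper, not a function from the patch.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t U32;   /* assumption: stand-in for the typedef in mem.h */

/* Hypothetical helper: formats a U32 counter into buf.
 * %u expects an argument of type `unsigned`. On targets where uint32_t
 * is `unsigned long`, passing a U32 directly is a type mismatch,
 * so the value is cast explicitly before formatting. */
static int format_count(char* buf, size_t bufSize, U32 count)
{
    return snprintf(buf, bufSize, "%u", (unsigned)count);
}
```

On most platforms the cast is a no-op, which is why the mismatch tends to go unnoticed until a target such as the 32-bit MIPS case is compiled.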
Nick Terrell
a24f73bece [regression] Update results.csv 2018-12-20 17:40:48 -08:00
Yann Collet
41b45b84a1
Merge pull request #1465 from facebook/noFilePresent
fixed : detection of non-existing file
2018-12-20 17:21:04 -08:00
Yann Collet
6e9512a70c
Merge pull request #1463 from yijinfb/getenv
Add support for environment variable ZSTD_CLEVEL in CLI
2018-12-20 15:17:00 -08:00
Yann Collet
e129174d1d fixed shadowing of variable time
some standard libs define `time` as a global variable
shadowing local declarations ...
2018-12-20 14:54:05 -08:00
Yann Collet
72dbf1bcd0 removed strncpy() from util.c
as Visual surprisingly complains about its usage.
Replaced by memcpy()
2018-12-20 12:27:12 -08:00
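A bounded copy built on `memcpy()` instead of `strncpy()` might look like the sketch below. `copy_bounded()` is a hypothetical helper written for illustration; the actual change in `util.c` may be shaped differently.

```c
#include <string.h>

/* Hypothetical helper: copies at most dstSize-1 bytes of src into dst
 * and always null-terminates the result.
 * Returns the number of bytes copied (excluding the terminator). */
static size_t copy_bounded(char* dst, size_t dstSize, const char* src)
{
    size_t len;
    if (dstSize == 0) return 0;          /* no room, not even for '\0' */
    len = strlen(src);
    if (len > dstSize - 1) len = dstSize - 1;
    memcpy(dst, src, len);               /* length is known, no strncpy() needed */
    dst[len] = '\0';
    return len;
}
```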
Yann Collet
105fa953cb use strerror() to generate error message
as suggested by @terrelln .

also:
- hopefully fixed Windows version
- changed the test, so that it passes on non-English OS stdlib errors.
2018-12-20 09:16:40 -08:00
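As a rough sketch of the approach named in this commit, the snippet below builds an error message from `strerror(errno)` when a file cannot be opened. `open_or_explain()` and the message layout are assumptions made for illustration, not the CLI's actual code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: tries to open `path` for reading.
 * On failure, writes a message ending with the system's own
 * description of the error into msg, and returns NULL. */
static FILE* open_or_explain(const char* path, char* msg, size_t msgSize)
{
    FILE* const f = fopen(path, "rb");
    if (f == NULL) {
        snprintf(msg, msgSize, "can't open %s : %s", path, strerror(errno));
    }
    return f;
}
```

Since the text returned by `strerror()` depends on the system locale, a test for this kind of message is safer when it checks the fixed part (such as the file name) rather than the exact error wording.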
Yann Collet
173ef9dea2 fixed : detection of non-existing file
better error message
with test
2018-12-19 18:30:57 -08:00
Yann Collet
a835e9cb81
Merge pull request #1461 from terrelln/regression
[regression] Add more configs
2018-12-19 17:53:15 -08:00
Yann Collet
bb7e6018af
Merge pull request #1462 from facebook/btultra2.3
fixed ossfuzz 11849
2018-12-19 17:48:11 -08:00
Yann Collet
2898afab52 fixed OSSfuzz 11849
The problem was already masked,
due to no longer accepting tiny blocks for statistics.

But in case it could still happen with not-so-tiny blocks,
there is a stricter control which ensures that
nothing was already loaded prior to statistics collection.
2018-12-19 16:54:15 -08:00
Yi Jin
26a9ae3f5f refactor readU32FromChar(...), improve init_cLevel(...), and add env var ZSTD_CLEVEL tests 2018-12-19 16:45:42 -08:00
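Reading a compression level from the `ZSTD_CLEVEL` environment variable could be sketched as follows. This version uses `strtol()` rather than the CLI's own `readU32FromChar()`, and both `parse_clevel()` and `clevel_from_env()` are hypothetical names; the fallback-to-default policy is also an assumption.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a level from text.
 * Returns defaultLevel when the text is missing, empty,
 * not a plain decimal integer, or out of range for int. */
static int parse_clevel(const char* text, int defaultLevel)
{
    char* end = NULL;
    long value;
    if (text == NULL || *text == '\0') return defaultLevel;
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0') return defaultLevel;   /* overflow or trailing junk */
    if (value < INT_MIN || value > INT_MAX) return defaultLevel;
    return (int)value;
}

/* Hypothetical wrapper: looks up ZSTD_CLEVEL in the environment. */
static int clevel_from_env(int defaultLevel)
{
    return parse_clevel(getenv("ZSTD_CLEVEL"), defaultLevel);
}
```

Splitting the parser from the `getenv()` lookup keeps the parsing logic testable without touching the process environment.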
Nick Terrell
6e6315ae46 [regression] Add more configs
* Add configs that test multithreading, LDM, and setting explicit
  parameters.
* Update the `compress cctx` method to accept `ZSTD_parameters`.
* Compile against the multithreaded `libzstd.a`.
* Update `results.csv` for the new configs.

Unless you think there are more configs/methods I should test, I think
we have a fairly wide set of configs/methods, so I'll pause adding
more for now.
2018-12-19 16:36:26 -08:00
Yann Collet
2f67ac3dce
Merge pull request #1460 from facebook/btultra2.2
fixed: compression ratio discrepancy
2018-12-19 15:00:15 -08:00
Yann Collet
78c4ea4930 added test case 2018-12-19 14:10:27 -08:00
Nick Terrell
3a4634f2af
Merge pull request #1459 from terrelln/destroy
[zstdcli] Refuse to overwrite input file
2018-12-18 17:03:54 -08:00
Nick Terrell
cd2c8defad [zstdcli] Refuse to overwrite input file
Compare the input and output files by their inode number and
refuse to open the output file if the input file is the same.

This doesn't work when (de)compressing multiple files to a single
file, but that is a very uncommon use case, mostly used for
benchmarking by me.

Fixes #1422.
2018-12-18 15:29:54 -08:00
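The inode comparison described above can be sketched with POSIX `stat()`. `is_same_file()` is a hypothetical helper; comparing the device number alongside the inode is an addition made here, since inode numbers are only unique within one file system.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 when both paths refer to the same file,
 * 0 otherwise. A path that cannot be stat'ed (for example an output file
 * that does not exist yet) is treated as "not the same file". */
static int is_same_file(const char* path1, const char* path2)
{
    struct stat s1, s2;
    if (stat(path1, &s1) != 0) return 0;
    if (stat(path2, &s2) != 0) return 0;
    return (s1.st_dev == s2.st_dev) && (s1.st_ino == s2.st_ino);
}
```

Because the check goes through the file system rather than comparing path strings, it also catches the same file reached through different spellings, such as `file` and `./file`.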
Nick Terrell
d7def456d8 [libzstd] Fix estimate with negative levels
* Fix `ZSTD_estimateCCtxSize()` with negative levels.
* Fix `ZSTD_estimateCStreamSize()` with negative levels.
* Add a unit test to test for this error.
2018-12-18 14:24:49 -08:00
Yann Collet
517d8c984c
Merge pull request #1449 from facebook/ovlog_def
overlapLog default values
2018-12-18 09:45:53 -08:00
Nick Terrell
bdfcaecc0a [zstdcli] Add --no-progress flag
The `--no-progress` flag disables zstd's progress bars, but leaves
the summary.

I've added simple tests to `playTests.sh` to make sure the parsing
works.
2018-12-14 11:50:25 -08:00
Yann Collet
96adc846c5 fixed tests
with correct pointer type
2018-12-13 16:50:19 -08:00
Nick Terrell
75fa3f2eb7
Merge pull request #1446 from terrelln/overflow
[libzstd] Fix infinite loop in decompression
2018-12-13 16:21:15 -08:00
Nick Terrell
aaea4ef924 [libzstd] Fix infinite loop in decompression
When we switched `ZSTD_SKIPPABLEHEADERSIZE` to a macro, the places where we do:

    MEM_readLE32(ptr) + ZSTD_SKIPPABLEHEADERSIZE

can now overflow `(unsigned)-8` to `0` and we infinite loop. We now check
the frame size and reject sizes that overflow a U32.

Note that this bug never made it into a release, and was only in the dev branch
for a few days.

Credit to OSS-Fuzz
2018-12-13 15:13:19 -08:00
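The wraparound described in this message can be reproduced in isolation. In the sketch below, `SKIPPABLE_HEADER_SIZE`, `read_le32()` and `skippable_frame_size()` are hypothetical stand-ins for `ZSTD_SKIPPABLEHEADERSIZE`, `MEM_readLE32()` and the library's frame-size logic; only the overflow check itself is the point.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t U32;

#define SKIPPABLE_HEADER_SIZE 8u   /* hypothetical stand-in for ZSTD_SKIPPABLEHEADERSIZE */

/* Reads a 32-bit little-endian value byte by byte (endian-independent). */
static U32 read_le32(const unsigned char* p)
{
    return (U32)p[0] | ((U32)p[1] << 8) | ((U32)p[2] << 16) | ((U32)p[3] << 24);
}

/* Computes content size + header size from a 4-byte size field.
 * Returns 0 and stores the total in *frameSize on success,
 * or -1 when the sum would not fit in a U32.
 * Without the check, a field of 0xFFFFFFF8 would wrap to 0,
 * and a loop advancing by the frame size would never progress. */
static int skippable_frame_size(const unsigned char* sizeField, U32* frameSize)
{
    U32 const contentSize = read_le32(sizeField);
    if (contentSize > UINT32_MAX - SKIPPABLE_HEADER_SIZE) return -1;
    *frameSize = contentSize + SKIPPABLE_HEADER_SIZE;
    return 0;
}
```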
Yann Collet
1993f5d412 fixed ovlog tests
and updated man page
2018-12-12 21:09:14 -08:00
Yann Collet
f2f86d369b Merge branch 'btultra2' into ovlog_def 2018-12-12 20:58:14 -08:00
Yann Collet
9a92ed401d updated compression results.csv
and fixed nit
2018-12-12 20:30:09 -08:00
Yann Collet
9792acda3b Merge branch 'dev' into btultra2 2018-12-12 20:18:27 -08:00
Yann Collet
eee789b7ea continued: changed to overlapLog
in deeper code layer.
for consistency.
2018-12-11 17:41:42 -08:00
Yann Collet
9b784dec7f changed parameter name to ZSTD_c_overlapLog
from overlapSizeLog.

Reasoning :
`overlapLog` is already used everywhere, in the code, command line and documentation.
`ZSTD_c_overlapSizeLog` feels unnecessarily different.
2018-12-11 16:55:33 -08:00
Nick Terrell
8c99e311cf Reset the cctx for documentation/safety 2018-12-11 15:57:56 -08:00
Nick Terrell
fcfea057a1 [regression] add more methods 2018-12-11 13:10:22 -08:00
Yann Collet
9c3265a53f
Merge pull request #1417 from facebook/advancedAPI
Advanced API
2018-12-10 18:48:15 -08:00
Yann Collet
5e6aaa3abb fixed btultra2 usage with prefix
notably while using multi-threading
2018-12-10 18:45:03 -08:00
Yann Collet
b71bfb6cf2 paramgrill: add status line
get information on which config is currently tested
so that the console gets animated during long tests.
2018-12-07 16:02:24 -08:00
Yann Collet
27b253fadc added tests for strategy=9 (btultra2) 2018-12-07 14:20:54 -08:00
Yann Collet
e68c2d86e7 refactor paramgrill for clarity
restored ability to copy/paste the resulting compression level table into zstd_compress.c .
2018-12-07 14:07:54 -08:00
Yann Collet
39e28982cf introduced constants ZSTD_STRATEGY_MIN and ZSTD_STRATEGY_MAX 2018-12-06 16:16:16 -08:00
Yann Collet
be9e561da4 changed ZSTD_c_compressionStrategy into ZSTD_c_strategy
also : fixed paramgrill, and limit conditions
2018-12-06 15:00:52 -08:00
Yann Collet
3583d19c4e changed parameter names from ZSTD_p_* to ZSTD_c_*
for naming consistency
2018-12-05 17:26:02 -08:00
Yann Collet
3e042d5cc0 ZSTD_decompressDCtx() is compatible with sticky parameters 2018-12-04 17:30:58 -08:00
Yann Collet
d7da3fc90a merge dedicated dParam setters 2018-12-04 17:06:48 -08:00
Yann Collet
2fb8d1a392 fixed declaration-after-statement warnings 2018-12-04 15:54:01 -08:00
Yann Collet
aec945f0dc implemented ZSTD_dParam_getBounds()
and ZSTD_DCtx_setParameter()
2018-12-04 15:35:37 -08:00
Yann Collet
34e146f548 advanced decompression function replaced by normal streaming one
advanced parameters compatible with ZSTD_decompressStream().
2018-12-04 10:28:36 -08:00
Yann Collet
6ced8f7c7c joined normal streaming API with advanced one 2018-12-03 14:22:38 -08:00
Nick Terrell
e859862341 [regression] Add dictionary support
Dictionaries are prebuilt and saved as part of the data object.
The config decides whether or not to use the dictionary if it is
available. Configs that require dictionaries are only run with
data that have dictionaries. The method will skip configs that are
irrelevant, so for example ZSTD_compress() will skip configs with
dictionaries.

I've also trimmed the silesia source to 1MB per file (12 MB total),
and added 500 samples from the github data set with a dictionary.

I've intentionally added an extra line to the `results.csv` to make
the nightly build fail, so that we can see how CircleCI reports it.

Full list of changes:

* Add pre-built dictionaries to the data.
* Add `use_dictionary` and `no_pledged_src_size` flags to the config.
* Add a config using a dictionary for every level.
* Add a config that specifies no pledged source size.
* Support dictionaries and streaming in the `zstdcli` method.
* Add a context-reuse method using `ZSTD_compressCCtx()`.
* Clean up the formatting of the `results.csv` file to align columns.
* Add `--data`, `--config`, and `--method` flags to constrain each
  to a particular value. This is useful for debugging a failure
  or debugging a particular config/method/data.
2018-11-30 18:23:01 -08:00
Yann Collet
d8e215cbee created ZSTD_compress2() and ZSTD_compressStream2()
ZSTD_compress_generic() is renamed ZSTD_compressStream2().

Note that, for the time being,
the "stable" API and advanced one use different parameter planes :
setting parameters using the advanced API does not influence ZSTD_compressStream()
and using ZSTD_initCStream() does not influence parameters for ZSTD_compressStream2().
2018-11-30 11:25:56 -08:00
Yann Collet
090bc808a8
Merge pull request #1432 from terrelln/regression
[regression] Add initial regression test framework
2018-11-29 16:06:40 -08:00
Nick Terrell
4aaa36f74b [regression] Add initial regression test framework
The regression tests run nightly or on the `regression`
branch for convenience. The results get uploaded as the
artifacts of the job. If they change, check the diff
printed in the job. If all is well, download the new
results and commit them to the repo.

This code will only run on a UNIX like platform. It
could be made to run on Windows, but I don't think that
it is necessary. It also uses C99.

* data: This module defines the data to run tests on.
  It downloads data from a URL into a cache directory,
  checks it against a checksum, and unpacks it. It also
  provides helpers for accessing the data.
* config: This module defines the configs to run tests
  with. A config is a set of API parameters and a set of
  CLI flags.
* result: This module is a helper for method that defines
  the result type.
* method: This module defines the compression methods
  to test. It is what runs the regression test using the
  data and the config. It reports the total compressed
  size, or an error/skip.
* test: This is the test binary that runs the tests for
  every (data, config, method) tuple, and prints the
  results to the output file and stderr.
* results.csv: The results that the current commit is
  expected to produce.
2018-11-29 14:33:04 -08:00
Lzu Tao
d095adf9fb Add simple test for zstdgrep 2018-11-29 03:39:47 +07:00