Tests all `.h`, `.c`, `.py`, and `Makefile` files for valid copyright
and license lines. Excludes a small number of exceptions (threading and
divsufsort).
* Copyright does not contain `present`
* Copyright contains `Facebook, Inc`
* Copyright contains the current year
* License contains exactly the lines we expect
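The four checks above could be sketched roughly as follows. This is only an illustration, not the actual test script: the helper name `check_header`, its arguments, and the way the expected license lines are passed in are all assumptions.

```python
def check_header(copyright_line, license_lines, year, expected_license):
    """Return a list of problems found in one file header.

    Hypothetical helper: the real test may parse files differently.
    """
    errors = []
    # The copyright line must not use the open-ended "present" form.
    if "present" in copyright_line:
        errors.append("copyright contains 'present'")
    # The copyright holder must be named.
    if "Facebook, Inc" not in copyright_line:
        errors.append("copyright is missing 'Facebook, Inc'")
    # The current year must appear in the copyright line.
    if str(year) not in copyright_line:
        errors.append("copyright is missing the current year")
    # The license must match the expected lines exactly.
    if list(license_lines) != list(expected_license):
        errors.append("license lines differ from the expected text")
    return errors
```

An empty list means the header passed all four checks.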
* All copyright lines now have -2020 instead of -present
* All copyright lines include "Facebook, Inc"
* All licenses are now standardized
The copyright in `threading.{h,c}` is not changed because it comes from
zstdmt.
The copyright and license of `divsufsort.{h,c}` are not changed.
* Removing symbols that are not being tested
* Removing symbols used in zstdcli, fileio, dibio and benchzstd
* Removing symbols used in zbuff and add test-zbuff to travis
* Removing remaining symbols and adding unit tests instead
* Removing symbols test entirely
* Make arguments optional and add --dict argument
* Removing accidental print statement
* Change to more likely scenario for dictionary compression benchmark
Super blocks must never violate the zstd block bound of input_size + ZSTD_blockHeaderSize. The individual sub-blocks may, but not the super block. If the superblock violates the block bound we are liable to violate ZSTD_compressBound(), which we must not do. Whenever the super block violates the block bound we instead emit an uncompressed block.
This means we increase the latency because of the single uncompressed block. I fix this by enabling streaming an uncompressed block, so the latency of an uncompressed block is 1 byte. This doesn't reduce the latency of the buffer-less API, but I don't think we really care.
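The fallback rule can be illustrated with a small size-only model. This is a sketch, not the compressor's code: `choose_block` is a hypothetical name, and the 3-byte value for `ZSTD_blockHeaderSize` is stated here as an assumption about the block format.

```python
# Assumed size of a zstd block header in bytes.
ZSTD_BLOCK_HEADER_SIZE = 3


def choose_block(src_size, superblock_size):
    """Pick the block to emit, given the input and super block sizes.

    Returns (kind, emitted_size). Hypothetical helper for illustration.
    """
    bound = src_size + ZSTD_BLOCK_HEADER_SIZE
    if superblock_size > bound:
        # The super block violates the block bound: emit an uncompressed
        # block instead, which costs exactly header + input bytes.
        return "raw", bound
    return "superblock", superblock_size
```

With this rule the emitted size never exceeds `src_size + ZSTD_BLOCK_HEADER_SIZE`, which is the property the added assert checks.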
* I added a test case that verifies that the decompression has 1 byte latency.
* I rely on existing zstreamtest / fuzzer / libfuzzer regression tests for correctness. During development I had several correctness bugs, and they easily caught them.
* The added assert that the superblock doesn't violate the block bound will help us discover any missed conditions (though I think I got them all).
Credit to OSS-Fuzz.
* Adding new cli endpoint --diff-from=
* Appveyor conversion nit
* Using bool set trick instead of direct set
* Removing --diff-from and only leaving --diff-from=#
* Throwing error when both dictFileName vars are set
* Clean up syntax
* Renaming diff-from to patch-from
* Reverting comma separated syntax clean up
* Updating playtests with patch-from
* Uncommenting accidentally commented
* Updating remaining docs and var names to be patch-from instead of diff-from
* Constifying
* Using existing log2 function and removing newly created one
* Argument order (moving prefs to end)
* Using comma separated syntax
* Moving to outside #ifndef
* Allow zero sized buffers in `stream_decompress`. Ensure that we never have two
zero sized buffers in a row so we guarantee forwards progress.
* Make case 4 in `stream_round_trip` do a zero sized buffer call followed by
a full call to guarantee forwards progress.
* Fix `limitCopy()` in legacy decoders.
* Fix memcpy in `zstdmt_compress.c`.
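The "never two zero sized buffers in a row" rule can be sketched as below. This is an illustration only: the fuzzers are written in C and choose sizes from fuzzer input, so `next_buffer_size`, `split_into_buffers`, and the use of Python's `random` module are all assumptions.

```python
import random


def next_buffer_size(rng, max_size, prev_size):
    """Pick a size in [0, max_size], never zero twice in a row.

    Assumes max_size >= 1.
    """
    size = rng.randint(0, max_size)
    if size == 0 and prev_size == 0:
        size = 1  # force at least one byte so forward progress is guaranteed
    return size


def split_into_buffers(data, rng, max_size=4):
    """Split data into buffers whose sizes follow the rule above."""
    chunks = []
    pos = 0
    prev = None  # no previous buffer yet
    while pos < len(data):
        size = next_buffer_size(rng, max_size, prev)
        chunks.append(data[pos:pos + size])
        pos += size
        prev = size
    return chunks
```

Because at least one of every two consecutive buffers is non-empty, the loop always terminates.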
Catches the bug fixed in PR #1939
The `numFiles` variable wasn't updated, so the fuzzer didn't do anything.
I did two things to fix this:
1. Remove the `numFiles` variable entirely.
2. Error if we can't open a file and print the number of files tested.
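The intended behavior of the driver loop, sketched in Python for brevity (the real driver is C). The names `run_regression` and `fuzz_one` are hypothetical, and the exact messages are assumptions.

```python
def run_regression(paths, fuzz_one):
    """Run fuzz_one on the contents of every file, failing loudly.

    Hypothetical sketch: fuzz_one stands in for the fuzz target.
    """
    tested = 0
    for path in paths:
        try:
            with open(path, "rb") as f:
                data = f.read()
        except OSError as err:
            # An unreadable file is an error, not something to skip silently.
            raise SystemExit(f"failed to open {path}: {err}")
        fuzz_one(data)
        tested += 1
    # Printing the count makes a run that tested nothing easy to spot.
    print(f"tested {tested} files")
    return tested
```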
* Initial revised automated benchmarking script
* Updating nb_iterations and making loop infinite
* Allowing benchmarking params to be changed from cli
* Renaming old speed test
* Removing numpy dependency for cli
* Change filename and benchmark on PR level
* Moving build outside loop and adding iterations param
* Moving benchmarking to separate travis ci test
* Fixing typo and using unused variable
* Added mode labels and updated README accordingly
* Adding new mode 'current' that compares facebook:dev against current hash
* Typo
* Reverting previous accidental diff
* Typo
* Adding frequency config variable to prevent github from blacklisting
* Added new argument for frequency of fetching new prs
* Updating documentation
* Adding fail logging for superblock flow
* Dividing by targetCBlockSize instead of blockSize
* Adding new const and using more accurate formula for nbBlocks
* Only do dstCapacity check if using superblock
* Removing disabling logic
* Updating test to make it catch more extreme case of previous bug
* Also updating comment
* Only taking compressEnd shortcut on non-superblock
Fixes a fuzz issue where dictionary_round_trip failed because the compressor was generating corrupt files thanks to zero weights in the table.
* Only setting loaded dict huf table to valid on non-zero
* Adding hasNoZeroWeights test to fse tables
* Forbidding nbBits != 0 when weight == 0
* Reverting the last commit
* Setting table log to 0 when weight == 0
* Small (invalid) zero weight dict test
* Small (valid) zero weight dict test
* Initializing repeatMode vars to check before zero check
* Removing FSE changes to separate pr
* Reverting accidentally changed file
* Negating bool, using unsigned, optimization nit
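The core of the fix, as a rough sketch: the dictionary's Huffman table is only marked valid when none of its weights is zero, and otherwise stays in the "check" state. The function name and the string stand-ins for the repeat-mode values are assumptions for illustration.

```python
def dict_huf_repeat_mode(weights):
    """Return the repeat mode for a Huffman table loaded from a dictionary.

    Hypothetical helper: "check" and "valid" stand in for the C enum values.
    """
    mode = "check"  # initialize to check before looking for zero weights
    if all(w != 0 for w in weights):
        mode = "valid"  # safe to reuse the table without re-validating it
    return mode
```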
/dev/null permissions were modified when using sudo rights.
This fixes this bug during decompression.
More importantly, this patch adds a test, triggered in TravisCI,
ensuring unaltered /dev/null permissions.
date(1) is used to display the last modification time of a file, which
is not supported on OpenBSD, FreeBSD and Darwin. Instead use stat(1).
Tested on OpenBSD.
* Silently skip dictionaries less than 8 bytes, unless using `ZSTD_dct_fullDict`.
This changes the compressor, which silently skips dictionaries <= 8 bytes.
* Allow repcodes that are equal to the dictionary content size, since it is in bounds.
* Adds the fuzzer
* Adds an additional `InputType` for the fuzzer
I ran the fuzzer for about 10 minutes and it found 2 bugs:
* Catches the original bug without any help
* Catches an additional bug with 8-byte dictionaries
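The two dictionary rules can be sketched as simple predicates. This is an illustration, not the library code: both function names are hypothetical, the behavior for short dictionaries under `ZSTD_dct_fullDict` is deliberately left out, and rejecting a repcode of 0 is an assumption.

```python
MIN_DICT_SIZE = 8  # dictionaries shorter than this are skipped


def should_skip_dictionary(dict_size, full_dict):
    """True if the dictionary is silently skipped.

    full_dict models ZSTD_dct_fullDict; in that mode a short dictionary
    is not skipped silently (what happens instead is not modeled here).
    """
    return dict_size < MIN_DICT_SIZE and not full_dict


def repcode_in_bounds(rep, dict_content_size):
    """True if a dictionary repcode points inside the dictionary content.

    A repcode equal to the content size is allowed, since it is in bounds.
    """
    return 0 < rep <= dict_content_size
```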
* A copy-paste error made it so we weren't running the advanced/cdict
streaming tests with the old API.
* Clean up the old streaming tests to skip incompatible configs.
* Update `results.csv`.
The tests now catch the bug in #1787.
Compression ratio of fast strategies (levels 1 & 2)
was seriously reduced, due to accidental disabling of Literals compression.
Credit to @QrczakMK, who perfectly described the issue, and implementation details,
making the fix straightforward.
Example : initCStream with level 1 on synthetic sample P50 :
Before : 5,273,976 bytes
After : 3,154,678 bytes
ZSTD_compress (for comparison) : 3,154,550
Fix #1787.
To follow : refactor the test which was supposed to catch this issue (and failed)
* Fix `ZSTD_FRAMEHEADERSIZE_PREFIX` and `ZSTD_FRAMEHEADERSIZE_MIN` to
take a `format` parameter, so it is impossible to get the wrong size.
* Fix the places that called `ZSTD_FRAMEHEADERSIZE_PREFIX` without
taking the format into account, which is now impossible by design.
* Call `ZSTD_frameHeaderSize_internal()` with `dctx->format`.
* The added tests catch both bugs in `ZSTD_decompressFrame()`.
Fixes #1813.
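To show why the format parameter matters, here is a small Python model of the two macros. The concrete values (5 and 6 bytes with a magic number, 1 and 2 without) reflect my understanding of the header and should be treated as assumptions, as should the helper names and the string stand-ins for the format enum.

```python
ZSTD_F_ZSTD1 = "zstd1"                      # standard format, 4-byte magic number
ZSTD_F_ZSTD1_MAGICLESS = "zstd1_magicless"  # same format without the magic number


def frame_header_size_prefix(fmt):
    """Minimum input needed to query the frame header size (assumed values)."""
    return 5 if fmt == ZSTD_F_ZSTD1 else 1


def frame_header_size_min(fmt):
    """Minimum size of a frame header (assumed values)."""
    return 6 if fmt == ZSTD_F_ZSTD1 else 2


def can_query_header_size(available, fmt):
    """True if enough bytes are available to compute the header size."""
    return available >= frame_header_size_prefix(fmt)
```

Since the size cannot be obtained without naming a format, a caller can no longer apply the standard-format size to a magicless frame by accident.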