
Fuzzing

Each fuzzing target can be built with multiple engines. Zstd provides a fuzz corpus for each target that can be downloaded with the command:

make corpora

It will download each corpus into ./corpora/TARGET.

fuzz.py

fuzz.py is a helper script for building and running fuzzers. Run ./fuzz.py -h for the list of commands, and run ./fuzz.py COMMAND -h for command-specific help.

Generating Data

fuzz.py provides a utility to generate seed data for each fuzzer.

make -C ../tests decodecorpus
./fuzz.py gen TARGET

By default it writes 100 samples, each at most 8 KB, into corpora/TARGET-seed. The count, size limit, and random seed can be configured with the --number, --max-size-log and --seed flags.
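The 8 KB default corresponds to a size exponent of 13, since 2^13 = 8192 bytes. The sketch below assumes, as the flag name suggests, that --max-size-log is the base-2 logarithm of the maximum sample size in bytes. The helper max_sample_size is hypothetical and is not part of fuzz.py.

```python
def max_sample_size(max_size_log: int) -> int:
    """Return the byte limit implied by a base-2 size exponent.

    Assumption: --max-size-log N caps each sample at 2**N bytes.
    """
    if max_size_log < 0:
        raise ValueError("max_size_log must be non-negative")
    return 1 << max_size_log


if __name__ == "__main__":
    # Show the byte limit for a few candidate exponents.
    for log in (10, 13, 16):
        print(f"--max-size-log {log} -> at most {max_sample_size(log)} bytes")
```

Under that assumption, raising the exponent by one doubles the largest sample the generator may produce.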

Build

The build command respects the usual build environment variables CC, CFLAGS, etc. Each environment variable can be overridden with the corresponding flag --cc, --cflags, etc. The fuzzing engine is selected with LIB_FUZZING_ENGINE or --lib-fuzzing-engine; the default is libregression.a. Alternatively, you can use Clang's built-in fuzzing engine with --enable-fuzzer. The build command has flags that set up sanitizers (--enable-{a,ub,m}san) and coverage instrumentation (--enable-coverage). It sets sane defaults, which can be overridden with flags such as --debug and --enable-ubsan-pointer-overflow. Run ./fuzz.py build -h for help.
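The precedence described above (an explicit flag beats the environment variable, which beats the built-in default) can be sketched with argparse. This is an illustration of the pattern only, not fuzz.py's actual code: the function parse_build_args, the program name, and the fallback compiler "cc" are assumptions made for the example.

```python
import argparse
import os


def parse_build_args(argv, environ=None):
    """Parse build options, using environment variables as defaults.

    Hypothetical sketch: an explicit flag wins over the environment,
    and the environment wins over the hard-coded fallback.
    """
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="build-sketch")
    # The fallback "cc" is an assumption made for this example.
    parser.add_argument("--cc", default=env.get("CC", "cc"))
    parser.add_argument("--cflags", default=env.get("CFLAGS", ""))
    parser.add_argument(
        "--lib-fuzzing-engine",
        default=env.get("LIB_FUZZING_ENGINE", "libregression.a"),
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    # CC comes from the environment, but --cc on the command line wins.
    args = parse_build_args(["--cc", "gcc"], {"CC": "clang"})
    print(args.cc, args.lib_fuzzing_engine)
```

Reading the environment when the parser is built, rather than after parsing, keeps the override logic in one place: argparse only falls back to the default when the flag is absent.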

Running Fuzzers

./fuzz.py can run libfuzzer, afl, and regression tests. See the help of the relevant command for options. Flags not parsed by fuzz.py are passed to the fuzzing engine. The command used to run the fuzzer is printed for debugging.
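One common way to forward unrecognized flags to another program is argparse's parse_known_args, which returns the parsed options together with the leftover arguments. The sketch below shows that mechanism under assumed names (split_args and the --verbose option are placeholders); it does not claim to reproduce how fuzz.py itself is written.

```python
import argparse


def split_args(argv):
    """Separate options this script understands from engine options.

    Hypothetical sketch: parse_known_args() returns the parsed namespace
    plus the list of arguments it did not recognize, in their original order.
    """
    parser = argparse.ArgumentParser(prog="run-sketch")
    parser.add_argument("target")
    # Placeholder option, not taken from fuzz.py.
    parser.add_argument("--verbose", action="store_true")
    known, engine_args = parser.parse_known_args(argv)
    return known, engine_args


if __name__ == "__main__":
    known, engine_args = split_args(
        ["simple_decompress", "-jobs=4", "-max_len=4096"]
    )
    # engine_args would be appended to the fuzzer command line.
    print(known.target, engine_args)
```

Here -jobs=4 and -max_len=4096 are libFuzzer options that the wrapper does not define, so they end up in the leftover list and can be handed to the engine unchanged.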

LibFuzzer

# Build the fuzz targets
./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
# OR equivalently
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan
# Run the fuzzer
./fuzz.py libfuzzer TARGET <libfuzzer args like -jobs=4>

where TARGET could be simple_decompress, stream_round_trip, etc.

MSAN

Fuzzing with libFuzzer and MSAN is as easy as:

CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-msan
./fuzz.py libfuzzer TARGET <libfuzzer args>

fuzz.py respects the environment variables MSAN_EXTRA_CPPFLAGS, MSAN_EXTRA_CFLAGS, MSAN_EXTRA_CXXFLAGS, and MSAN_EXTRA_LDFLAGS, along with their corresponding flags, so that extra parameters can be passed to MSAN builds only.

AFL

The default LIB_FUZZING_ENGINE is libregression.a, which produces a binary that AFL can use.

# Build the fuzz targets
CC=afl-clang CXX=afl-clang++ ./fuzz.py build all --enable-asan --enable-ubsan
# Run the fuzzer without a memory limit because of ASAN
./fuzz.py afl TARGET -m none

Regression Testing

The regression test supports the "all" target, which runs all the fuzzers in one command.

CC=clang CXX=clang++ ./fuzz.py build all --enable-asan --enable-ubsan
./fuzz.py regression all
CC=clang CXX=clang++ ./fuzz.py build all --enable-msan
./fuzz.py regression all