# Fuzzing

Each fuzzing target can be built with multiple engines.
Zstd provides a fuzz corpus for each target that can be downloaded with
the command:

```
make corpora
```

It will download each corpus into `./corpora/TARGET`.

## fuzz.py

`fuzz.py` is a helper script for building and running fuzzers.
Run `./fuzz.py -h` for the commands and run `./fuzz.py COMMAND -h` for
command-specific help.

### Generating Data

`fuzz.py` provides a utility to generate seed data for each fuzzer.

```
make -C ../tests decodecorpus
./fuzz.py gen TARGET
```

By default it outputs 100 samples, each at most 8 KB, into `corpora/TARGET-seed`,
but that can be configured with the `--number`, `--max-size-log` and `--seed`
flags.

### Build
The `build` command respects the usual build environment variables `CC`, `CFLAGS`, etc.
The environment variables can be overridden with the corresponding flags
`--cc`, `--cflags`, etc.
The specific fuzzing engine is selected with `LIB_FUZZING_ENGINE` or
`--lib-fuzzing-engine`; the default is `libregression.a`.
Alternatively, you can use Clang's built-in fuzzing engine with
`--enable-fuzzer`.
It has flags to set up sanitizers (`--enable-{a,ub,m}san`) and
coverage instrumentation (`--enable-coverage`).
It sets sane defaults, which can be overridden with flags such as `--debug`,
`--enable-ubsan-pointer-overflow`, etc.
Run `./fuzz.py build -h` for help.

### Running Fuzzers

`./fuzz.py` can run `libfuzzer`, `afl`, and `regression` tests.
See the help of the relevant command for options.
Flags not parsed by `fuzz.py` are passed to the fuzzing engine.
The command used to run the fuzzer is printed for debugging.

Here's a helpful command to fuzz each target across all cores,
stopping only if a bug is found:
```
for target in $(./fuzz.py list); do
    ./fuzz.py libfuzzer $target -jobs=10 -workers=10 -max_total_time=1000 || break;
done
```
Alternatively, you can fuzz all targets in parallel, using one core per target:
```
python3 ./fuzz.py list | xargs -P$(python3 ./fuzz.py list | wc -l) -I__ sh -c "python3 ./fuzz.py libfuzzer __ 2>&1 | tee __.log"
```
Either way, to double-check that no crashes were found, run `ls corpora/*crash`.
If any crashes were found, you can use the hashes to reproduce them.

## LibFuzzer

```
# Build the fuzz targets
./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
# OR equivalently
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan
# Run the fuzzer
./fuzz.py libfuzzer TARGET <libfuzzer args like -jobs=4>
```

where `TARGET` could be `simple_decompress`, `stream_round_trip`, etc.

### MSAN

Fuzzing with `libFuzzer` and `MSAN` is as easy as:

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-msan
./fuzz.py libfuzzer TARGET <libfuzzer args>
```

`fuzz.py` respects the environment variables / flags `MSAN_EXTRA_CPPFLAGS`,
`MSAN_EXTRA_CFLAGS`, `MSAN_EXTRA_CXXFLAGS`, `MSAN_EXTRA_LDFLAGS`, which pass
extra parameters only for MSAN builds.

## AFL

The default `LIB_FUZZING_ENGINE` is `libregression.a`, which produces a binary
that AFL can use.

```
# Build the fuzz targets
CC=afl-clang CXX=afl-clang++ ./fuzz.py build all --enable-asan --enable-ubsan
# Run the fuzzer without a memory limit because of ASAN
./fuzz.py afl TARGET -m none
```

## Regression Testing

The regression test supports the `all` target to run all the fuzzers in one
command.

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-asan --enable-ubsan
./fuzz.py regression all
CC=clang CXX=clang++ ./fuzz.py build all --enable-msan
./fuzz.py regression all
```

## Fuzzing a custom sequence producer plugin
Sequence producer plugin authors can use the zstd fuzzers to stress-test their code.
See the documentation in `fuzz_third_party_seq_prod.h` for details.

## Adding a new fuzzer
There are several steps involved in adding a new fuzzer harness.

### Build your harness
1. Create your new fuzzer harness, `tests/fuzz/your_harness.c`.

2. Add your harness to the Makefile:

    2.1 Follow [this example](https://github.com/facebook/zstd/blob/e124e39301381de8f323436a3e4c46539747ba24/tests/fuzz/Makefile#L216) if your fuzzer requires both compression and decompression symbols (prefix `rt_`). If your fuzzer only requires decompression symbols, follow [this example](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/Makefile#L194) (prefix `d_`).

    2.2 Add your target to [`FUZZ_TARGETS`](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/Makefile#L108).

3. Add your harness to [`fuzz.py`](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/fuzz.py#L48).

### Generate seed data
Follow the instructions above to generate seed data:
```
make -C ../tests decodecorpus
./fuzz.py gen your_harness
```

### Run the harness
Follow the instructions above to run your harness and fix any crashes:
```
./fuzz.py build your_harness --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
./fuzz.py libfuzzer your_harness
```

### Minimize and zip the corpus
After running the fuzzer for a while, you will have a large corpus at `tests/fuzz/corpora/your_harness*`.
This corpus must be minimized and zipped before uploading to GitHub for regression testing:
```
./fuzz.py minimize your_harness
./fuzz.py zip your_harness
```

### Upload the zip file to GitHub
The previous step should produce a `.zip` file containing the corpus for your new harness.
This corpus must be uploaded to GitHub here: https://github.com/facebook/zstd/releases/tag/fuzz-corpora
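
Before uploading, it can help to confirm that the archive exists and is not empty. The helper below is a minimal sketch, not part of `fuzz.py`: the function name `check_corpus_zip` is hypothetical, and the archive path used in the usage note is an assumption about where `./fuzz.py zip` writes its output, so check the path printed by the command.

```shell
# Hypothetical helper (not part of fuzz.py): succeed only if the given
# archive exists and has a non-zero size.
check_corpus_zip() {
    zip_path=$1
    if [ ! -s "$zip_path" ]; then
        echo "missing or empty: $zip_path" >&2
        return 1
    fi
    echo "ready: $zip_path"
}
```

If you have write access to the repository and the GitHub CLI installed, one way to upload is `check_corpus_zip corpora/your_harness_seed_corpus.zip && gh release upload fuzz-corpora corpora/your_harness_seed_corpus.zip --repo facebook/zstd`, where the file name is assumed and `fuzz-corpora` is the release tag from the URL above. Otherwise, attach the file through the release page.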