# Fuzzing

Each fuzzing target can be built with multiple engines.
Zstd provides a fuzz corpus for each target that can be downloaded with
the command:

```
make corpora
```

It will download each corpus into `./corpora/TARGET`.

## fuzz.py

`fuzz.py` is a helper script for building and running fuzzers.
Run `./fuzz.py -h` for the commands and run `./fuzz.py COMMAND -h` for
command-specific help.

### Generating Data

`fuzz.py` provides a utility to generate seed data for each fuzzer.

```
make -C ../tests decodecorpus
./fuzz.py gen TARGET
```

By default it outputs 100 samples, each at most 8KB, into `corpora/TARGET-seed`,
but that can be configured with the `--number`, `--max-size-log` and `--seed`
flags.
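As a minimal sketch of overriding those defaults, the helper below wraps `./fuzz.py gen` with explicit `--number`, `--max-size-log` and `--seed` values. The function names and the chosen values are illustrative. The size helper assumes `--max-size-log` is a base-2 logarithm of the maximum sample size in bytes, which is consistent with the 8KB default but is not confirmed here, so check `./fuzz.py gen -h` before relying on it.

```shell
# Hypothetical helper: bytes implied by a --max-size-log value.
# ASSUMPTION: the flag is a base-2 logarithm of the maximum sample size.
max_sample_bytes() {
    echo $((1 << $1))
}

# Hypothetical wrapper: generate seed data with explicit settings.
# Arguments: TARGET [NUMBER] [MAX_SIZE_LOG] [SEED]; the defaults are illustrative.
gen_seeds() {
    target=$1
    number=${2:-500}
    max_size_log=${3:-10}
    seed=${4:-1234}
    echo "generating $number samples of at most $(max_sample_bytes "$max_size_log") bytes"
    ./fuzz.py gen "$target" --number "$number" --max-size-log "$max_size_log" --seed "$seed"
}
```

For example, `gen_seeds simple_decompress` would request 500 samples of at most 1024 bytes each under that assumption.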

### Build
`fuzz.py build` respects the usual build environment variables `CC`, `CFLAGS`, etc.
The environment variables can be overridden with the corresponding flags
`--cc`, `--cflags`, etc.
The specific fuzzing engine is selected with `LIB_FUZZING_ENGINE` or
`--lib-fuzzing-engine`; the default is `libregression.a`.
Alternatively, you can use Clang's built-in fuzzing engine with
`--enable-fuzzer`.
It has flags to set up sanitizers (`--enable-{a,ub,m}san`) and
coverage instrumentation (`--enable-coverage`).
It sets sane defaults, which can be overridden with flags such as `--debug` and
`--enable-ubsan-pointer-overflow`.
Run `./fuzz.py build -h` for help.

### Running Fuzzers

`./fuzz.py` can run `libfuzzer`, `afl`, and `regression` tests.
See the help of the relevant command for options.
Flags not parsed by `fuzz.py` are passed to the fuzzing engine.
The command used to run the fuzzer is printed for debugging.

Here's a helpful command to fuzz each target with 10 parallel workers
(adjust `-jobs` and `-workers` to your core count), stopping only if a bug is found:
```
for target in $(./fuzz.py list); do
    ./fuzz.py libfuzzer "$target" -jobs=10 -workers=10 -max_total_time=1000 || break;
done
```
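The loop above hard-codes ten jobs and workers. A minimal sketch that sizes them to the machine instead, assuming `getconf _NPROCESSORS_ONLN` is available (it is on Linux and macOS); `core_count` and `fuzz_all_targets` are illustrative names, not part of `fuzz.py`:

```shell
# Hypothetical helper: number of online cores, falling back to 1.
core_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;   # anything that is not a plain number
    esac
    echo "$n"
}

# Hypothetical wrapper: fuzz each target in turn with one job per core,
# stopping at the first target whose fuzzer exits with an error.
fuzz_all_targets() {
    cores=$(core_count)
    for target in $(./fuzz.py list); do
        ./fuzz.py libfuzzer "$target" "-jobs=$cores" "-workers=$cores" -max_total_time=1000 || return 1
    done
}
```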
Alternatively, you can fuzz all targets in parallel, using one core per target:
```
python3 ./fuzz.py list | xargs -P$(python3 ./fuzz.py list | wc -l) -I__ sh -c "python3 ./fuzz.py libfuzzer __ 2>&1 | tee __.log"
```
Either way, to double-check that no crashes were found, run `ls corpora/*crash`.
If any crashes were found, you can use the hashes to reproduce them.
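As a sketch of automating that check, the function below counts the files inside every directory matching `corpora/*crash`. It assumes crash artifacts are stored as regular files directly inside those directories; `count_crashes` is an illustrative name, not part of `fuzz.py`.

```shell
# Hypothetical helper: count regular files in DIR/*crash directories.
# ASSUMPTION: each crash artifact is a regular file directly inside such a directory.
count_crashes() {
    corpora_dir=${1:-corpora}
    count=0
    for dir in "$corpora_dir"/*crash; do
        [ -d "$dir" ] || continue          # skip an unmatched glob or a non-directory
        for file in "$dir"/*; do
            if [ -f "$file" ]; then
                count=$((count + 1))
            fi
        done
    done
    echo "$count"
}
```

A script could then gate on the result, for example `[ "$(count_crashes)" -eq 0 ] || echo "crashes found"`.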

## LibFuzzer

```
# Build the fuzz targets
./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
# OR equivalently
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-asan --enable-ubsan
# Run the fuzzer
./fuzz.py libfuzzer TARGET <libfuzzer args like -jobs=4>
```

where `TARGET` could be `simple_decompress`, `stream_round_trip`, etc.

### MSAN

Fuzzing with `libFuzzer` and `MSAN` is as easy as:

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer --enable-msan
./fuzz.py libfuzzer TARGET <libfuzzer args>
```

`fuzz.py` respects the environment variables / flags `MSAN_EXTRA_CPPFLAGS`,
`MSAN_EXTRA_CFLAGS`, `MSAN_EXTRA_CXXFLAGS`, `MSAN_EXTRA_LDFLAGS`, which pass
extra parameters to MSAN builds only.

## AFL

The default `LIB_FUZZING_ENGINE` is `libregression.a`, which produces a binary
that AFL can use.

```
# Build the fuzz targets
CC=afl-clang CXX=afl-clang++ ./fuzz.py build all --enable-asan --enable-ubsan
# Run the fuzzer without a memory limit because of ASAN
./fuzz.py afl TARGET -m none
```

## Regression Testing

The regression test supports the `all` target to run all the fuzzers in one
command.

```
CC=clang CXX=clang++ ./fuzz.py build all --enable-asan --enable-ubsan
./fuzz.py regression all
CC=clang CXX=clang++ ./fuzz.py build all --enable-msan
./fuzz.py regression all
```
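The two build-and-test passes above can be chained so that a failure in either sanitizer configuration stops the run. This is a sketch only: `run_regression`, `run_all_regressions` and the `FUZZ_PY` override variable are illustrative names, not part of `fuzz.py`.

```shell
# Path to the helper script; FUZZ_PY is a hypothetical override for this sketch.
FUZZ_PY=${FUZZ_PY:-./fuzz.py}

# Build all targets with the given sanitizer flags, then run the regression
# tests; returns non-zero as soon as either step fails.
run_regression() {
    CC=clang CXX=clang++ "$FUZZ_PY" build all "$@" && "$FUZZ_PY" regression all
}

# ASAN+UBSAN first, then MSAN, mirroring the commands above.
run_all_regressions() {
    run_regression --enable-asan --enable-ubsan && run_regression --enable-msan
}
```

Calling `run_all_regressions` from `tests/fuzz` runs both passes and returns a non-zero status if any build or regression step fails.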

## Fuzzing a custom sequence producer plugin
Sequence producer plugin authors can use the zstd fuzzers to stress-test their code.
See the documentation in `fuzz_third_party_seq_prod.h` for details.

## Adding a new fuzzer
There are several steps involved in adding a new fuzzer harness.

### Build your harness
1. Create your new fuzzer harness, `tests/fuzz/your_harness.c`.

2. Add your harness to the Makefile

    2.1 Follow [this example](https://github.com/facebook/zstd/blob/e124e39301381de8f323436a3e4c46539747ba24/tests/fuzz/Makefile#L216) if your fuzzer requires both compression and decompression symbols (prefix `rt_`). If your fuzzer only requires decompression symbols, follow [this example](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/Makefile#L194) (prefix `d_`).

    2.2 Add your target to [`FUZZ_TARGETS`](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/Makefile#L108).

3. Add your harness to [`fuzz.py`](https://github.com/facebook/zstd/blob/6a0052a409e2604bd40354b76b86272b712edd7d/tests/fuzz/fuzz.py#L48).

### Generate seed data
Follow the instructions above to generate seed data:
```
make -C ../tests decodecorpus
./fuzz.py gen your_harness
```

### Run the harness
Follow the instructions above to run your harness and fix any crashes:
```
./fuzz.py build your_harness --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++
./fuzz.py libfuzzer your_harness
```

### Minimize and zip the corpus
After running the fuzzer for a while, you will have a large corpus at `tests/fuzz/corpora/your_harness*`.
This corpus must be minimized and zipped before uploading to GitHub for regression testing:
```
./fuzz.py minimize your_harness
./fuzz.py zip your_harness
```
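The seed, build, run, minimize and zip steps can be strung together for a new harness. The sketch below is one way to do that, not an official workflow: `prepare_corpus` and the `FUZZ_PY` override are illustrative names, the 600-second default is arbitrary, and it assumes `decodecorpus` has already been built.

```shell
# Path to the helper script; FUZZ_PY is a hypothetical override for this sketch.
FUZZ_PY=${FUZZ_PY:-./fuzz.py}

# Hypothetical wrapper: seed, build, fuzz for a bounded time, then minimize
# and zip the corpus for one harness. Stops at the first failing step, so a
# fuzzer run that exits with an error leaves the corpus unminimized.
# ASSUMPTION: decodecorpus has already been built.
prepare_corpus() {
    harness=$1
    seconds=${2:-600}
    "$FUZZ_PY" gen "$harness" &&
    "$FUZZ_PY" build "$harness" --enable-fuzzer --enable-asan --enable-ubsan --cc clang --cxx clang++ &&
    "$FUZZ_PY" libfuzzer "$harness" "-max_total_time=$seconds" &&
    "$FUZZ_PY" minimize "$harness" &&
    "$FUZZ_PY" zip "$harness"
}
```

For example, `prepare_corpus your_harness 3600` would fuzz for up to an hour before minimizing and zipping.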

### Upload the zip file to GitHub
The previous step should produce a `.zip` file containing the corpus for your new harness.
This corpus must be uploaded to GitHub here: https://github.com/facebook/zstd/releases/tag/fuzz-corpora