# Getting started with libFuzzer in Chromium

Our current best advice on how to start fuzzing is to use FuzzTest, which
has its own [getting started guide here]. If you're reading this page, it's
probably because you've run into limitations of FuzzTest and want to create
a libFuzzer fuzzer instead. This is a slightly older approach to fuzzing
Chrome, but it still works well - read on.

This document walks you through the basic steps to start fuzzing and suggestions
for improving your fuzz targets. If you're looking for more advanced fuzzing
topics, see the [main page](README.md).

[TOC]

## Getting started

### Simple Example

Before writing any code let us look at a simple
example of a test that uses input fuzzing. The test is set up to exercise the
[`CreateFnmatchQuery`](https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/ash/extensions/file_manager/search_by_pattern.h;drc=4bc4bcef0ab5581a5a27cea986296739582243a6)
function. The role of this function is to take a user query and produce
a case-insensitive pattern that matches file names containing the
query in them. For example, for a query "1abc" the function generates
"\*1[aA][bB][cC]\*". Unlike a traditional test, an input fuzzing test does not
care about the output of the tested function. Instead it verifies that no
matter what string the user enters `CreateFnmatchQuery` does not do something
unexpected, such as crashing or overwriting a memory region.
The test
[create_fnmatch_query_fuzzer.cc](https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/ash/extensions/file_manager/create_fnmatch_query_fuzzer.cc;drc=1f5a5af3eb1bbdf9e4566c3e6d2051e68de112eb)
is shown below:

```cpp
#include <stddef.h>
#include <stdint.h>

#include <string>

#include "chrome/browser/ash/extensions/file_manager/search_by_pattern.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  std::string str = std::string(reinterpret_cast<const char*>(data), size);
  extensions::CreateFnmatchQuery(str);
  return 0;
}
```

The code starts by including `stddef.h` for the `size_t` definition, `stdint.h`
for the `uint8_t` definition, `string` for the `std::string` definition and
finally the file where the `extensions::CreateFnmatchQuery` function is defined.
Next it declares and defines the `LLVMFuzzerTestOneInput` function, which is
the function called by the testing framework. The function is supplied with two
arguments, a pointer to an array of bytes, and the size of the array. These
bytes are generated by the fuzzing test harness and their specific values
are irrelevant. The job of the test is to convert those bytes to input
parameters of the tested function. In our case bytes are converted
to a `std::string` and given to the `CreateFnmatchQuery` function. If
the function completes its job and the code successfully returns, the
`LLVMFuzzerTestOneInput` function returns 0, signaling a successful execution.

The above pattern is typical of fuzzing tests. You create a
`LLVMFuzzerTestOneInput` function. You then write code that uses the provided
random bytes to form input parameters to the function you intend to test. Next,
you call the function, and if it successfully completes, return 0.

To run this test we need to create a `fuzzer_test` target in the appropriate
`BUILD.gn` file.
For the above example, the target is defined as:

```python
fuzzer_test("create_fnmatch_query_fuzzer") {
  sources = [ "extensions/file_manager/create_fnmatch_query_fuzzer.cc" ]
  deps = [
    ":ash",
    "//base",
    "//chrome/browser",
    "//components/exo/wayland:ui_controls_protocol",
  ]
}
```

The `sources` field typically specifies just the file that contains the test.
The dependencies are specific to the tested function. Here we are listing them
for completeness. In your test all but the `//base` dependency are unlikely to
be required.

### Creating your first fuzz target

Having seen a concrete example, let us describe the generic flow of steps to
create a new fuzzing test.

1. In the same directory as the code you are going to fuzz (or next to the tests
   for that code), create a new `<my_fuzzer>.cc` file.

   *** note
   **Note:** Do not use the `testing/libfuzzer/fuzzers` directory. This
   directory was used for initial sample fuzz targets but is no longer
   recommended for landing new targets.
   ***

2. In the new file, define a `LLVMFuzzerTestOneInput` function:

   ```cpp
   #include <stddef.h>
   #include <stdint.h>

   extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
     // Put your fuzzing code here and use |data| and |size| as input.
     return 0;
   }
   ```

3. In the `BUILD.gn` file, define a `fuzzer_test` GN target:

   ```python
   import("//testing/libfuzzer/fuzzer_test.gni")
   fuzzer_test("my_fuzzer") {
     sources = [ "my_fuzzer.cc" ]
     deps = [ ... ]
   }
   ```

*** note
**Note:** Most of the targets are small. They may perform one or a few API calls
using the data provided by the fuzzing engine as an argument. However, fuzz
targets may be more complex if a certain initialization procedure needs to be
performed. [quic_session_pool_fuzzer.cc] is a good example of a complex fuzz
target.
***

Once you have created your first fuzz target, you must set up your build
environment in order to run it. This is described next.

### Setting up your build environment

Generate build files by using the `use_libfuzzer` [GN] argument together with a
sanitizer. Rather than generating a GN build configuration by hand, we recommend
that you run the meta-builder tool using the [GN config] that corresponds to the
operating system of the DUT you're deploying to:

```bash
# AddressSanitizer is the default config we recommend testing with.
# Linux:
tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Linux ASan' out/libfuzzer
# Chrome OS:
tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Chrome OS ASan' out/libfuzzer
# Mac:
tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Mac ASan' out/libfuzzer
# Windows:
python tools\mb\mb.py gen -m chromium.fuzz -b "Libfuzzer Upload Windows ASan" out\libfuzzer
```

If you are testing things locally, these are the recommended configurations:

```bash
# AddressSanitizer is the default config we recommend testing with.
# Linux:
tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Local Linux ASan' out/libfuzzer
# Chrome OS:
tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Local Chrome OS ASan' out/libfuzzer
# Mac:
tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Local Mac ASan' out/libfuzzer
# Windows:
python tools\mb\mb.py gen -m chromium.fuzz -b "Libfuzzer Local Windows ASan" out\libfuzzer
```

[`tools/mb/mb.py`](https://source.chromium.org/chromium/chromium/src/+/main:tools/mb/mb.py;drc=c771c017eca9a6a859d245be54c511acafdc9867)
is "a wrapper script for GN that [..] generate[s] build files for sets of
canned configurations." The `-m` flag selects the builder group, while the
`-b` flag selects a specific builder in the builder group. The final argument,
`out/libfuzzer`, is the directory to which the GN configuration is written.
If you wish, you can
inspect the generated config by running `gn args out/libfuzzer`, once the
`mb.py` script is done.

*** note
**Note:** The above invocations may set `use_remoteexec` or `use_rbe` to true.
However, these args aren't compatible on local workstations yet. So if you run
into reclient errors when building locally, remove both those args and set
`use_goma` instead.

You can also invoke [AFL] by using the `use_afl` GN argument, but we
recommend libFuzzer for local development. Running libFuzzer locally doesn't
require any special configuration and gives quick, meaningful output for speed,
coverage, and other parameters.
***

It’s possible to run fuzz targets without sanitizers, but not recommended, as
sanitizers help to detect errors which may not result in a crash otherwise.
`use_libfuzzer` is supported in the following sanitizer configurations.

| GN Argument | Description | Supported OS |
|-------------|-------------|--------------|
| `is_asan=true` | Enables [AddressSanitizer] to catch problems like buffer overruns. | Linux, Windows, Mac, Chrome OS |
| `is_msan=true` | Enables [MemorySanitizer] to catch problems like uninitialized reads<sup>\[[\*](reference.md#MSan)\]</sup>. | Linux |
| `is_ubsan_security=true` | Enables [UndefinedBehaviorSanitizer] to catch<sup>\[[\*](reference.md#UBSan)\]</sup> undefined behavior like integer overflow. | Linux |

For more on builder and sanitizer configurations, see the [Integration
Reference] page.

*** note
**Hint**: Fuzz targets are built with minimal symbols by default. You can adjust
the symbol level by setting the `symbol_level` attribute.
***

### Running the fuzz target

After you create your fuzz target, build it with autoninja and run it locally.
To make this example concrete, we are going to use the existing
`create_fnmatch_query_fuzzer` target.

```bash
# Build the fuzz target.
autoninja -C out/libfuzzer chrome/browser/ash:create_fnmatch_query_fuzzer
# Run the fuzz target.
./out/libfuzzer/create_fnmatch_query_fuzzer
```

Your fuzz target should produce output like this:

```
INFO: Seed: 1511722356
INFO: Loaded 2 modules (115485 guards): 22572 [0x7fe8acddf560, 0x7fe8acdf5610), 92913 [0xaa05d0, 0xafb194),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 961 ft: 48 corp: 1/1b exec/s: 0 rss: 48Mb
#3 NEW cov: 986 ft: 70 corp: 2/104b exec/s: 0 rss: 48Mb L: 103/103 MS: 1 InsertRepeatedBytes-
#4 NEW cov: 989 ft: 74 corp: 3/106b exec/s: 0 rss: 48Mb L: 2/103 MS: 1 InsertByte-
#6 NEW cov: 991 ft: 76 corp: 4/184b exec/s: 0 rss: 48Mb L: 78/103 MS: 2 CopyPart-InsertRepeatedBytes-
```

A `... NEW ...` line appears when libFuzzer finds new and interesting inputs. If
your fuzz target is efficient, it will find a lot of them quickly. A
`... pulse ...` line appears periodically to show the current status.

For more information about the output, see [libFuzzer's output documentation].

*** note
**Note:** If you observe an `odr-violation` error in the log, please try setting
the following environment variable: `ASAN_OPTIONS=detect_odr_violation=0` and
running the fuzz target again.
***

#### Symbolizing a stacktrace

If your fuzz target crashes when running locally and you see a non-symbolized
stack trace, make sure you add the `third_party/llvm-build/Release+Asserts/bin/`
directory from Chromium’s Clang package to `$PATH`. This directory contains the
`llvm-symbolizer` binary.
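
As a minimal sketch of the `$PATH` approach, assuming your shell's current
directory is the root of the Chromium checkout (`src/`), you could prepend the
directory and check that the symbolizer resolves. The `SYMBOLIZER_DIR` variable
name is only illustrative and is not something the build uses:

```bash
# Assumption: this is run from the root of the Chromium checkout (src/).
SYMBOLIZER_DIR="$PWD/third_party/llvm-build/Release+Asserts/bin"
export PATH="$SYMBOLIZER_DIR:$PATH"

# Report which llvm-symbolizer, if any, the shell now resolves.
command -v llvm-symbolizer || echo "llvm-symbolizer not found in $SYMBOLIZER_DIR"
```

The change only lasts for the current shell session, so it has to be repeated
(or added to your shell profile) before running the fuzz target in a new one.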

Alternatively, you can set an `external_symbolizer_path` via the `ASAN_OPTIONS`
environment variable:

```bash
ASAN_OPTIONS=external_symbolizer_path=/my/local/llvm/build/llvm-symbolizer \
  ./fuzzer ./crash-input
```

The same approach works with other sanitizers via `MSAN_OPTIONS`,
`UBSAN_OPTIONS`, etc.

### Submitting your fuzz target

ClusterFuzz and the build infrastructure automatically discover, build and
execute all `fuzzer_test` targets in the Chromium repository. Once you land your
fuzz target, ClusterFuzz will run it at scale. Check the [ClusterFuzz status]
page after a day or two.

If you want to better understand and optimize your fuzz target’s performance,
see the [Efficient Fuzzing Guide].

*** note
**Note:** It’s important to run fuzzers at scale, not just in your own
environment, because local fuzzing will catch fewer issues. If you run fuzz
targets at scale continuously, you’ll catch regressions and improve code
coverage over time.
***

## Optional improvements

### Common tricks

Your fuzz target may immediately discover interesting (i.e. crashing) inputs.
You can make it more effective with several easy steps:

* **Create a seed corpus**. You can guide the fuzzing engine to generate more
  relevant inputs by adding the `seed_corpus = "src/fuzz-testcases/"` attribute
  to your fuzz target and adding example files to the appropriate directory. For
  more, see the [Seed Corpus] section of the [Efficient Fuzzing Guide].

  *** note
  **Note:** make sure your corpus files are appropriately licensed.
  ***

* **Create a mutation dictionary**. You can make mutations more effective by
  providing the fuzzer with a `dict = "protocol.dict"` GN attribute and a
  dictionary file that contains interesting strings / byte sequences for the
  target API.
  For more, see the [Fuzzer Dictionary] section of the [Efficient
  Fuzzing Guide].

* **Specify testcase length limits**. Long inputs can be problematic, because
  they are more slowly processed by the fuzz target and increase the search
  space. By default, libFuzzer uses `-max_len=4096` or takes the longest
  testcase in the corpus if `-max_len` is not specified.

  ClusterFuzz uses different strategies for different fuzzing sessions,
  including different random values. Also, ClusterFuzz uses different fuzzing
  engines (e.g. AFL, which doesn't have a `-max_len` option). If your target has
  an input length limit that you would like to *strictly enforce*, add a sanity
  check to the beginning of your `LLVMFuzzerTestOneInput` function:

  ```cpp
  if (size < kMinInputLength || size > kMaxInputLength)
    return 0;
  ```

* **Generate a [code coverage report]**. See which code the fuzzer covered in
  recent runs, so you can gauge whether it hits the important code parts or not.

  **Note:** Since the code coverage of a fuzz target depends heavily on the
  corpus provided when running the target, we recommend running the fuzz target
  built with ASan locally for a little while (several minutes / hours) first.
  This will produce some corpus, which should be used for generating a code
  coverage report.

#### Disabling noisy error message logging

If the code you’re fuzzing generates a lot of error messages when encountering
incorrect or invalid data, the fuzz target will be slow and inefficient.

If the target uses Chromium logging APIs, you can silence errors by overriding
the environment used for logging in your fuzz target:

```cpp
#include <stddef.h>
#include <stdint.h>

#include "base/logging.h"

struct Environment {
  Environment() {
    // Only log fatal errors, so non-fatal messages don't slow down fuzzing.
    logging::SetMinLogLevel(logging::LOGGING_FATAL);
  }
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Constructed once, on the first call, and reused for every later input.
  static Environment env;

  // Put your fuzzing code here and use data+size as input.
  return 0;
}
```

### Mutating Multiple Inputs

By default, a fuzzing engine such as libFuzzer mutates a single input (`uint8_t*
data, size_t size`). However, APIs often accept multiple arguments of various
types, rather than a single buffer. You can use three different methods to
mutate multiple inputs at once.

#### libprotobuf-mutator (LPM)

If you need to mutate multiple inputs of various types and length, see [Getting
Started with libprotobuf-mutator in Chromium].

*** note
**Note:** This method works with APIs and data structures of any complexity, but
requires extra effort. You would need to write a `.proto` definition (unless you
fuzz an existing protobuf) and C++ code to pass the proto message to the API you
are fuzzing (you'll have a fuzzed protobuf message instead of a `data, size`
buffer).
***

#### FuzzedDataProvider (FDP)

[FuzzedDataProvider] is a class useful for splitting a fuzz input into multiple
parts of various types.

*** note
**Note:** FDP is much easier to use than LPM, but its downside is that the
format of the corpus becomes inconsistent. This doesn't matter if you don't have
a [Seed Corpus] (e.g. valid image files if you fuzz an image parser). FDP splits
your corpus files into several pieces to fuzz a broader range of input types, so
it can take longer to reach deeper code paths that surface more quickly if you
fuzz only a single input type.
***

To use FDP, add `#include <fuzzer/FuzzedDataProvider.h>` to your fuzz target
source file.

To learn more about `FuzzedDataProvider`, check out the [upstream documentation]
on it. It gives an overview of the available methods and links to a few example
fuzz targets.

#### Hash-based argument

If your API accepts a buffer with data and some integer value (i.e., a bitwise
combination of flags), you can calculate a hash value from (`data, size`) and
use it to fuzz an additional integer argument. For example:

```cpp
#include <stddef.h>
#include <stdint.h>

#include <functional>
#include <string>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  std::string str = std::string(reinterpret_cast<const char*>(data), size);
  std::size_t data_hash = std::hash<std::string>()(str);
  APIToBeFuzzed(data, size, data_hash);
  return 0;
}
```

*** note
**Note:** The hash method doesn't have the corpus format issue mentioned in the
FDP section above, but it can lead to results that aren't as sophisticated as
LPM or FDP. The hash value derived from the data is a random value, rather than
a meaningful one controlled by the fuzzing engine. A single bit mutation might
lead to new code coverage, but the next mutation would generate a new hash
value and trigger another code path, without providing any real guidance to the
fuzzing engine.
***

[AFL]: AFL_integration.md
[AddressSanitizer]: http://clang.llvm.org/docs/AddressSanitizer.html
[ClusterFuzz status]: libFuzzer_integration.md#Status-Links
[Efficient Fuzzing Guide]: efficient_fuzzing.md
[FuzzedDataProvider]: https://cs.chromium.org/chromium/src/third_party/re2/src/re2/fuzzing/compiler-rt/include/fuzzer/FuzzedDataProvider.h
[Fuzzer Dictionary]: efficient_fuzzing.md#Fuzzer-dictionary
[GN]: https://gn.googlesource.com/gn/+/master/README.md
[GN config]: https://cs.chromium.org/chromium/src/tools/mb/mb_config_expectations/chromium.fuzz.json
[Getting Started with libprotobuf-mutator in Chromium]: libprotobuf-mutator.md
[Integration Reference]: reference.md
[MemorySanitizer]: http://clang.llvm.org/docs/MemorySanitizer.html
[Seed Corpus]: efficient_fuzzing.md#Seed-corpus
[UndefinedBehaviorSanitizer]: http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
[code coverage report]: efficient_fuzzing.md#Code-coverage
[upstream documentation]: https://github.com/google/fuzzing/blob/master/docs/split-inputs.md#fuzzed-data-provider
[libFuzzer's output documentation]: http://llvm.org/docs/LibFuzzer.html#output
[quic_session_pool_fuzzer.cc]: https://cs.chromium.org/chromium/src/net/quic/quic_session_pool_fuzzer.cc
[getting started guide here]: getting_started.md