| Name | Date | Size | #Lines | LOC |
|------|------|------|--------|-----|
| Makefile | 25-Apr-2025 | 103 | 3 | 2 |
| README.cmplog.md | 25-Apr-2025 | 1.1 KiB | 45 | 32 |
| README.gcc_plugin.md | 25-Apr-2025 | 4.1 KiB | 110 | 77 |
| README.injections.md | 25-Apr-2025 | 1.6 KiB | 49 | 35 |
| README.instrument_list.md | 25-Apr-2025 | 4.4 KiB | 131 | 93 |
| README.laf-intel.md | 25-Apr-2025 | 1.9 KiB | 50 | 34 |
| README.llvm.md | 25-Apr-2025 | 12.2 KiB | 307 | 215 |
| README.lto.md | 25-Apr-2025 | 11.2 KiB | 326 | 236 |
| README.persistent_mode.md | 25-Apr-2025 | 6.5 KiB | 198 | 140 |
| SanitizerCoverageLTO.so.cc | 25-Apr-2025 | 56.9 KiB | 1,820 | 1,171 |
| SanitizerCoveragePCGUARD.so.cc | 25-Apr-2025 | 41.8 KiB | 1,301 | 856 |
| afl-compiler-rt.o.c | 25-Apr-2025 | 65.4 KiB | 2,957 | 1,573 |
| afl-gcc-cmplog-pass.so.cc | 25-Apr-2025 | 10.1 KiB | 405 | 244 |
| afl-gcc-cmptrs-pass.so.cc | 25-Apr-2025 | 11.2 KiB | 370 | 193 |
| afl-gcc-common.h | 25-Apr-2025 | 13.4 KiB | 509 | 290 |
| afl-llvm-common.cc | 25-Apr-2025 | 15.3 KiB | 607 | 357 |
| afl-llvm-common.h | 25-Apr-2025 | 1.6 KiB | 70 | 54 |
| afl-llvm-dict2file.so.cc | 25-Apr-2025 | 22.5 KiB | 774 | 472 |
| afl-llvm-lto-instrumentlist.so.cc | 25-Apr-2025 | 4.4 KiB | 176 | 98 |
| afl-llvm-pass.so.cc | 25-Apr-2025 | 31.2 KiB | 1,106 | 640 |
| afl-llvm-rt-lto.o.c | 25-Apr-2025 | 682 | 28 | 7 |
| cmplog-instructions-pass.cc | 25-Apr-2025 | 17.7 KiB | 716 | 461 |
| cmplog-routines-pass.cc | 25-Apr-2025 | 24.2 KiB | 797 | 580 |
| cmplog-switches-pass.cc | 25-Apr-2025 | 11 KiB | 481 | 335 |
| compare-transform-pass.so.cc | 25-Apr-2025 | 24.6 KiB | 798 | 529 |
| injection-pass.cc | 25-Apr-2025 | 8.7 KiB | 367 | 247 |
| llvm-alternative-coverage.h | 25-Apr-2025 | 345 | 22 | 13 |
| split-compares-pass.so.cc | 25-Apr-2025 | 57.6 KiB | 1,889 | 1,224 |
| split-switches-pass.so.cc | 25-Apr-2025 | 15.3 KiB | 562 | 331 |

README.cmplog.md

# CmpLog instrumentation

The CmpLog instrumentation logs comparison operands to shared memory.

These values can be used by various mutators built on top of it. At the moment,
we support the Redqueen mutator (input-to-state instructions only); for details,
see [the Redqueen paper](https://github.com/RUB-SysSec/redqueen).
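
To make the idea concrete, here is a minimal sketch of the kind of comparison that CmpLog helps with. The function name `check_magic` and the constant are hypothetical and chosen only for illustration. Blind mutation has to guess all four bytes at once, whereas a mutator that sees the logged operands can copy the expected value into the input.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target code: returns 1 if the first four bytes of the input
   equal a magic constant. With CmpLog, both operands of the comparison at the
   end are logged, so a mutator can learn the value the input is missing. */
int check_magic(const uint8_t *buf, size_t len) {
  uint32_t value;

  if (len < sizeof(value)) return 0;
  memcpy(&value, buf, sizeof(value)); /* avoids an unaligned read */

  return value == 0xdeadbeefu;
}
```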
## Build

To use CmpLog, you have to build two versions of the instrumented target
program:

* The first version is built using the regular AFL++ instrumentation.
* The second one, the CmpLog binary, is built with `AFL_LLVM_CMPLOG` set
  during compilation.

For example:

```
./configure --cc=~/path/to/afl-clang-fast
make
cp ./program ./program.afl
make clean
export AFL_LLVM_CMPLOG=1
./configure --cc=~/path/to/afl-clang-fast
make
cp ./program ./program.cmplog
unset AFL_LLVM_CMPLOG
```

## Use

Use the `-c` option of afl-fuzz to specify the CmpLog binary (the second
build).

For example:

```
afl-fuzz -i input -o output -c ./program.cmplog -m none -- ./program.afl @@
```

Be careful with the usage of `-m` because CmpLog can map a lot of pages.

README.gcc_plugin.md

# GCC-based instrumentation for afl-fuzz

For the general instruction manual, see [docs/README.md](../docs/README.md).

For the LLVM-based instrumentation, see [README.llvm.md](README.llvm.md).

This document describes how to build and use `afl-gcc-fast` and `afl-g++-fast`,
which instrument the target with the help of gcc plugins.

TL;DR:

* Check the version of your gcc compiler: `gcc --version`
* `apt-get install gcc-VERSION-plugin-dev` or similar to install headers for gcc
  plugins.
* `gcc` and `g++` must match the gcc-VERSION you installed headers for. You can
  set `AFL_CC`/`AFL_CXX` to point to these!
* `make`
* Just use `afl-gcc-fast`/`afl-g++-fast` normally like you would do with
  `afl-clang-fast`.
## 1) Introduction

The code in this directory allows you to instrument programs for AFL++ using
true compiler-level instrumentation, instead of the more crude assembly-level
rewriting approach taken by afl-gcc and afl-clang. This has several interesting
properties:

- The compiler can make many optimizations that are hard to pull off when
  manually inserting assembly. As a result, some slow, CPU-bound programs will
  run up to around 2x faster.

  The gains are less pronounced for fast binaries, where the speed is limited
  chiefly by the cost of creating new processes. In such cases, the gain will
  probably stay within 10%.

- The instrumentation is CPU-independent. At least in principle, you should be
  able to rely on it to fuzz programs on non-x86 architectures (after building
  `afl-fuzz` with `AFL_NO_X86=1`).

- Because the feature relies on the internals of GCC, it is gcc-specific and
  will *not* work with LLVM (see [README.llvm.md](README.llvm.md) for an
  alternative).

Once this implementation is shown to be sufficiently robust and portable, it
will probably replace afl-gcc. For now, it can be built separately and co-exists
with the original code.

The idea and much of the implementation come from Laszlo Szekeres.
## 2) How to use

In order to leverage this mechanism, you need a sufficiently modern GCC
(version 4.5.0 or newer) and the plugin development headers installed on your
system. That should be all you need. On Debian machines, these headers can be
acquired by installing the `gcc-VERSION-plugin-dev` packages.

To build the instrumentation itself, type `make`. This will generate binaries
called `afl-gcc-fast` and `afl-g++-fast` in the parent directory.

The gcc and g++ compiler links have to point to gcc-VERSION - or set these by
pointing the environment variables `AFL_CC`/`AFL_CXX` to them. If the `CC`/`CXX`
environment variables have been set, those compilers will be preferred over
those from the `AFL_CC`/`AFL_CXX` settings.

Once this is done, you can instrument third-party code in a way similar to the
standard operating mode of AFL++, e.g.:

```
  CC=/path/to/afl/afl-gcc-fast
  CXX=/path/to/afl/afl-g++-fast
  export CC CXX
  ./configure [...options...]
  make
```

Note: We also used `CXX` to set the C++ compiler to `afl-g++-fast` for C++ code.

The tool honors roughly the same environment variables as `afl-gcc` (see
[docs/env_variables.md](../docs/env_variables.md)). This includes
`AFL_INST_RATIO`, `AFL_USE_ASAN`, `AFL_HARDEN`, and `AFL_DONT_OPTIMIZE`.

Note: if you want the GCC plugin to be installed on your system for all users,
you need to build it before issuing `make install` in the parent directory.
## 3) Gotchas, feedback, bugs

This is an early-stage mechanism, so field reports are welcome. You can send bug
reports to [email protected].

## 4) Bonus feature #1: deferred initialization

See
[README.persistent_mode.md#3) Deferred initialization](README.persistent_mode.md#3-deferred-initialization).

## 5) Bonus feature #2: persistent mode

See
[README.persistent_mode.md#4) Persistent mode](README.persistent_mode.md#4-persistent-mode).

## 6) Bonus feature #3: selective instrumentation

Fuzzing can be more effective when only parts of the code are instrumented. For
details, see [README.instrument_list.md](README.instrument_list.md).

## 7) Bonus feature #4: CMPLOG

The gcc_plugin also supports CMPLOG/Redqueen; just set `AFL_GCC_CMPLOG` before
instrumenting the target.
Read more about this in the LLVM documentation.

README.injections.md

# Injection fuzzing

So far, coverage-guided fuzzing can only detect crashes, which usually means
memory corruption issues or - if implemented by hand in the harness - violated
invariants.

This is a proof-of-concept implementation to additionally hunt for injection
vulnerabilities.
It works by instrumenting calls to specific functions and parsing the
query parameter for a specific unescaped dictionary string. If the string is
detected, the target is crashed.

This has a very low false positive rate.
But obviously this can only find injection vulnerabilities that are susceptible
to this specific (but most common) issue. Hence, for a rare kind of injection
vulnerability, this won't find the bug and produces a false negative.
But this can be tweaked by the user - see the "How to modify" section below.
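
The check described above can be pictured with the following minimal sketch. It is not the code from `afl-compiler-rt.o.c`: the token value and the function names `query_has_injection` and `check_query` are hypothetical and only illustrate the idea of crashing when an unescaped marker reaches a query.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker token. The real dictionary strings are defined in
   src/afl-fuzz.c and are not reproduced here. */
#define INJECTION_TOKEN "'INJ\"TOKEN"

/* Returns 1 if the query contains the token with its quotes unescaped. */
int query_has_injection(const char *query) {
  return query != NULL && strstr(query, INJECTION_TOKEN) != NULL;
}

/* What an instrumented call site conceptually does before the real call. */
void check_query(const char *query) {
  if (query_has_injection(query)) abort(); /* crash so the fuzzer records it */
}
```

If the target escapes the quote characters before building the query, the token no longer appears verbatim and no crash is triggered.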
## How to use

Set one or more of the following environment variables for **compiling**
the target and - *this is important* - when **fuzzing** the target:

 - `AFL_LLVM_INJECTIONS_SQL`
 - `AFL_LLVM_INJECTIONS_LDAP`
 - `AFL_LLVM_INJECTIONS_XSS`

Alternatively you can set `AFL_LLVM_INJECTIONS_ALL` to enable all.

## How to modify

If you want to add more functions to check for e.g. SQL injections:
Add these to `instrumentation/injection-pass.cc` and recompile.

If you want to test for more injection inputs:
Add the dictionary tokens to `src/afl-fuzz.c` and the check for them to
`instrumentation/afl-compiler-rt.o.c`.

If you want to add new injection targets:
You will have to edit all three files.

Just search for:
```
// Marker: ADD_TO_INJECTIONS
```
in the files to see where this needs to be added.

**NOTE:** pull requests to improve this feature are highly welcome :-)

README.instrument_list.md

# Using AFL++ with partial instrumentation

This file describes two different mechanisms to selectively instrument only
specific parts of the target.

Both mechanisms work for LLVM and GCC_PLUGIN, but not for afl-clang/afl-gcc.

## 1) Description and purpose

When building and testing complex programs where only a part of the program is
the fuzzing target, it often helps to only instrument the necessary parts of the
program, leaving the rest uninstrumented. This helps to focus the fuzzer on the
important parts of the program, avoiding undesired noise and disturbance by
uninteresting code being exercised.

For this purpose, AFL++ provides "partial instrumentation" support that lets
you specify what should be instrumented and what should not.

Both mechanisms for partial instrumentation can be used together.
## 2) Selective instrumentation with __AFL_COVERAGE_... directives

In this mechanism, the selective instrumentation is done in the source code.

After the includes, a special define has to be made, e.g.:

```
#include <stdio.h>
#include <stdint.h>
// ...

__AFL_COVERAGE();  // <- required for this feature to work
```

If you want coverage to be disabled at startup until you explicitly enable it,
add `__AFL_COVERAGE_START_OFF();` at that position.

From here on out, you have the following macros available that you can use in
any function where you want:

* `__AFL_COVERAGE_ON();` - Enable coverage from this point onwards.
* `__AFL_COVERAGE_OFF();` - Disable coverage from this point onwards.
* `__AFL_COVERAGE_DISCARD();` - Reset all coverage gathered until this point.
* `__AFL_COVERAGE_SKIP();` - Mark this test case as unimportant. Whatever
  happens, afl-fuzz will ignore it.

A special function is `__afl_coverage_interesting`. To use this, you must
declare `void __afl_coverage_interesting(u8 val, u32 id);`. Then you can use
this function globally, where the `val` parameter can be set by you, and the
`id` parameter is for afl-fuzz and will be overwritten. Note that useful
parameters for `val` are: 1, 2, 3, 4, 8, 16, 32, 64, 128. A value of, e.g., 33
will be seen as 32 for coverage purposes.
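
As a minimal sketch of how the `__AFL_COVERAGE_...` macros listed above might be combined, consider the following file. The functions `checksum` and `parse_packet` are hypothetical. The `#ifndef` fallbacks are an assumption that lets the file build with a plain compiler as well; with the AFL++ compiler wrappers, the macros are expected to be provided already.

```c
#include <stddef.h>
#include <stdint.h>

/* Fallbacks so that this sketch also builds with a plain compiler.
   Assumption: the AFL++ compiler wrappers define these macros themselves. */
#ifndef __AFL_COVERAGE_ON
#define __AFL_COVERAGE() typedef int afl_coverage_stub
#define __AFL_COVERAGE_START_OFF() typedef int afl_coverage_start_off_stub
#define __AFL_COVERAGE_ON() ((void)0)
#define __AFL_COVERAGE_OFF() ((void)0)
#define __AFL_COVERAGE_SKIP() ((void)0)
#endif

__AFL_COVERAGE();           /* required for this feature to work */
__AFL_COVERAGE_START_OFF(); /* start with coverage disabled */

/* Hypothetical helper: setup work that should not produce coverage. */
static uint32_t checksum(const uint8_t *buf, size_t len) {
  uint32_t sum = 0;
  for (size_t i = 0; i < len; i++) sum += buf[i];
  return sum;
}

/* Hypothetical fuzzing target. */
int parse_packet(const uint8_t *buf, size_t len) {
  if (len < 4) {
    __AFL_COVERAGE_SKIP(); /* too short to be interesting */
    return -1;
  }

  uint32_t sum = checksum(buf, len); /* runs while coverage is still off */

  __AFL_COVERAGE_ON(); /* only the decision below is tracked */
  int result = (buf[0] == 'P' && (sum & 1u)) ? 1 : 0;
  __AFL_COVERAGE_OFF();

  return result;
}
```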
## 3) Selective instrumentation with AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST

This feature is equivalent to the LLVM 12 sancov feature and lets you specify,
on a filename and/or function name level, what to instrument and what to skip.

### 3a) How to use the partial instrumentation mode

In order to build with partial instrumentation, you need to build with
afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++. The only
required change is that you need to set either the environment variable
`AFL_LLVM_ALLOWLIST` or `AFL_LLVM_DENYLIST` to a filename.

That file should contain the file names or functions that are to be instrumented
(`AFL_LLVM_ALLOWLIST`) or are specifically NOT to be instrumented
(`AFL_LLVM_DENYLIST`).

GCC_PLUGIN: you can use either `AFL_LLVM_ALLOWLIST` or `AFL_GCC_ALLOWLIST` (or
the same for `_DENYLIST`), both work.

For matching to succeed, the function/file name that is being compiled must end
in the function/file name entry contained in this instrument file list. That is
to avoid breaking the match when absolute paths are used during compilation.

**NOTE:** In builds with optimization enabled, functions might be inlined and
would not match!
For example, if your source tree looks like this:

```
project/
project/feature_a/a1.cpp
project/feature_a/a2.cpp
project/feature_b/b1.cpp
project/feature_b/b2.cpp
```

And you only want to test feature_a, then create an "instrument file list" file
containing:

```
feature_a/a1.cpp
feature_a/a2.cpp
```

However, if the "instrument file list" file contains only this, it works as
well:

```
a1.cpp
a2.cpp
```

But it might lead to unwanted files being instrumented if the same filename
exists somewhere else in the project directories.

You can also specify function names. Note that for C++ the function names must
be mangled to match! `nm` can print these names.

AFL++ is able to identify whether an entry is a filename or a function. However,
if you want to be sure (and compliant with the sancov allow/blocklist format),
you can specify source file entries like this:

```
src: *malloc.c
```

And function entries like this:

```
fun: MallocFoo
```

Note that whitespace is ignored and comments (`# foo`) are supported.
### 3b) UNIX-style pattern matching

You can add UNIX-style pattern matching in the "instrument file list" entries.
See `man fnmatch` for the syntax. Do not set any of the `fnmatch` flags.
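
The "must end in the entry" rule together with `fnmatch` patterns can be sketched as follows. This is an illustration under assumptions, not necessarily how AFL++ implements the check: the helper name `entry_matches` is hypothetical, and the rule is modeled by prefixing each entry with `*` and calling POSIX `fnmatch` with no flags.

```c
#include <fnmatch.h>
#include <stdio.h>

/* Returns 1 if `name` (a file or function name seen during compilation) ends
   in `entry` (one line of the instrument file list), else 0. */
int entry_matches(const char *name, const char *entry) {
  char pattern[4096];
  int n = snprintf(pattern, sizeof(pattern), "*%s", entry);

  if (n < 0 || (size_t)n >= sizeof(pattern)) return 0; /* entry too long */

  /* Flags are 0, so '*' also matches '/' and absolute paths still match. */
  return fnmatch(pattern, name, 0) == 0;
}
```

With this model, `a1.cpp` matches `/home/user/project/feature_a/a1.cpp`, but it would also match any other path that ends in `a1.cpp`, which is the over-matching risk mentioned in section 3a.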

README.laf-intel.md

# laf-intel instrumentation

## Introduction

This is originally the work of an individual nicknamed laf-intel. His blog
[Circumventing Fuzzing Roadblocks with Compiler Transformations](https://lafintel.wordpress.com/)
and GitLab repo [laf-llvm-pass](https://gitlab.com/laf-intel/laf-llvm-pass/)
describe some code transformations that help AFL++ to enter conditional blocks
whose conditions consist of comparisons of large values.
## Usage

By default, these passes will not run when you compile programs using
afl-clang-fast. Hence, you can use AFL++ as usual. To enable the passes, you
must set environment variables before you compile the target project.

The following options exist:

`export AFL_LLVM_LAF_SPLIT_SWITCHES=1`

Enables the split-switches pass.

`export AFL_LLVM_LAF_TRANSFORM_COMPARES=1`

Enables the transform-compares pass (strcmp, memcmp, strncmp, strcasecmp,
strncasecmp).

`export AFL_LLVM_LAF_SPLIT_COMPARES=1`

Enables the split-compares pass. By default, it will

1. simplify operators >= (and <=) into chains of > (<) and == comparisons
2. change signed integer comparisons to a chain of sign-only comparison and
   unsigned integer comparisons
3. split all unsigned integer comparisons with bit widths of 64, 32, or 16 bits
   to chains of 8-bit comparisons.

You can change the behavior of the last step by setting `export
AFL_LLVM_LAF_SPLIT_COMPARES_BITW=<bit_width>`, where bit_width may be 64, 32, or
16. For example, a bit_width of 16 would split larger comparisons down to 16-bit
comparisons.

A new unique feature is splitting floating point comparisons into a series
of sign, exponent, and mantissa comparisons, followed by splitting each of them
into 8-bit comparisons when necessary. It is activated with the
`AFL_LLVM_LAF_SPLIT_FLOATS` setting.

Note that setting this automatically activates `AFL_LLVM_LAF_SPLIT_COMPARES`.

You can also set `AFL_LLVM_LAF_ALL` and have all of the above enabled. :-)
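
The third step of the split-compares pass can be pictured at source level with the sketch below. The pass itself works on LLVM IR, so this C code is only an illustration, and the function names `eq32` and `eq32_split` are hypothetical. The split version gives the fuzzer a new branch for every byte it gets right, instead of a single all-or-nothing branch.

```c
#include <stdint.h>

/* One 32-bit comparison: a single branch that is either taken or not. */
int eq32(uint32_t a, uint32_t b) {
  return a == b;
}

/* The same comparison written as a chain of 8-bit comparisons. Each matching
   byte reaches a new branch and therefore produces new coverage. */
int eq32_split(uint32_t a, uint32_t b) {
  if ((uint8_t)(a >> 24) != (uint8_t)(b >> 24)) return 0;
  if ((uint8_t)(a >> 16) != (uint8_t)(b >> 16)) return 0;
  if ((uint8_t)(a >> 8) != (uint8_t)(b >> 8)) return 0;
  return (uint8_t)a == (uint8_t)b;
}
```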

README.llvm.md

# Fast LLVM-based instrumentation for afl-fuzz

For the general instruction manual, see [docs/README.md](../docs/README.md).

For the GCC-based instrumentation, see
[README.gcc_plugin.md](README.gcc_plugin.md).

## 1) Introduction

! llvm_mode works with llvm versions 3.8 up to 17 - but 13+ is recommended !

The code in this directory allows you to instrument programs for AFL++ using
true compiler-level instrumentation, instead of the more crude assembly-level
rewriting approach taken by afl-gcc and afl-clang. This has several interesting
properties:

- The compiler can make many optimizations that are hard to pull off when
  manually inserting assembly. As a result, some slow, CPU-bound programs will
  run up to around 2x faster.

  The gains are less pronounced for fast binaries, where the speed is limited
  chiefly by the cost of creating new processes. In such cases, the gain will
  probably stay within 10%.

- The instrumentation is CPU-independent. At least in principle, you should be
  able to rely on it to fuzz programs on non-x86 architectures (after building
  afl-fuzz with `AFL_NO_X86=1`).

- The instrumentation can cope a bit better with multi-threaded targets.

- Because the feature relies on the internals of LLVM, it is clang-specific and
  will *not* work with GCC (see [README.gcc_plugin.md](README.gcc_plugin.md)
  for an alternative).

Once this implementation is shown to be sufficiently robust and portable, it
will probably replace afl-clang. For now, it can be built separately and
co-exists with the original code.

The idea and much of the initial implementation came from Laszlo Szekeres.
## 2a) How to use this - short

Set the `LLVM_CONFIG` variable to the clang version you want to use, e.g.:

```
LLVM_CONFIG=llvm-config-9 make
```

If you have compiled your own llvm version, specify the full path:

```
LLVM_CONFIG=~/llvm-project/build/bin/llvm-config make
```

If you try to use a new llvm version on an old Linux, this can fail because of
old C++ libraries. In this case, switching to gcc/g++ to compile llvm_mode
usually works:

```
LLVM_CONFIG=llvm-config-7 REAL_CC=gcc REAL_CXX=g++ make
```

It is highly recommended to use the newest clang version you can put your hands
on :)

Then look at [README.persistent_mode.md](README.persistent_mode.md).
## 2b) How to use this - long

In order to leverage this mechanism, you need to have clang installed on your
system. You should also make sure that the llvm-config tool is in your path (or
pointed to via `LLVM_CONFIG` in the environment).

Note that if you have several LLVM versions installed, pointing `LLVM_CONFIG` to
the version you want to use will switch compiling to this specific version - if
your installation is set up correctly :-)

Unfortunately, some systems that do have clang come without llvm-config or the
LLVM development headers; one example of this is FreeBSD. FreeBSD users will
also run into problems with clang being built statically and not being able to
load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so).

To solve all your problems, you can grab pre-built binaries for your OS from:

[https://llvm.org/releases/download.html](https://llvm.org/releases/download.html)

...and then put the bin/ directory from the tarball at the beginning of your
$PATH when compiling the feature and building packages later on. You don't need
to be root for that.

To build the instrumentation itself, type `make`. This will generate binaries
called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this is
done, you can instrument third-party code in a way similar to the standard
operating mode of AFL, e.g.:

```
  CC=/path/to/afl/afl-clang-fast ./configure [...options...]
  make
```

Be sure to also set `CXX` to afl-clang-fast++ for C++ code.

Note that afl-clang-fast/afl-clang-fast++ are just pointers to afl-cc. You can
also use afl-cc/afl-c++ and instead direct it to use LLVM instrumentation by
either setting `AFL_CC_COMPILER=LLVM` or passing the parameter `--afl-llvm` via
CFLAGS/CXXFLAGS/CPPFLAGS.

The tool honors roughly the same environment variables as afl-gcc (see
[docs/env_variables.md](../docs/env_variables.md)). This includes
`AFL_USE_ASAN`, `AFL_HARDEN`, and `AFL_DONT_OPTIMIZE`. However, `AFL_INST_RATIO`
is not honored as it does not serve a good purpose with the more effective
PCGUARD analysis.
## 3) Options

Several options are present to make llvm_mode faster or help it rearrange the
code to make afl-fuzz path discovery easier.

If you only need to instrument specific parts of the code, you can create an
instrument file list that names the C/C++ files to actually instrument. See
[README.instrument_list.md](README.instrument_list.md).

For splitting memcmp, strncmp, etc., see
[README.laf-intel.md](README.laf-intel.md).

Then there are different ways of instrumenting the target:

1. A better instrumentation strategy uses LTO and link time instrumentation.
   Note that not all targets can compile in this mode; however, if it works, it
   is the best option you can use. To go with this option, use
   afl-clang-lto/afl-clang-lto++. See [README.lto.md](README.lto.md).

2. Alternatively you can choose a completely different coverage method:

2a. N-GRAM coverage - which combines the previously visited edges with the
    current one. This explodes the map but on the other hand has proven to be
    effective for fuzzing. See
    [7) AFL++ N-Gram Branch Coverage](#7-afl-n-gram-branch-coverage).

2b. Context sensitive coverage - which combines the visited edges with an
    individual caller ID (the function that called the current one). See
    [6) AFL++ Context Sensitive Branch Coverage](#6-afl-context-sensitive-branch-coverage).

Then - in addition to one of the instrumentation options above - there is a
very effective new instrumentation option called CmpLog as an alternative to
laf-intel that allows AFL++ to apply mutations similar to Redqueen. See
[README.cmplog.md](README.cmplog.md).

Finally, if your llvm version is 8 or lower, you can activate a mode that
prevents a counter overflow from resulting in a 0 value. This is good for path
discovery, but the llvm implementation for x86 for this functionality is not
optimal and was only fixed in llvm 9. You can set this with
`AFL_LLVM_NOT_ZERO=1`.

Support for thread safe counters has been added for all modes. Activate it with
`AFL_LLVM_THREADSAFE_INST=1`. The tradeoff is better precision in multi-threaded
apps for a slightly higher instrumentation overhead. This also disables the
nozero counter default for performance reasons.
## 4) Deferred initialization, persistent mode, shared memory fuzzing

This is the most powerful and effective fuzzing you can do. For a full
explanation, see [README.persistent_mode.md](README.persistent_mode.md).

## 5) Bonus feature: 'dict2file' pass

Just specify `AFL_LLVM_DICT2FILE=/absolute/path/file.txt` and during compilation
all constant string compare parameters will be written to this file to be used
with the `-x` option of afl-fuzz.

Adding `AFL_LLVM_DICT2FILE_NO_MAIN=1` will skip parsing `main()`, which often
does command line parsing with string comparisons that are not helpful for
fuzzing.
## 6) AFL++ Context Sensitive Branch Coverage

### What is this?

This is an LLVM-based implementation of the context sensitive branch coverage.

Basically, every function gets its own ID and, every time an edge is logged,
all the IDs in the callstack are hashed and combined with the edge transition
hash to augment the classic edge coverage with the information about the calling
context.

So if both function A and function B call a function C, the coverage collected
in C will be different.

In math, the coverage is collected as follows: `map[current_location_ID ^
previous_location_ID >> 1 ^ hash_callstack_IDs] += 1`

The callstack hash is produced by XOR-ing the function IDs to avoid explosion
with recursive functions.
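
The formula above can be sketched in plain C as follows. This is an illustration and not the code emitted by the LLVM pass: the names `enter_function`, `leave_function`, and `log_edge`, the 64 KiB map size, and the function IDs used by a caller are assumptions. Note that in C, `>>` binds tighter than `^`, so the expression reads as `cur ^ (prev >> 1) ^ hash`.

```c
#include <stdint.h>

#define CTX_MAP_SIZE (1u << 16) /* assumed map size for this sketch */

static uint8_t  ctx_map[CTX_MAP_SIZE];
static uint32_t prev_loc; /* previous_location_ID */
static uint32_t ctx_hash; /* hash_callstack_IDs */

/* XOR is its own inverse: entering and leaving a function cancel out, and a
   recursive call chain only toggles the hash between two values. */
void enter_function(uint32_t function_id) { ctx_hash ^= function_id; }
void leave_function(uint32_t function_id) { ctx_hash ^= function_id; }

/* Logs one edge and returns the map index that was incremented. */
uint32_t log_edge(uint32_t cur_loc) {
  uint32_t idx = (cur_loc ^ (prev_loc >> 1) ^ ctx_hash) % CTX_MAP_SIZE;

  ctx_map[idx]++;
  prev_loc = cur_loc;
  return idx;
}
```

Logging the same edge inside function C once via caller A and once via caller B yields two different map indices, which is the effect described above.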
### Usage

Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable.

It is highly recommended to increase the `MAP_SIZE_POW2` definition in config.h
to at least 18 and maybe up to 20 for this as otherwise too many map collisions
occur.

### Caller Branch Coverage

If the context sensitive coverage introduces too many collisions and becomes
detrimental, the user can choose to augment edge coverage with just the called
function ID, instead of the entire callstack hash.

In math, the coverage is collected as follows: `map[current_location_ID ^
previous_location_ID >> 1 ^ previous_callee_ID] += 1`

Set the `AFL_LLVM_INSTRUMENT=CALLER` or `AFL_LLVM_CALLER=1` environment
variable.
## 7) AFL++ N-Gram Branch Coverage

### Source

This is an LLVM-based implementation of the n-gram branch coverage proposed in
the paper
["Be Sensitive and Collaborative: Analyzing Impact of Coverage Metrics in Greybox Fuzzing"](https://www.usenix.org/system/files/raid2019-wang-jinghan.pdf)
by Jinghan Wang et al.

Note that the original implementation (available
[here](https://github.com/bitsecurerlab/afl-sensitive)) is built on top of AFL's
QEMU mode. This is essentially a port that uses LLVM vectorized instructions
(available from llvm versions 4.0.1 and higher) to achieve the same results when
compiling source code.

In math, the branch coverage is performed as follows: `map[current_location ^
prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... up to n-1] += 1`
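
A scalar sketch of this formula is shown below. The real pass uses LLVM vector instructions, so this is only an illustration. The name `ngram_log`, the fixed `NGRAM_SIZE` of 4, and the 64 KiB map size are assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define NGRAM_MAP_SIZE (1u << 16) /* assumed map size for this sketch */
#define NGRAM_SIZE 4              /* n: number of branches to remember */

static uint8_t  ngram_map[NGRAM_MAP_SIZE];
static uint32_t prev_location[NGRAM_SIZE - 1]; /* newest entry first */

/* Logs one location and returns the map index that was incremented. */
uint32_t ngram_log(uint32_t current_location) {
  uint32_t idx = current_location;

  for (int i = 0; i < NGRAM_SIZE - 1; i++) idx ^= prev_location[i] >> 1;
  idx %= NGRAM_MAP_SIZE;
  ngram_map[idx]++;

  /* Shift the history by one slot and remember the current location. */
  memmove(&prev_location[1], &prev_location[0],
          sizeof(prev_location) - sizeof(prev_location[0]));
  prev_location[0] = current_location;

  return idx;
}
```

With `n = 4`, two executions that differ only in a location visited three steps earlier still end up at different map indices, which plain edge coverage would not distinguish.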
### Usage

The size of `n` (i.e., the number of branches to remember) is an option that is
specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the
`AFL_LLVM_NGRAM_SIZE` environment variable. Good values are 2, 4, or 8, valid
are 2-16.

It is highly recommended to increase the `MAP_SIZE_POW2` definition in config.h
to at least 18 and maybe up to 20 for this as otherwise too many map collisions
occur.
## 8) NeverZero counters

In larger, complex, or reiterative programs, the byte sized counters that
collect the edge coverage can easily fill up and wrap around. This is not that
much of an issue - unless, by chance, it wraps just to a value of zero when the
program execution ends. In this case, afl-fuzz is not able to see that the edge
has been accessed and will ignore it.

NeverZero prevents this behavior. If a counter wraps, it jumps over the value 0
directly to a 1. This improves path discovery (by a very small amount) at a very
low cost (one instruction per edge).
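
The counter update can be sketched in C as follows. This is a minimal illustration with a hypothetical function name, not the instrumentation code itself.

```c
#include <stdint.h>

/* Increments a byte-sized hit counter, skipping the value 0 on wrap-around. */
uint8_t neverzero_increment(uint8_t counter) {
  uint8_t next = (uint8_t)(counter + 1); /* wraps from 255 to 0 */

  return (uint8_t)(next + (next == 0));  /* on wrap, continue with 1 */
}
```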
(The alternative of saturated counters has also been tested and proved to be
inferior in terms of path discovery.)

This is implemented in afl-gcc and afl-gcc-fast. For llvm_mode, however, it is
optional if multithread safe counters are selected or the llvm version is below
9, as there are severe performance costs in these cases.

If you want to enable this for llvm versions below 9 or thread safe counters,
then set:

```
export AFL_LLVM_NOT_ZERO=1
```

In case you are on llvm 9 or greater and you do not want this behavior, then you
can set:

```
AFL_LLVM_SKIP_NEVERZERO=1
```

If the target does not have extensive loops or functions that are called a lot,
then this can give a small performance boost.

Please note that the default counter implementations are not thread safe!

Support for thread safe counters in mode LLVM CLASSIC can be activated by
setting `AFL_LLVM_THREADSAFE_INST=1`.
## 9) Source code coverage through instrumentation

Measuring source code coverage is a common task in fuzzing, but it is very
difficult to do in some situations (e.g. when using snapshot fuzzing).

When using the `AFL_LLVM_INSTRUMENT=llvm-codecov` option, afl-cc will use
native trace-pc-guard instrumentation but additionally select options that
are required to utilize the instrumentation for source code coverage.

In particular, it will switch the instrumentation to be per basic block
instead of instrumenting edges, disable all guard pruning, and enable the
experimental pc-table support that allows the runtime to gather 100% of
instrumented basic blocks at start, including their locations.

Note: You must compile AFL with the `CODE_COVERAGE=1` option to enable the
respective parts in the AFL compiler runtime. Support is currently only
implemented for Nyx, but can in theory also work without Nyx.

Note: You might have to adjust `MAP_SIZE_POW2` in include/config.h to ensure
that your coverage map is large enough to hold all basic blocks of your
target program without any collisions.

More documentation on how to utilize this with Nyx will follow.

README.lto.md

# afl-clang-lto - collision free instrumentation at link time

## TL;DR:

This version requires LLVM 12 or newer.

1. Use afl-clang-lto/afl-clang-lto++ because the resulting binaries run
   slightly faster and give better coverage.

2. You can use it together with COMPCOV, CMPLOG, and the instrument file
   listing features.

3. It only works with LLVM 12 or newer.

4. AUTODICTIONARY feature (see below)

5. If any problems arise, be sure to set `AR=llvm-ar RANLIB=llvm-ranlib AS=llvm-as`.
   Some targets might need `LD=afl-clang-lto` and others `LD=afl-ld-lto`.
## Introduction and problem description

A big issue with how vanilla AFL worked was that the basic block IDs that are
set during compilation are random - and hence, naturally, the larger the number
of instrumented locations, the higher the number of edge collisions in the
map. This can result in not discovering new paths and therefore degrade the
efficiency of the fuzzing process.

*This issue is underestimated in the fuzzing community.* With a 2^16 = 64 KB
standard map, there is on average one collision with as few as 256
instrumented blocks. On average, a target has 10,000 to 50,000 instrumented
blocks, hence the real number of collisions is between 750 and 18,000!
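
One way to estimate such collision counts is the standard "balls into bins" model, sketched below. This is an assumption about how numbers of this kind can be derived, not a statement about how AFL++ computes them, and the function name `expected_collisions` is hypothetical. The model treats every instrumented location as a random ID drawn uniformly from the map.

```c
#include <stdint.h>

/* Expected number of IDs that land on an already used map slot when `blocks`
   random IDs are drawn uniformly from `map_size` slots. */
double expected_collisions(uint32_t blocks, uint32_t map_size) {
  double unused = 1.0; /* probability that one given slot stays unused */

  for (uint32_t i = 0; i < blocks; i++) unused *= 1.0 - 1.0 / map_size;

  double used_slots = map_size * (1.0 - unused);
  return blocks - used_slots;
}
```

For 12071 locations and a map of 65536 slots, this model gives roughly 1046 collisions, which agrees with the figure printed in the libtiff build output further down.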
Note that PCGUARD (our own modified implementation and the SANCOV PCGUARD
implementation from libfuzzer) also provides collision free coverage.
It is a bit slower though and can break a few targets with very early
constructors.

37* We instrument at link time when we have all files pre-compiled.
38* To instrument at link time, we compile in LTO (link time optimization) mode.
39* Our compiler (afl-clang-lto/afl-clang-lto++) takes care of setting the correct
40  LTO options and runs our own afl-ld linker instead of the system linker.
41* The LLVM linker collects all LTO files to link and instruments them so that we
42  have non-colliding edge coverage.
43* We use a new (for afl) edge coverage - which is the same as in llvm
44  -fsanitize=coverage edge coverage mode. :)
45
46The result:
47
48* 10-25% speed gain compared to llvm_mode
49* guaranteed non-colliding edge coverage
50* The compile time, especially for binaries to an instrumented library, can be
51  much (and sometimes much much) longer.
52
53Example build output from a libtiff build:
54
55```
56libtool: link: afl-clang-lto -g -O2 -Wall -W -o thumbnail thumbnail.o  ../libtiff/.libs/libtiff.a ../port/.libs/libport.a -llzma -ljbig -ljpeg -lz -lm
57afl-clang-lto++2.63d by Marc "vanHauser" Heuse <[email protected]> in mode LTO
58afl-llvm-lto++2.63d by Marc "vanHauser" Heuse <[email protected]>
59AUTODICTIONARY: 11 strings found
60[+] Instrumented 12071 locations with no collisions (on average 1046 collisions would be in afl-gcc/afl-clang-fast) (non-hardened mode).
61```

## Getting LLVM 12+

### Installing llvm

The best way to install LLVM is to follow [https://apt.llvm.org/](https://apt.llvm.org/)

e.g. for LLVM 15:
```
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 15 all
```

LLVM 12 to 18 should be available in all current Linux repositories.

## How to build afl-clang-lto

That part is easy.
Just set `LLVM_CONFIG` to the llvm-config-VERSION and build AFL++, e.g. for
LLVM 15:

```
cd ~/AFLplusplus
export LLVM_CONFIG=llvm-config-15
make
sudo make install
```

## How to use afl-clang-lto

Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc.

Also, the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST ->
[README.instrument_list.md](README.instrument_list.md)) and laf-intel/compcov
(AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.

Example (note that you might need to add the version, e.g. `llvm-ar-15`):

```
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar AS=llvm-as ./configure
make
```

NOTE: some targets also need to set the linker, try both `afl-clang-lto` and
`afl-ld-lto` for `LD=` before `configure`.

## Instrumenting shared libraries

Note: this is highly discouraged! Try to compile to static libraries with
afl-clang-lto instead of shared libraries!

To make instrumented shared libraries work with afl-clang-lto, you have to
perform several extra steps.

Every shared library you want to instrument has to be individually compiled. The
environment variable `AFL_LLVM_LTO_DONTWRITEID=1` has to be set during
compilation. Additionally, the environment variable `AFL_LLVM_LTO_STARTID` has
to be set to the sum of the edge counts of all previously compiled instrumented
shared libraries for that target. E.g., for the first shared library this would
be `AFL_LLVM_LTO_STARTID=0` and afl-clang-lto will then report how many edges
have been instrumented (let's say it reported 1000 instrumented edges). The
second shared library then has to be set to that value
(`AFL_LLVM_LTO_STARTID=1000` in our example), the third to the sum of all
previous counts, etc.

The final program compilation step then must *not* have
`AFL_LLVM_LTO_DONTWRITEID` set, and `AFL_LLVM_LTO_STARTID` must be set to the
sum of the edge counts of all shared libraries it will be linked to.

This is quite some hands-on work, so better stay away from instrumenting shared
libraries. :-)

## AUTODICTIONARY feature

While compiling, a dictionary based on string comparisons is automatically
generated and put into the target binary. This dictionary is transferred to
afl-fuzz on start. This improves coverage statistically by 5-10%. :)

Note that if for any reason you do not want to use the autodictionary feature,
then just set the environment variable `AFL_NO_AUTODICT` when starting afl-fuzz.

## Fixed memory map

To speed up fuzzing a little bit more, it is possible to set a fixed shared
memory map. The recommended value is 0x10000.

In most cases, this will work without any problems. However, if a target uses
early constructors, ifuncs, or a deferred forkserver, this can crash the target.

Also, on unusual operating systems/processors/kernels or with weird libraries,
the recommended 0x10000 address might not work. In that case, change the fixed
address.

To enable this feature, set `AFL_LLVM_MAP_ADDR` to the address.

## Document edge IDs

Setting `export AFL_LLVM_DOCUMENT_IDS=file` will document in a file which edge
ID was given to which function. This helps to identify functions with variable
bytes or which functions were touched by an input.

## Solving difficult targets

Some targets are difficult because the configure script does unusual stuff that
is unexpected for afl. See the next section `Potential issues` for how to solve
these.

### Example: ffmpeg

An example of a hard to solve target is ffmpeg. Here is how to successfully
instrument it:

1. Get and extract the current ffmpeg and change to its directory.

2. Running configure with --cc=clang fails and various other items will fail
   when compiling, so we have to trick configure:

    ```
    ./configure --enable-lto --disable-shared --disable-inline-asm
    ```

3. Now the configuration is done - and we edit the settings in
   `./ffbuild/config.mak` (-: the original line, +: what to change it into):

    ```
    -CC=gcc
    +CC=afl-clang-lto
    -CXX=g++
    +CXX=afl-clang-lto++
    -AS=gcc
    +AS=llvm-as
    -LD=gcc
    +LD=afl-clang-lto++
    -DEPCC=gcc
    +DEPCC=afl-clang-lto
    -DEPAS=gcc
    +DEPAS=afl-clang-lto++
    -AR=ar
    +AR=llvm-ar
    -AR_CMD=ar
    +AR_CMD=llvm-ar
    -NM_CMD=nm -g
    +NM_CMD=llvm-nm -g
    -RANLIB=ranlib -D
    +RANLIB=llvm-ranlib -D
    ```

4. Then type make, wait for a long time, and you are done. :)

### Example: WebKit jsc

Building jsc is difficult as the build script has bugs.

1. Checkout Webkit:

    ```
    svn checkout https://svn.webkit.org/repository/webkit/trunk WebKit
    cd WebKit
    ```

2. Fix the build environment:

    ```
    mkdir -p WebKitBuild/Release
    cd WebKitBuild/Release
    ln -s ../../../../../usr/bin/llvm-ar-12 llvm-ar-12
    ln -s ../../../../../usr/bin/llvm-ranlib-12 llvm-ranlib-12
    cd ../..
    ```

3. Build. :)

    ```
    Tools/Scripts/build-jsc --jsc-only --cli --cmakeargs="-DCMAKE_AR='llvm-ar-12' -DCMAKE_RANLIB='llvm-ranlib-12' -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_CC_FLAGS='-O3 -lrt' -DCMAKE_CXX_FLAGS='-O3 -lrt' -DIMPORTED_LOCATION='/lib/x86_64-linux-gnu/' -DCMAKE_CC=afl-clang-lto -DCMAKE_CXX=afl-clang-lto++ -DENABLE_STATIC_JSC=ON"
    ```

## Potential issues

### Compiling libraries fails

If you see this message:

```
/bin/ld: libfoo.a: error adding symbols: archive has no index; run ranlib to add one
```

This usually happens because the GNU ranlib is called, which cannot deal with
clang LTO files. The solution is simple: when you `./configure`, you also have
to set `RANLIB=llvm-ranlib` and `AR=llvm-ar`.

Solution:

```
AR=llvm-ar RANLIB=llvm-ranlib CC=afl-clang-lto CXX=afl-clang-lto++ ./configure --disable-shared
```

And on some targets you have to set `AR=/RANLIB=` even for `make` as the
configure script does not save it. Other targets ignore environment variables
and need the parameters set via `./configure --cc=... --cxx= --ranlib= ...` etc.
(I am looking at you ffmpeg!)

If you see this message:

```
assembler command failed ...
```

Then try setting `llvm-as` for configure:

```
AS=llvm-as  ...
```

### Compiling programs still fails

afl-clang-lto is still work in progress.

Known issues:
* Anything that LLVM 12+ cannot compile, afl-clang-lto cannot compile either -
  obviously.
* Anything that does not compile with LTO, afl-clang-lto cannot compile either -
  obviously.

Hence, if building a target with afl-clang-lto fails, try to build it with
LLVM 12 and LTO enabled (`CC=clang-12`, `CXX=clang++-12`, `CFLAGS=-flto=full`,
and `CXXFLAGS=-flto=full`).

If this succeeds, then there is an issue with afl-clang-lto. Please report at
[https://github.com/AFLplusplus/AFLplusplus/issues/226](https://github.com/AFLplusplus/AFLplusplus/issues/226).

Even some targets where clang-12 fails can be built if the failure is just in
`./configure`, see `Solving difficult targets` above.

## History

This was originally envisioned by hexcoder- in Summer 2019. However, we saw no
way to create a pass that is run at link time - although there is an option for
this in the PassManager: EP_FullLinkTimeOptimizationLast. ("Fun" info - nobody
knows what this is doing. And the developer who implemented this didn't respond
to emails.)

In December then came the idea to implement this as a pass that is run via the
LLVM "opt" program, which is performed via an own linker that afterwards calls
the real linker. This was first implemented in January and worked ... kinda. The
LTO time instrumentation worked, however, "how" the basic blocks were
instrumented was a problem, as reducing duplicates turned out to be very, very
difficult with a program that has so many paths and therefore so many
dependencies. A lot of strategies were implemented - and failed. And then SAT
solvers were tried, but with over 10,000 variables that turned out to be a
dead-end too.

The final idea to solve this came from domenukk who proposed to insert a block
into an edge and then just use incremental counters ... and this worked! After
some trials and errors to implement this vanhauser-thc found out that there is
actually an LLVM function for this: SplitEdge() :-)

Still more problems came up though as this only works without bugs from LLVM 9
onwards, and with high optimization the link optimization ruins the instrumented
control flow graph.

This is all now fixed with LLVM 12+. LLVM's own linker is now able to load
passes and this bypasses all problems we had.

Happy end :)


README.persistent_mode.md

# llvm_mode persistent mode

## 1) Introduction

In persistent mode, AFL++ fuzzes a target multiple times in a single forked
process, instead of forking a new process for each fuzz execution. This is the
most effective way to fuzz, as the speed can easily be 10 to 20 times faster
without any disadvantages. *All professional fuzzing uses this mode.*

Persistent mode requires that the target can be called in one or more functions,
and that its state can be completely reset so that multiple calls can be
performed without resource leaks, and that earlier runs will have no impact on
future runs. An indicator for this is the `stability` value in the `afl-fuzz`
UI. If this decreases to lower values in persistent mode compared to
non-persistent mode, then the fuzz target keeps state.

Examples can be found in [utils/persistent_mode](../utils/persistent_mode).

## 2) TL;DR:

Example `fuzz_target.c`:

```c
#include "what_you_need_for_your_target.h"

__AFL_FUZZ_INIT();

int main(void) {

  // anything else here, e.g. command line arguments, initialization, etc.

#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();
#endif

  unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;  // must be after __AFL_INIT
                                                 // and before __AFL_LOOP!

  while (__AFL_LOOP(10000)) {

    int len = __AFL_FUZZ_TESTCASE_LEN;  // don't use the macro directly in a
                                        // call!

    if (len < 8) continue;  // check for a required/useful minimum input length

    /* Setup function call, e.g. struct target *tmp = libtarget_init() */
    /* Call function to be fuzzed, e.g.: */
    target_function(buf, len);
    /* Reset state. e.g. libtarget_free(tmp) */

  }

  return 0;

}
```

And then compile:

```
afl-clang-fast -o fuzz_target fuzz_target.c -lwhat_you_need_for_your_target
```

And that is it! The speed increase is usually 10x to 20x.

If you want to be able to compile the target without afl-clang-fast/lto, then
add this just after the includes:

```c
#ifndef __AFL_FUZZ_TESTCASE_LEN
  #include <unistd.h>  // read(), ssize_t
  ssize_t fuzz_len;
  #define __AFL_FUZZ_TESTCASE_LEN fuzz_len
  unsigned char fuzz_buf[1024000];
  #define __AFL_FUZZ_TESTCASE_BUF fuzz_buf
  #define __AFL_FUZZ_INIT() void sync(void);
  // without AFL++, each "iteration" reads one test case from stdin
  #define __AFL_LOOP(x) ((fuzz_len = read(0, fuzz_buf, sizeof(fuzz_buf))) > 0 ? 1 : 0)
  #define __AFL_INIT() sync()
#endif
```

## 3) Deferred initialization

AFL++ tries to optimize performance by executing the targeted binary just once,
stopping it just before `main()`, and then cloning this "main" process to get a
steady supply of targets to fuzz.

Although this approach eliminates much of the OS-, linker- and libc-level costs
of executing the program, it does not always help with binaries that perform
other time-consuming initialization steps - say, parsing a large config file
before getting to the fuzzed data.

In such cases, it's beneficial to initialize the forkserver a bit later, once
most of the initialization work is already done, but before the binary attempts
to read the fuzzed input and parse it; in some cases, this can offer a 10x+
performance gain. You can implement delayed initialization in LLVM mode in a
fairly simple way.

First, find a suitable location in the code where the delayed cloning can take
place. This needs to be done with *extreme* care to avoid breaking the binary.
In particular, the program will probably malfunction if you select a location
after:

- The creation of any vital threads or child processes - since the forkserver
  can't clone them easily.

- The initialization of timers via `setitimer()` or equivalent calls.

- The creation of temporary files, network sockets, offset-sensitive file
  descriptors, and similar shared-state resources - but only provided that their
  state meaningfully influences the behavior of the program later on.

- Any access to the fuzzed input, including reading the metadata about its size.

With the location selected, add this code in the appropriate spot:

```c
#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();
#endif
```

You don't need the #ifdef guards, but including them ensures that the program
will keep working normally when compiled with a tool other than afl-clang-fast/
afl-clang-lto/afl-gcc-fast.

Finally, recompile the program with afl-clang-fast/afl-clang-lto/afl-gcc-fast
(afl-gcc or afl-clang will *not* generate a deferred-initialization binary) -
and you should be all set!

## 4) Persistent mode

Some libraries provide APIs that are stateless, or whose state can be reset in
between processing different input files. When such a reset is performed, a
single long-lived process can be reused to try out multiple test cases,
eliminating the need for repeated `fork()` calls and the associated OS overhead.

The basic structure of the program that does this would be:

```c
  while (__AFL_LOOP(1000)) {

    /* Read input data. */
    /* Call library code to be fuzzed. */
    /* Reset state. */

  }

  /* Exit normally. */
```

The numerical value specified within the loop controls the maximum number of
iterations before AFL++ will restart the process from scratch. This minimizes
the impact of memory leaks and similar glitches; 1000 is a good starting point,
and going much higher increases the likelihood of hiccups without giving you any
real performance benefits.

A more detailed template is shown in
[utils/persistent_mode](../utils/persistent_mode). Similarly to the deferred
initialization, the feature works only with afl-clang-fast/afl-clang-lto/
afl-gcc-fast; `#ifdef` guards can be used to suppress it when using other
compilers.

Note that as with the deferred initialization, the feature is easy to misuse; if
you do not fully reset the critical state, you may end up with false positives
or waste a whole lot of CPU power doing nothing useful at all. Be particularly
wary of memory leaks and of the state of file descriptors.

When running in this mode, the execution paths will inherently vary a bit
depending on whether the input loop is being entered for the first time or
executed again.

## 5) Shared memory fuzzing

You can speed up the fuzzing process even more by receiving the fuzzing data via
shared memory instead of stdin or files. This is a further speed multiplier of
about 2x.

Setting this up is very easy:

After the includes, set the following macro:

```c
__AFL_FUZZ_INIT();
```

Directly at the start of main - or if you are using the deferred forkserver with
`__AFL_INIT()`, then *after* `__AFL_INIT()`:

```c
  unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;
```

Then, as the first line inside the `__AFL_LOOP` while loop:

```c
  int len = __AFL_FUZZ_TESTCASE_LEN;
```

And that is all!