=================
SanitizerCoverage
=================

.. contents::
   :local:

Introduction
============

Sanitizer tools have a very simple code coverage tool built in. It provides
function-level, basic-block-level, and edge-level coverage at a very low
cost.

How to build and run
====================

SanitizerCoverage can be used with :doc:`AddressSanitizer`,
:doc:`LeakSanitizer`, :doc:`MemorySanitizer`,
UndefinedBehaviorSanitizer, or without any sanitizer. Pass one of the
following compile-time flags:

* ``-fsanitize-coverage=func`` for function-level coverage (very fast).
* ``-fsanitize-coverage=bb`` for basic-block-level coverage (may add up to 30%
  **extra** slowdown).
* ``-fsanitize-coverage=edge`` for edge-level coverage (up to 40% slowdown).

You may also specify ``-fsanitize-coverage=indirect-calls`` for
additional `caller-callee coverage`_.

At run time, pass ``coverage=1`` in ``ASAN_OPTIONS``,
``LSAN_OPTIONS``, ``MSAN_OPTIONS`` or ``UBSAN_OPTIONS``, as
appropriate. For the standalone coverage mode, use ``UBSAN_OPTIONS``.

To get `Coverage counters`_, add ``-fsanitize-coverage=8bit-counters``
to one of the above compile-time flags. At run time, use
``*SAN_OPTIONS=coverage=1:coverage_counters=1``.

Example:

.. code-block:: console

    % cat -n cov.cc
         1  #include <stdio.h>
         2  __attribute__((noinline))
         3  void foo() { printf("foo\n"); }
         4
         5  int main(int argc, char **argv) {
         6    if (argc == 2)
         7      foo();
         8    printf("main\n");
         9  }
    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=func
    % ASAN_OPTIONS=coverage=1 ./a.out; ls -l *sancov
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    % ASAN_OPTIONS=coverage=1 ./a.out foo ; ls -l *sancov
    foo
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Every time you run an executable instrumented with SanitizerCoverage,
one ``*.sancov`` file is created during process shutdown.
If the executable is dynamically linked against instrumented DSOs,
one ``*.sancov`` file will also be created for every DSO.

Postprocessing
==============

The format of ``*.sancov`` files is very simple: the first 8 bytes are the
magic, one of ``0xC0BFFFFFFFFFFF64`` and ``0xC0BFFFFFFFFFFF32``. The last byte
of the magic defines the size of the following offsets. The rest of the data is
the offsets in the corresponding binary/DSO that were executed during the run.

A simple script
``$LLVM/projects/compiler-rt/lib/sanitizer_common/scripts/sancov.py`` is
provided to dump these offsets.
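
The file layout described above is simple enough to decode by hand. The
following is a minimal sketch, not the logic of ``sancov.py`` itself: the
function name ``ParseSancov`` is hypothetical, and the code assumes the file
was written on a machine with the same byte order as the one reading it.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <stdexcept>
    #include <vector>

    // The two magic values named in the format description.
    constexpr uint64_t kMagic64 = 0xC0BFFFFFFFFFFF64ULL;
    constexpr uint64_t kMagic32 = 0xC0BFFFFFFFFFFF32ULL;

    // Hypothetical helper: decodes the raw bytes of a .sancov file into a
    // list of offsets. Assumes host byte order matches the writer's.
    std::vector<uint64_t> ParseSancov(const std::vector<unsigned char> &data) {
      if (data.size() < sizeof(uint64_t))
        throw std::runtime_error("file too short to hold the magic");

      uint64_t magic;
      std::memcpy(&magic, data.data(), sizeof(magic));

      // The last byte of the magic (0x64 or 0x32) selects the offset width.
      size_t width;
      if (magic == kMagic64)
        width = 8;
      else if (magic == kMagic32)
        width = 4;
      else
        throw std::runtime_error("unrecognized magic");

      if ((data.size() - sizeof(magic)) % width != 0)
        throw std::runtime_error("truncated offset table");

      std::vector<uint64_t> offsets;
      for (size_t pos = sizeof(magic); pos < data.size(); pos += width) {
        uint64_t value = 0;
        if (width == 8) {
          std::memcpy(&value, &data[pos], 8);
        } else {
          uint32_t narrow;
          std::memcpy(&narrow, &data[pos], 4);
          value = narrow;
        }
        offsets.push_back(value);
      }
      return offsets;
    }

The ``sancov.py`` script mentioned above prints the same offsets from the
command line:
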

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov
    sancov.py: read 2 PCs from a.out.22679.sancov
    sancov.py: read 1 PCs from a.out.22673.sancov
    sancov.py: 2 files merged; 2 PCs total
    0x465250
    0x4652a0

You can then filter the output of ``sancov.py`` through ``addr2line --exe
ObjectFile`` or ``llvm-symbolizer --obj ObjectFile`` to get file names and line
numbers:

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov 2> /dev/null | llvm-symbolizer --obj a.out
    cov.cc:3
    cov.cc:5

Sancov Tool
===========

A new experimental ``sancov`` tool is being developed to process coverage files.
The tool is part of the LLVM project and is currently supported only on Linux.
It can handle symbolization tasks autonomously without any extra support
from the environment. You need to pass .sancov files (named
``<module_name>.<pid>.sancov``) and paths to all corresponding binary ELF files.
Sancov matches these files using module names and binary file names.

.. code-block:: console

    USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...

    Action (required)
      -print                    - Print coverage addresses
      -covered-functions        - Print all covered functions.
      -not-covered-functions    - Print all not covered functions.
      -html-report              - Print HTML coverage report.

    Options
      -blacklist=<string>         - Blacklist file (sanitizer blacklist format).
      -demangle                   - Print demangled function name.
      -strip_path_prefix=<string> - Strip this prefix from file paths in reports


Automatic HTML Report Generation
================================

If ``*SAN_OPTIONS`` contains the ``html_cov_report=1`` option, an HTML
coverage report is generated automatically alongside the coverage files.
The ``sancov`` binary should be present in ``PATH``, or the
``sancov_path=<path_to_sancov>`` option can be used to specify the tool
location.


How good is the coverage?
=========================

It is possible to find out which PCs are not covered by subtracting the covered
set from the set of all instrumented PCs. The latter can be obtained by listing
all call sites of ``__sanitizer_cov()`` in the binary. On Linux, ``sancov.py``
can do this for you. Just supply the path to the binary and a list of covered
PCs:

.. code-block:: console

    % sancov.py print a.out.12345.sancov > covered.txt
    sancov.py: read 2 64-bit PCs from a.out.12345.sancov
    sancov.py: 1 file merged; 2 PCs total
    % sancov.py missing a.out < covered.txt
    sancov.py: found 3 instrumented PCs in a.out
    sancov.py: read 2 PCs from stdin
    sancov.py: 1 PCs missing from coverage
    0x4cc61c

Edge coverage
=============

Consider this code:

.. code-block:: c++

    void foo(int *a) {
      if (a)
        *a = 0;
    }

It contains 3 basic blocks, let's name them A, B, C:

.. code-block:: none

    A
    |\
    | \
    |  B
    | /
    |/
    C

If blocks A, B, and C are all covered we know for certain that the edges A=>B
and B=>C were executed, but we still don't know if the edge A=>C was executed.
Such edges of the control flow graph are called
`critical <http://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_. The
edge-level coverage (``-fsanitize-coverage=edge``) simply splits all critical
edges by introducing new dummy blocks and then instruments those blocks:

.. code-block:: none

    A
    |\
    | \
    D  B
    | /
    |/
    C

Bitset
======

When the ``coverage_bitset=1`` run-time flag is given, the coverage will also be
dumped as a bitset (a text file with 1 for blocks that have been executed and 0
for blocks that were not).

.. code-block:: console

    % clang++ -fsanitize=address -fsanitize-coverage=edge cov.cc
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out
    main
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out 1
    foo
    main
    % head *bitset*
    ==> a.out.38214.bitset-sancov <==
    01101
    ==> a.out.6128.bitset-sancov <==
    11011%

For a given executable the length of the bitset is always the same (unless
dlopen/dlclose come into play), so the bitset coverage can be
easily used for bitset-based corpus distillation.

Caller-callee coverage
======================

(Experimental!)
Every indirect function call is instrumented with a run-time function call that
captures caller and callee. At shutdown the process dumps a separate
file called ``caller-callee.PID.sancov`` which contains caller/callee pairs as
pairs of lines (odd lines are callers, even lines are callees):

.. code-block:: console

    a.out 0x4a2e0c
    a.out 0x4a6510
    a.out 0x4a2e0c
    a.out 0x4a87f0

Current limitations:

* Only the first 14 callees for every caller are recorded; the rest are silently
  ignored.
* The output format is not very compact since caller and callee may reside in
  different modules and we need to spell out the module names.
* The routine that dumps the output is not optimized for speed.
* Only Linux x86_64 is tested so far.
* Sandboxes are not supported.

Coverage counters
=================

This experimental feature is inspired by
`AFL <http://lcamtuf.coredump.cx/afl/technical_details.txt>`__'s coverage
instrumentation. With additional compile-time and run-time flags you can get
more sensitive coverage information. In addition to boolean values assigned to
every basic block (edge) the instrumentation will collect imprecise counters.
On exit, every counter will be mapped to an 8-bit bitset representing counter
ranges: ``1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+`` and those 8-bit bitsets will
be dumped to disk.

.. code-block:: console

    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=edge,8bit-counters
    % ASAN_OPTIONS="coverage=1:coverage_counters=1" ./a.out
    % ls -l *counters-sancov
    ... a.out.17110.counters-sancov
    % xxd *counters-sancov
    0000000: 0001 0100 01

These counters may also be used for in-process coverage-guided fuzzers. See
``include/sanitizer/coverage_interface.h``:

.. code-block:: c++

    // The coverage instrumentation may optionally provide imprecise counters.
    // Rather than exposing the counter values to the user we instead map
    // the counters to a bitset.
    // Every counter is associated with 8 bits in the bitset.
    // We define 8 value ranges: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
    // The i-th bit is set to 1 if the counter value is in the i-th range.
    // This counter-based coverage implementation is *not* thread-safe.

    // Returns the number of registered coverage counters.
    uintptr_t __sanitizer_get_number_of_counters();
    // Updates the counter 'bitset', clears the counters and returns the number of
    // new bits in 'bitset'.
    // If 'bitset' is nullptr, only clears the counters.
    // Otherwise 'bitset' should be at least
    // __sanitizer_get_number_of_counters bytes long and 8-aligned.
    uintptr_t
    __sanitizer_update_counter_bitset_and_clear_counters(uint8_t *bitset);

Tracing basic blocks
====================

Experimental support for basic block (or edge) tracing.
With ``-fsanitize-coverage=trace-bb`` the compiler will insert
``__sanitizer_cov_trace_basic_block(s32 *id)`` before every function, basic block, or edge
(depending on the value of ``-fsanitize-coverage=[func,bb,edge]``).
Example:

.. code-block:: console

    % clang -g -fsanitize=address -fsanitize-coverage=edge,trace-bb foo.cc
    % ASAN_OPTIONS=coverage=1 ./a.out

This will produce two files after the process exits:
``trace-points.PID.sancov`` and ``trace-events.PID.sancov``.
The first file will contain a textual description of all the instrumented points in the program
in a form that you can feed into llvm-symbolizer (e.g. ``a.out 0x4dca89``), one per line.
The second file will contain the actual execution trace as a sequence of 4-byte integers;
these integers are indices into the array of instrumented points (the first file).

Basic block tracing is currently supported only for single-threaded applications.


Tracing PCs
===========

An *experimental* feature similar to tracing basic blocks, but with a different API.
With ``-fsanitize-coverage=trace-pc`` the compiler will insert
``__sanitizer_cov_trace_pc()`` on every edge.
With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
These callbacks are not implemented in the Sanitizer run-time and should be defined
by the user, so these flags do not require any other sanitizer to be used.
This mechanism is used for fuzzing the Linux kernel (https://github.com/google/syzkaller)
and can be used with `AFL <http://lcamtuf.coredump.cx/afl>`__.

Tracing data flow
=================

An *experimental* feature to support data-flow-guided fuzzing.
With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra instrumentation
around comparison instructions and switch statements.
The fuzzer will need to define the following functions, which will be called
by the instrumented code.

.. code-block:: c++

    // Called before a comparison instruction.
    // SizeAndType is a packed value containing
    //   - [63:32] the Size of the operands of comparison in bits
    //   - [31:0]  the Type of comparison (one of ICMP_EQ, ... ICMP_SLE)
    // Arg1 and Arg2 are arguments of the comparison.
    void __sanitizer_cov_trace_cmp(uint64_t SizeAndType, uint64_t Arg1, uint64_t Arg2);

    // Called before a switch statement.
    // Val is the switch operand.
    // Cases[0] is the number of case constants.
    // Cases[1] is the size of Val in bits.
    // Cases[2:] are the case constants.
    void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);

This interface is subject to change.
The current implementation is not thread-safe and thus can be safely used only for single-threaded targets.
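
To make the packed arguments concrete, here is a minimal sketch of what a
fuzzer-side definition of the two callbacks might look like. The record types
``CmpEvent`` and ``SwitchEvent`` and the global logs are hypothetical names
introduced only for this illustration; a real fuzzer would likely store the
values in whatever form its mutation engine consumes. It is probably safest to
compile the file that defines the callbacks without ``trace-cmp``, so that the
callbacks are not themselves instrumented.

.. code-block:: c++

    #include <cstdint>
    #include <vector>

    // Hypothetical record types, used only for this illustration.
    struct CmpEvent {
      uint32_t size_bits;  // bits [63:32] of SizeAndType
      uint32_t type;       // bits [31:0] of SizeAndType
      uint64_t arg1;
      uint64_t arg2;
    };

    struct SwitchEvent {
      uint64_t val;                 // the switch operand
      uint64_t size_bits;           // Cases[1]
      std::vector<uint64_t> cases;  // Cases[2:]
    };

    // Hypothetical global logs. Like the interface itself, not thread-safe.
    std::vector<CmpEvent> g_cmp_events;
    std::vector<SwitchEvent> g_switch_events;

    extern "C" void __sanitizer_cov_trace_cmp(uint64_t SizeAndType,
                                              uint64_t Arg1, uint64_t Arg2) {
      CmpEvent event;
      event.size_bits = static_cast<uint32_t>(SizeAndType >> 32);
      event.type = static_cast<uint32_t>(SizeAndType & 0xFFFFFFFFu);
      event.arg1 = Arg1;
      event.arg2 = Arg2;
      g_cmp_events.push_back(event);
    }

    extern "C" void __sanitizer_cov_trace_switch(uint64_t Val,
                                                 uint64_t *Cases) {
      SwitchEvent event;
      event.val = Val;
      event.size_bits = Cases[1];
      // Cases[0] holds the number of case constants that follow Cases[1].
      event.cases.assign(Cases + 2, Cases + 2 + Cases[0]);
      g_switch_events.push_back(event);
    }

Recording both operands of a comparison lets a fuzzer see which constants the
target compares its input against, which is the information data-flow-guided
mutation needs.
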

Output directory
================

By default, .sancov files are created in the current working directory.
This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
    % ls -l /tmp/cov/*sancov
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Sudden death
============

Normally, coverage data is collected in memory and saved to disk when the
program exits (with an ``atexit()`` handler), when a SIGSEGV is caught, or when
``__sanitizer_cov_dump()`` is called.

If the program ends with a signal that ASan does not handle (or cannot handle
at all, like SIGKILL), coverage data will be lost. This is a big problem on
Android, where SIGKILL is a normal way of evicting applications from memory.

With ``ASAN_OPTIONS=coverage=1:coverage_direct=1`` coverage data is written to a
memory-mapped file as soon as it is collected.

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_direct=1" ./a.out
    main
    % ls
    7036.sancov.map  7036.sancov.raw  a.out
    % sancov.py rawunpack 7036.sancov.raw
    sancov.py: reading map 7036.sancov.map
    sancov.py: unpacking 7036.sancov.raw
    writing 1 PCs to a.out.7036.sancov
    % sancov.py print a.out.7036.sancov
    sancov.py: read 1 PCs from a.out.7036.sancov
    sancov.py: 1 files merged; 1 PCs total
    0x4b2bae

Note that on 64-bit platforms, this method writes 2x more data than the default,
because it stores full PC values instead of 32-bit offsets.

In-process fuzzing
==================

Coverage data can be useful for fuzzers, and sometimes it is preferable to run
a fuzzer in the same process as the code being fuzzed (in-process fuzzer).

You can use ``__sanitizer_get_total_unique_coverage()`` from
``<sanitizer/coverage_interface.h>``, which returns the number of currently
covered entities in the program. Calling it after testing each new input tells
the fuzzer whether the coverage has increased.

If a fuzzer finds a bug in the ASan run, you will need to save the reproducer
before exiting the process. Use ``__asan_set_death_callback`` from
``<sanitizer/asan_interface.h>`` to do that.

An example of such a fuzzer can be found in `the LLVM tree
<http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Fuzzer/README.txt?view=markup>`_.
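
The coverage check described above can be sketched as a small loop. This is an
illustration under stated assumptions, not the LLVM fuzzer's implementation:
the function name ``KeepInputsThatAddCoverage`` is hypothetical, and the
coverage query is passed in as a callable so the sketch compiles without the
sanitizer run-time. In an instrumented build one would presumably pass
``__sanitizer_get_total_unique_coverage`` as the ``coverage`` argument.

.. code-block:: c++

    #include <cstdint>
    #include <functional>
    #include <string>
    #include <vector>

    // Hypothetical helper: runs every candidate through `target` and keeps
    // the ones after which the value reported by `coverage` has grown.
    std::vector<std::string> KeepInputsThatAddCoverage(
        const std::vector<std::string> &candidates,
        const std::function<void(const std::string &)> &target,
        const std::function<uintptr_t()> &coverage) {
      std::vector<std::string> kept;
      uintptr_t best = coverage();
      for (const std::string &input : candidates) {
        target(input);
        uintptr_t now = coverage();
        if (now > best) {
          // This input reached something new, so it is worth keeping.
          best = now;
          kept.push_back(input);
        }
      }
      return kept;
    }

Injecting the coverage query keeps the loop independent of any particular
sanitizer, at the cost of one indirect call per input.
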

Performance
===========

This coverage implementation is **fast**. With function-level coverage
(``-fsanitize-coverage=func``) the overhead is not measurable. With
basic-block-level coverage (``-fsanitize-coverage=bb``) the overhead varies
between 0 and 25%.

==============  =========  =========  =========  =========  =========  =========
     benchmark       cov0       cov1   diff 0-1       cov2   diff 0-2   diff 1-2
==============  =========  =========  =========  =========  =========  =========
 400.perlbench    1296.00    1307.00       1.01    1465.00       1.13       1.12
     401.bzip2     858.00     854.00       1.00    1010.00       1.18       1.18
       403.gcc     613.00     617.00       1.01     683.00       1.11       1.11
       429.mcf     605.00     582.00       0.96     610.00       1.01       1.05
     445.gobmk     896.00     880.00       0.98    1050.00       1.17       1.19
     456.hmmer     892.00     892.00       1.00     918.00       1.03       1.03
     458.sjeng     995.00    1009.00       1.01    1217.00       1.22       1.21
462.libquantum     497.00     492.00       0.99     534.00       1.07       1.09
   464.h264ref    1461.00    1467.00       1.00    1543.00       1.06       1.05
   471.omnetpp     575.00     590.00       1.03     660.00       1.15       1.12
     473.astar     658.00     652.00       0.99     715.00       1.09       1.10
 483.xalancbmk     471.00     491.00       1.04     582.00       1.24       1.19
      433.milc     616.00     627.00       1.02     627.00       1.02       1.00
      444.namd     602.00     601.00       1.00     654.00       1.09       1.09
    447.dealII     630.00     634.00       1.01     653.00       1.04       1.03
    450.soplex     365.00     368.00       1.01     395.00       1.08       1.07
    453.povray     427.00     434.00       1.02     495.00       1.16       1.14
       470.lbm     357.00     375.00       1.05     370.00       1.04       0.99
   482.sphinx3     927.00     928.00       1.00    1000.00       1.08       1.08
==============  =========  =========  =========  =========  =========  =========

Why another coverage?
=====================

Why did we implement yet another code coverage tool?

* We needed something that is lightning fast, plays well with
  AddressSanitizer, and does not significantly increase the binary size.
* Traditional coverage implementations based on global counters
  `suffer from contention on counters
  <https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY>`_.