==========================
UndefinedBehaviorSanitizer
==========================

.. contents::
   :local:

Introduction
============

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
UBSan modifies the program at compile-time to catch various kinds of undefined
behavior during program execution, for example:

* Using a misaligned or null pointer
* Signed integer overflow
* Conversion to, from, or between floating-point types which would
  overflow the destination

See the full list of available :ref:`checks <ubsan-checks>` below.

UBSan has an optional run-time library which provides better error reporting.
The checks have a small runtime cost and no impact on address space layout or
ABI.

How to build
============

Build LLVM/Clang with `CMake <http://llvm.org/docs/CMake.html>`_.

Usage
=====

Use ``clang++`` to compile and link your program with the
``-fsanitize=undefined`` flag. Make sure to use ``clang++`` (not ``ld``) as the
linker, so that your executable is linked with the proper UBSan runtime
libraries. You can use ``clang`` instead of ``clang++`` if you're
compiling/linking C code.

.. code-block:: console

  % cat test.cc
  int main(int argc, char **argv) {
    int k = 0x7fffffff;
    k += argc;
    return 0;
  }
  % clang++ -fsanitize=undefined test.cc
  % ./a.out
  test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

You can enable only a subset of the :ref:`checks <ubsan-checks>` offered by
UBSan, and define the desired behavior for each kind of check:

* print a verbose error report and continue execution (default);
* print a verbose error report and exit the program;
* execute a trap instruction (doesn't require UBSan run-time support).

For example, if you compile/link your program as:

.. code-block:: console

  % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

the program will continue execution after signed integer overflows, exit after
the first invalid use of a null pointer, and trap after the first use of a
misaligned pointer.

.. _ubsan-checks:

Available checks
================

Available checks are:

  -  ``-fsanitize=alignment``: Use of a misaligned pointer or creation
     of a misaligned reference.
  -  ``-fsanitize=bool``: Load of a ``bool`` value which is neither
     ``true`` nor ``false``.
  -  ``-fsanitize=bounds``: Out of bounds array indexing, in cases
     where the array bound can be statically determined.
  -  ``-fsanitize=enum``: Load of a value of an enumerated type which
     is not in the range of representable values for that enumerated
     type.
  -  ``-fsanitize=float-cast-overflow``: Conversion to, from, or
     between floating-point types which would overflow the
     destination.
  -  ``-fsanitize=float-divide-by-zero``: Floating point division by
     zero.
  -  ``-fsanitize=function``: Indirect call of a function through a
     function pointer of the wrong type (Linux, C++ and x86/x86_64 only).
  -  ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
  -  ``-fsanitize=nonnull-attribute``: Passing null pointer as a function
     parameter which is declared to never be null.
  -  ``-fsanitize=null``: Use of a null pointer or creation of a null
     reference.
  -  ``-fsanitize=object-size``: An attempt to potentially use bytes which
     the optimizer can determine are not part of the object being accessed.
     This will also detect some types of undefined behavior that may not
     directly access memory, but are provably incorrect given the size of
     the objects involved, such as invalid downcasts and calling methods on
     invalid pointers. These checks are made in terms of
     ``__builtin_object_size``, and consequently may be able to detect more
     problems at higher optimization levels.
  -  ``-fsanitize=return``: In C++, reaching the end of a
     value-returning function without returning a value.
  -  ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
     from a function which is declared to never return null.
  -  ``-fsanitize=shift``: Shift operators where the amount shifted is
     greater than or equal to the promoted bit-width of the left hand side
     or less than zero, or where the left hand side is negative. For a
     signed left shift, also checks for signed overflow in C, and for
     unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
     ``-fsanitize=shift-exponent`` to check only the left-hand side or
     the right-hand side of the shift operation, respectively.
  -  ``-fsanitize=signed-integer-overflow``: Signed integer overflow,
     including all the checks added by ``-ftrapv``, and checking for
     overflow in signed division (``INT_MIN / -1``).
  -  ``-fsanitize=unreachable``: If control flow reaches
     ``__builtin_unreachable``.
  -  ``-fsanitize=unsigned-integer-overflow``: Unsigned integer
     overflows.
  -  ``-fsanitize=vla-bound``: A variable-length array whose bound
     does not evaluate to a positive value.
  -  ``-fsanitize=vptr``: Use of an object whose vptr indicates that
     it is of the wrong dynamic type, or that its lifetime has not
     begun or has ended. Incompatible with ``-fno-rtti``. Link must
     be performed by ``clang++``, not ``clang``, to make sure C++-specific
     parts of the runtime library and C++ standard libraries are present.
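
The conditions behind a few of the integer checks listed above can be written
out as ordinary predicates. The following is a minimal sketch, not UBSan's
implementation: the helper names (``add_overflows``, ``div_is_undefined``,
``shift_exponent_invalid``, ``run_examples``) are hypothetical, and ``int`` is
assumed to be the promoted type of the left-hand operand.

.. code-block:: c++

  #include <cassert>
  #include <climits>

  // Hypothetical helpers for illustration only; UBSan instruments the
  // program at compile time and does not use these functions.

  // Condition reported by signed-integer-overflow for `a + b`.
  bool add_overflows(int a, int b) {
    if (b > 0) return a > INT_MAX - b;  // sum would exceed INT_MAX
    return a < INT_MIN - b;             // sum would fall below INT_MIN
  }

  // Conditions reported for `a / b`: integer-divide-by-zero, or the
  // INT_MIN / -1 case covered by signed-integer-overflow.
  bool div_is_undefined(int a, int b) {
    return b == 0 || (a == INT_MIN && b == -1);
  }

  // Condition reported by shift-exponent for an int left-hand side: the
  // shift amount is negative or not smaller than the bit width of int.
  bool shift_exponent_invalid(int amount) {
    const int width = static_cast<int>(sizeof(int) * CHAR_BIT);
    return amount < 0 || amount >= width;
  }

  // Spot checks; none of them performs the undefined operation itself.
  void run_examples() {
    assert(add_overflows(0x7fffffff, 1));    // the overflow from test.cc
    assert(!add_overflows(0x7fffffff, 0));
    assert(div_is_undefined(INT_MIN, -1));
    assert(shift_exponent_invalid(-1));
    assert(!shift_exponent_invalid(0));
  }

Guarding an operation with a predicate like these avoids the undefined
behavior up front, whereas UBSan reports it when it actually occurs at run
time.
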

You can also use the following check groups:

  -  ``-fsanitize=undefined``: All of the checks listed above other than
     ``unsigned-integer-overflow``.
  -  ``-fsanitize=undefined-trap``: Deprecated alias of
     ``-fsanitize=undefined``.
  -  ``-fsanitize=integer``: Checks for undefined or suspicious integer
     behavior (e.g. unsigned integer overflow).

Stack traces and report symbolization
=====================================

If you want UBSan to print a symbolized stack trace for each error report, you
will need to:

#. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
   information in your binary.
#. Run your program with the environment variable
   ``UBSAN_OPTIONS=print_stacktrace=1``.
#. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.

Issue Suppression
=================

UndefinedBehaviorSanitizer is not expected to produce false positives.
If you see one, look again; most likely it is a true positive!

Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
----------------------------------------------------------------------------

You can disable UBSan checks for particular functions with
``__attribute__((no_sanitize("undefined")))``. You can use all values of the
``-fsanitize=`` flag in this attribute, e.g.
if your function deliberately contains a possible signed integer overflow, you
can use ``__attribute__((no_sanitize("signed-integer-overflow")))``.

This attribute may not be supported by other compilers, so consider using it
together with ``#if defined(__clang__)``.

Suppressing Errors in Recompiled Code (Blacklist)
-------------------------------------------------

UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, which can be used to suppress error reports
in the specified source files or functions.

Runtime suppressions
--------------------

Sometimes you can suppress UBSan error reports for specific files, functions,
or libraries without recompiling the code. You need to pass the path to a
suppression file in the ``UBSAN_OPTIONS`` environment variable.

.. code-block:: bash

    UBSAN_OPTIONS=suppressions=MyUBSan.supp

You need to specify the :ref:`check <ubsan-checks>` you are suppressing and the
bug location. For example:

.. code-block:: bash

  signed-integer-overflow:file-with-known-overflow.cpp
  alignment:function_doing_unaligned_access
  vptr:shared_object_with_vptr_failures.so

There are several limitations:

* Sometimes your binary must have enough debug info and/or symbol table, so
  that the runtime can figure out the source file or function name to match
  against the suppression.
* It is only possible to suppress recoverable checks. For the example above,
  you can additionally pass
  ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although
  most UBSan checks are recoverable by default.
* Check groups (like ``undefined``) can't be used in a suppressions file; only
  fine-grained checks are supported.

Supported Platforms
===================

UndefinedBehaviorSanitizer is supported on the following operating systems:

* Android
* Linux
* FreeBSD
* OS X 10.6 onwards

and for the following architectures:

* i386/x86\_64
* ARM
* AArch64
* PowerPC64
* MIPS/MIPS64

Current Status
==============

UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM
3.3. The test suite is integrated into the CMake build and can be run with
the ``check-ubsan`` command.

Additional Configuration
========================

UndefinedBehaviorSanitizer adds static check data for each check unless it is
in trap mode. This check data includes the full file name. The option
``-fsanitize-undefined-strip-path-components=N`` can be used to trim this
information. If ``N`` is positive, file information emitted by
UndefinedBehaviorSanitizer will drop the first ``N`` components from the file
path. If ``N`` is negative, only the last ``|N|`` components will be kept.

Example
-------

For a file called ``/code/library/file.cpp``, here is what would be emitted:

* Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp``
* ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp``

More Information
================

* From LLVM project blog:
  `What Every C Programmer Should Know About Undefined Behavior
  <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
* From John Regehr's *Embedded in Academia* blog:
  `A Guide to Undefined Behavior in C and C++
  <http://blog.regehr.org/archives/213>`_