==========================
UndefinedBehaviorSanitizer
==========================

.. contents::
   :local:

Introduction
============

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
UBSan modifies the program at compile-time to catch various kinds of undefined
behavior during program execution, for example:

* Using a misaligned or null pointer
* Signed integer overflow
* Conversion to, from, or between floating-point types which would
  overflow the destination

See the full list of available :ref:`checks <ubsan-checks>` below.
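
As a rough illustration of the three kinds of behavior listed above, each of
the functions below is well defined for ordinary inputs and undefined for the
inputs named in its comment. The function names are hypothetical and chosen
only for this sketch.

.. code-block:: c++

  // Signed integer overflow: undefined when a + b does not fit in an int,
  // for example add(0x7fffffff, 1).
  int add(int a, int b) { return a + b; }

  // Floating-point conversion: undefined when the truncated value does not
  // fit in an int, for example to_int(1e30).
  int to_int(double value) { return static_cast<int>(value); }

  // Pointer use: undefined when the pointer is null or not suitably aligned
  // for an int.
  int load(const int *pointer) { return *pointer; }

Calls such as ``add(1, 2)`` or ``to_int(2.5)`` stay within defined behavior,
so an instrumented build would be expected to report only the problematic
inputs.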

UBSan has an optional run-time library which provides better error reporting.
The checks have a small runtime cost and no impact on address space layout or ABI.

How to build
============

Build LLVM/Clang with `CMake <http://llvm.org/docs/CMake.html>`_.

Usage
=====

Use ``clang++`` to compile and link your program with the
``-fsanitize=undefined`` flag. Make sure to use ``clang++`` (not ``ld``) as the
linker, so that your executable is linked with the proper UBSan runtime
libraries. You can use ``clang`` instead of ``clang++`` if you're
compiling/linking C code.

.. code-block:: console

  % cat test.cc
  int main(int argc, char **argv) {
    int k = 0x7fffffff;
    k += argc;
    return 0;
  }
  % clang++ -fsanitize=undefined test.cc
  % ./a.out
  test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

You can enable only a subset of the :ref:`checks <ubsan-checks>` offered by
UBSan, and define the desired behavior for each kind of check:

* print a verbose error report and continue execution (default);
* print a verbose error report and exit the program;
* execute a trap instruction (doesn't require UBSan run-time support).

For example, if you compile/link your program as:

.. code-block:: console

  % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

the program will continue execution after signed integer overflows, exit after
the first invalid use of a null pointer, and trap after the first use of a
misaligned pointer.

.. _ubsan-checks:

Available checks
================

Available checks are:

  -  ``-fsanitize=alignment``: Use of a misaligned pointer or creation
     of a misaligned reference.
  -  ``-fsanitize=bool``: Load of a ``bool`` value which is neither
     ``true`` nor ``false``.
  -  ``-fsanitize=bounds``: Out of bounds array indexing, in cases
     where the array bound can be statically determined.
  -  ``-fsanitize=enum``: Load of a value of an enumerated type which
     is not in the range of representable values for that enumerated
     type.
  -  ``-fsanitize=float-cast-overflow``: Conversion to, from, or
     between floating-point types which would overflow the
     destination.
  -  ``-fsanitize=float-divide-by-zero``: Floating point division by
     zero.
  -  ``-fsanitize=function``: Indirect call of a function through a
     function pointer of the wrong type (Linux, C++ and x86/x86_64 only).
  -  ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
  -  ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function
     parameter which is declared to never be null.
  -  ``-fsanitize=null``: Use of a null pointer or creation of a null
     reference.
  -  ``-fsanitize=object-size``: An attempt to potentially use bytes which
     the optimizer can determine are not part of the object being accessed.
     This will also detect some types of undefined behavior that may not
     directly access memory, but are provably incorrect given the size of
     the objects involved, such as invalid downcasts and calling methods on
     invalid pointers. These checks are made in terms of
     ``__builtin_object_size``, and consequently may be able to detect more
     problems at higher optimization levels.
  -  ``-fsanitize=return``: In C++, reaching the end of a
     value-returning function without returning a value.
  -  ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
     from a function which is declared to never return null.
  -  ``-fsanitize=shift``: Shift operators where the amount shifted is
     greater than or equal to the promoted bit-width of the left hand side
     or less than zero, or where the left hand side is negative. For a
     signed left shift, also checks for signed overflow in C, and for
     unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
     ``-fsanitize=shift-exponent`` to check only the left-hand side or
     the right-hand side of the shift operation, respectively.
  -  ``-fsanitize=signed-integer-overflow``: Signed integer overflow,
     including all the checks added by ``-ftrapv``, and checking for
     overflow in signed division (``INT_MIN / -1``).
  -  ``-fsanitize=unreachable``: If control flow reaches
     ``__builtin_unreachable``.
  -  ``-fsanitize=unsigned-integer-overflow``: Unsigned integer
     overflows.
  -  ``-fsanitize=vla-bound``: A variable-length array whose bound
     does not evaluate to a positive value.
  -  ``-fsanitize=vptr``: Use of an object whose vptr indicates that
     it is of the wrong dynamic type, or that its lifetime has not
     begun or has ended. Incompatible with ``-fno-rtti``. Link must
     be performed by ``clang++``, not ``clang``, to make sure C++-specific
     parts of the runtime library and C++ standard libraries are present.
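
To make a few of these checks concrete, the sketch below pairs three small
functions with the check that would be expected to fire on bad input. The
function names and the ``table`` array are hypothetical and exist only for
this illustration.

.. code-block:: c++

  // -fsanitize=shift: undefined when amount is negative or at least the
  // bit-width of int, or when value is negative.
  int shift_left(int value, int amount) { return value << amount; }

  // -fsanitize=integer-divide-by-zero: undefined when denominator is 0.
  // -fsanitize=signed-integer-overflow also covers INT_MIN / -1.
  int divide(int numerator, int denominator) { return numerator / denominator; }

  // -fsanitize=bounds: the array bound (4) is statically known, so an index
  // outside 0..3 can be checked.
  int element_at(int index) {
    static const int table[4] = {10, 20, 30, 40};
    return table[index];
  }

Inputs such as ``shift_left(1, 3)``, ``divide(7, 2)`` and ``element_at(2)``
are well defined. Inputs such as ``shift_left(1, 32)`` (with a 32-bit
``int``), ``divide(1, 0)`` or ``element_at(4)`` are the cases these checks
target.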

You can also use the following check groups:

  -  ``-fsanitize=undefined``: All of the checks listed above other than
     ``unsigned-integer-overflow``.
  -  ``-fsanitize=undefined-trap``: Deprecated alias of
     ``-fsanitize=undefined``.
  -  ``-fsanitize=integer``: Checks for undefined or suspicious integer
     behavior (e.g. unsigned integer overflow).
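
Unsigned overflow is well defined in C and C++ (the result wraps around), so
it is not undefined behavior. The sketch below shows the kind of wraparound
that would only be reported with ``-fsanitize=integer`` or
``-fsanitize=unsigned-integer-overflow``. ``next_id`` and ``kLargest`` are
hypothetical names used only for this illustration.

.. code-block:: c++

  #include <limits>

  // Largest value representable in an unsigned int.
  const unsigned kLargest = std::numeric_limits<unsigned>::max();

  // Well defined for every input: next_id(kLargest) wraps around to 0.
  // The wraparound is legal but often unintended, which is why it is
  // grouped under "suspicious" integer behavior.
  unsigned next_id(unsigned current) { return current + 1u; }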

Stack traces and report symbolization
=====================================

If you want UBSan to print a symbolized stack trace for each error report, you
will need to:

#. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
   information in your binary.
#. Run your program with the environment variable
   ``UBSAN_OPTIONS=print_stacktrace=1``.
#. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.

Issue Suppression
=================

UndefinedBehaviorSanitizer is not expected to produce false positives.
If you see one, look again; most likely it is a true positive!

Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
----------------------------------------------------------------------------

You can disable UBSan checks for particular functions with
``__attribute__((no_sanitize("undefined")))``. You can use all values of the
``-fsanitize=`` flag in this attribute, e.g. if your function deliberately
contains possible signed integer overflow, you can use
``__attribute__((no_sanitize("signed-integer-overflow")))``.

This attribute may not be supported by other compilers, so consider using it
together with ``#if defined(__clang__)``.
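
One possible way to combine the attribute with the ``__clang__`` guard is a
small macro, sketched below. The macro name
``NO_SANITIZE_UNSIGNED_OVERFLOW`` and the ``fnv1a`` hash function are
hypothetical and chosen for this example. The hash relies on unsigned
wraparound on purpose, so it opts out of the ``unsigned-integer-overflow``
check only.

.. code-block:: c++

  #include <cstdint>
  #include <string>

  // Expands to the attribute under Clang and to nothing elsewhere.
  #if defined(__clang__)
  #define NO_SANITIZE_UNSIGNED_OVERFLOW \
    __attribute__((no_sanitize("unsigned-integer-overflow")))
  #else
  #define NO_SANITIZE_UNSIGNED_OVERFLOW
  #endif

  // 32-bit FNV-1a hash. The multiplication is expected to wrap around.
  NO_SANITIZE_UNSIGNED_OVERFLOW
  std::uint32_t fnv1a(const std::string &text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char byte : text) {
      hash ^= byte;
      hash *= 16777619u;               // FNV prime
    }
    return hash;
  }

Keeping the attribute on the smallest function that needs it leaves the rest
of the program instrumented.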

Suppressing Errors in Recompiled Code (Blacklist)
-------------------------------------------------

UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, which can be used to suppress error reports
in the specified source files or functions.

Runtime suppressions
--------------------

Sometimes you can suppress UBSan error reports for specific files, functions,
or libraries without recompiling the code. You need to pass the path to a
suppression file in the ``UBSAN_OPTIONS`` environment variable.

.. code-block:: bash

    UBSAN_OPTIONS=suppressions=MyUBSan.supp

You need to specify the :ref:`check <ubsan-checks>` you are suppressing and the
bug location. For example:

.. code-block:: bash

  signed-integer-overflow:file-with-known-overflow.cpp
  alignment:function_doing_unaligned_access
  vptr:shared_object_with_vptr_failures.so

There are several limitations:

* Sometimes your binary must have enough debug info and/or symbol table, so
  that the runtime can figure out the source file or function name to match
  against the suppression.
* It is only possible to suppress recoverable checks. For the example above,
  you can additionally pass
  ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although
  most UBSan checks are recoverable by default.
* Check groups (like ``undefined``) can't be used in a suppressions file; only
  fine-grained checks are supported.

Supported Platforms
===================

UndefinedBehaviorSanitizer is supported on the following operating systems:

* Android
* Linux
* FreeBSD
* OS X 10.6 onwards

and for the following architectures:

* i386/x86\_64
* ARM
* AArch64
* PowerPC64
* MIPS/MIPS64

Current Status
==============

UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM
3.3. The test suite is integrated into the CMake build and can be run with
the ``check-ubsan`` command.

Additional Configuration
========================

UndefinedBehaviorSanitizer adds static check data for each check unless it is
in trap mode. This check data includes the full file name. The option
``-fsanitize-undefined-strip-path-components=N`` can be used to trim this
information. If ``N`` is positive, file information emitted by
UndefinedBehaviorSanitizer will drop the first ``N`` components from the file
path. If ``N`` is negative, only the last ``|N|`` components will be kept.

Example
-------

For a file called ``/code/library/file.cpp``, here is what would be emitted:

* Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp``
* ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp``
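
The rule behind these examples can be modeled with a short function. This is
only a sketch that reproduces the table above and is not Clang's
implementation. The name ``strip_path_components`` is hypothetical, and the
clamping for a large ``N`` (always keeping at least the file name) is an
assumption of this sketch.

.. code-block:: c++

  #include <algorithm>
  #include <cstddef>
  #include <string>
  #include <vector>

  std::string strip_path_components(const std::string &path, int n) {
    if (n == 0) return path;

    // Split on '/'. The text before a leading '/' is an empty first
    // component, so "/code/library/file.cpp" has four components.
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;) {
      const std::size_t slash = path.find('/', start);
      if (slash == std::string::npos) {
        parts.push_back(path.substr(start));
        break;
      }
      parts.push_back(path.substr(start, slash - start));
      start = slash + 1;
    }

    long long magnitude = n;
    if (magnitude < 0) magnitude = -magnitude;
    const std::size_t count = parts.size();
    const std::size_t amount = static_cast<std::size_t>(magnitude);

    // Positive n drops leading components (assumption: never the last one).
    // Negative n keeps only the trailing |n| components.
    const std::size_t first = n > 0 ? std::min(amount, count - 1)
                                    : count - std::min(amount, count);

    std::string result = parts[first];
    for (std::size_t i = first + 1; i < count; ++i) result += "/" + parts[i];
    return result;
  }

With this model, ``strip_path_components("/code/library/file.cpp", 2)`` and
``strip_path_components("/code/library/file.cpp", -2)`` both yield
``library/file.cpp``, matching the list above.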

More Information
================

* From LLVM project blog:
  `What Every C Programmer Should Know About Undefined Behavior
  <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
* From John Regehr's *Embedded in Academia* blog:
  `A Guide to Undefined Behavior in C and C++
  <http://blog.regehr.org/archives/213>`_