xref: /aosp_15_r20/system/extras/simpleperf/doc/inferno.md (revision 288bf5226967eb3dac5cce6c939ccc2a7f2b4fe5)
# Inferno

![logo](./inferno_small.png)

[TOC]

## Description

Inferno is a flamegraph generator for native (C/C++) Android apps. It was
originally written to profile and improve the performance of surfaceflinger
(the Android compositor), but it can be used for any native Android
application. You can see a sample report generated with Inferno
[here](./report.html). Reports are self-contained HTML files, so they can be
exchanged easily.

Note that there is no concept of time in a flamegraph, since all callstacks are
merged together. As a result, the width of a flamegraph represents 100% of
the samples, and the height reflects the number of functions on
the stack when sampling occurred.

![flamegraph sample](./main_thread_flamegraph.png)

In the flamegraph featured above you can see the main thread of SurfaceFlinger.
It is immediately apparent that most of the CPU time is spent processing messages
in `android::SurfaceFlinger::onMessageReceived`. The most expensive task is asking
the screen to be refreshed, as `android::DisplayDevice::prepare` shows in orange.
This graphic division helps to see which parts of the program are costly and
where a developer's effort to improve performance should go.

## Example of bottleneck

A flamegraph gives you an instant view of where the CPU cycles are spent, but
it can also be used to find specific offenders. To find them, look for
plateaus. It is easier to see with an example:

![flamegraph sample](./bottleneck.png)

In the previous flamegraph, two
plateaus (due to `android::BufferQueueCore::validateConsistencyLocked`)
are immediately apparent.
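
A plateau is a wide frame with little or nothing stacked above it, which usually
means the samples ended in that function itself. The following is a minimal
sketch of that idea, not Inferno's actual code: it assumes callstacks are
available as root-first lists of function names, and the sample data and the
helper name `self_samples` are hypothetical.

```python
from collections import Counter


def self_samples(callstacks):
    """Count, per function, the samples in which it was the innermost frame."""
    counts = Counter()
    for stack in callstacks:
        if stack:  # ignore empty stacks
            counts[stack[-1]] += 1
    return counts


# Hypothetical samples: each list is one callstack, root first.
stacks = [
    ["main", "render", "validate"],
    ["main", "render", "validate"],
    ["main", "render", "validate"],
    ["main", "render", "compose"],
    ["main", "poll"],
]

plateaus = self_samples(stacks)
for name, count in plateaus.most_common():
    print(f"{name}: {count} of {len(stacks)} samples ended here")
```

Functions at the top of this ranking are the ones that would show up as plateaus in the rendered graph.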

## How it works

Inferno relies on simpleperf to record the callstack of a native application
thousands of times per second. Simpleperf takes care of unwinding the stack
using either frame pointers (recommended) or DWARF. At the end of the recording,
`simpleperf` also symbolizes all IPs automatically. The records are aggregated and
dumped to a file, `perf.data`. This file is pulled from the Android device
and processed on the host by Inferno. The callstacks are merged together to
visualize in which part of an app the CPU cycles are spent.
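
The merge step can be pictured as building a tree keyed by function name, where
each node counts the samples passing through it. This is only an illustrative
sketch under the assumption that callstacks are root-first lists of symbol
names; the `Node` class, `merge_callstacks`, `widths`, and the sample data are
hypothetical and are not taken from Inferno's source.

```python
class Node:
    """One frame in the merged call tree."""

    def __init__(self, name):
        self.name = name
        self.samples = 0    # samples whose stack passes through this frame
        self.children = {}  # callee name -> Node

    def child(self, name):
        if name not in self.children:
            self.children[name] = Node(name)
        return self.children[name]


def merge_callstacks(callstacks):
    """Merge root-first callstacks into a single tree with sample counts."""
    root = Node("<all samples>")
    for stack in callstacks:
        root.samples += 1
        node = root
        for frame in stack:
            node = node.child(frame)
            node.samples += 1
    return root


def widths(node, total, depth=0):
    """Yield (depth, name, width in percent of all samples) for each frame."""
    yield depth, node.name, 100.0 * node.samples / total
    for callee in sorted(node.children.values(), key=lambda n: n.name):
        yield from widths(callee, total, depth + 1)


# Hypothetical samples: each list is one callstack, root first.
stacks = [
    ["_start", "main", "draw"],
    ["_start", "main", "draw"],
    ["_start", "main", "update"],
    ["_start", "main"],
]

tree = merge_callstacks(stacks)
rows = list(widths(tree, tree.samples))
for depth, name, percent in rows:
    print(f"{'  ' * depth}{name}: {percent:.0f}%")
```

The order in which samples arrive does not affect the result, which is why the graph carries no notion of time: width encodes the sample count and depth encodes the stack depth.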

## How to use it

Open a terminal and, from the `simpleperf/scripts` directory, type:
```
./inferno.sh  (on Linux/Mac)
inferno.bat (on Windows)
```

Inferno will collect data, process it, and automatically open your web browser
to display the HTML report.

## Parameters

You can select how long to sample for, the color of the nodes, and many other
things. Use `-h` to get a list of all supported parameters.

```
./inferno.sh -h
```

## Troubleshooting

### Messy flame graph

A healthy flame graph features a single call site at its base (see [here](./report.html)).
If you don't see a unique call site like `_start` or `_start_thread` at the base
from which all flames originate, something went wrong: stack unwinding may
have failed to reach the root call site. These incomplete
callstacks are impossible to merge properly. By default Inferno asks
`simpleperf` to unwind the stack via the kernel and frame pointers. Try
unwinding with DWARF instead (`-du`); you can further tune this setting.
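
To see why truncated stacks produce a messy graph, consider counting the
distinct root frames among the callstacks of a single thread. This is a
hypothetical sketch, not a feature of Inferno or simpleperf: the helpers
`root_frames` and `looks_fully_unwound` and the sample data are assumptions
made for illustration.

```python
from collections import Counter


def root_frames(callstacks):
    """Count how many stacks start from each outermost frame."""
    return Counter(stack[0] for stack in callstacks if stack)


def looks_fully_unwound(callstacks):
    """True when every non-empty stack starts from the same root frame."""
    return len(root_frames(callstacks)) <= 1


# Hypothetical samples for one thread: each list is one callstack, root first.
complete = [
    ["_start", "main", "draw"],
    ["_start", "main", "update"],
]
# Same thread, but unwinding stopped early for two of the samples.
truncated = [
    ["_start", "main", "draw"],
    ["draw"],
    ["update", "memcpy"],
]

print(looks_fully_unwound(complete))
print(looks_fully_unwound(truncated))
print(root_frames(truncated))
```

Each extra root becomes its own disconnected flame at the base of the graph, so samples that belong to the same call path can no longer be merged.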

### No flames

If you see no flames at all, or a mess of one-level flames without a common base,
this may be because you compiled without frame pointers. Make sure there is no
`-fomit-frame-pointer` in your build config. Alternatively, ask simpleperf to
collect data with DWARF unwinding (`-du`).

### High percentage of lost samples

If simpleperf reports a lot of lost samples, it is probably because you are
unwinding with DWARF. DWARF unwinding involves copying the stack before it is
processed. Try frame pointer unwinding instead, which can be done by the kernel
and is much faster.

The cost of frame pointers is negligible on arm64 but considerable
on 32-bit arm (due to register pressure). Use a 64-bit build for better
profiling.

### run-as: package not debuggable

If you cannot run as root, make sure the app is debuggable; otherwise simpleperf
will not be able to profile it.