Demonstrations of cpudist.

This program summarizes task on-CPU time as a histogram, showing how long tasks
spent on the CPU before being descheduled. This provides valuable information
that can indicate oversubscription (too many tasks for too few processors),
overhead due to excessive context switching (e.g. a common shared lock for
multiple threads), uneven workload distribution, too-granular tasks, and more.

By default, CPU idle time is excluded by simply excluding PID 0.
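
If you do want idle time in the summary, the -I switch (documented in the
USAGE message below) keeps PID 0 in the histogram:

# ./cpudist.py -I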

Alternatively, the same options are available for summarizing task off-CPU
time, which helps you understand how often threads are being descheduled and
how long they spend waiting for I/O, locks, timers, and other causes of
suspension.

# ./cpudist.py
Tracing on-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |                                        |
         4 -> 7          : 1        |                                        |
         8 -> 15         : 13       |**                                      |
        16 -> 31         : 187      |****************************************|
        32 -> 63         : 89       |*******************                     |
        64 -> 127        : 26       |*****                                   |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |

This is from a mostly idle system. Tasks wake up occasionally and run for only
a few dozen microseconds, and then get descheduled.

Here's some output from a system that is heavily loaded by threads that perform
computation but also compete for a lock:

# ./cpudist.py
Tracing on-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 51       |*                                       |
         2 -> 3          : 395      |***********                             |
         4 -> 7          : 259      |*******                                 |
         8 -> 15         : 61       |*                                       |
        16 -> 31         : 75       |**                                      |
        32 -> 63         : 31       |                                        |
        64 -> 127        : 7        |                                        |
       128 -> 255        : 5        |                                        |
       256 -> 511        : 3        |                                        |
       512 -> 1023       : 5        |                                        |
      1024 -> 2047       : 6        |                                        |
      2048 -> 4095       : 4        |                                        |
      4096 -> 8191       : 1361     |****************************************|
      8192 -> 16383      : 523      |***************                         |
     16384 -> 32767      : 3        |                                        |

A bimodal distribution is now clearly visible. Most of the time, tasks were
able to run for 4-16ms before being descheduled (this is likely the quantum
length). Occasionally, tasks had to be descheduled a lot earlier -- possibly
because they competed for a shared lock.
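
You can sanity-check the quantum guess against the CFS scheduler's granularity
tunables, if your kernel exposes them. The location is kernel-dependent: older
kernels provide a sysctl, while newer kernels move it under debugfs:

# cat /proc/sys/kernel/sched_min_granularity_ns   # older kernels
# cat /sys/kernel/debug/sched/min_granularity     # newer kernels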

If necessary, you can restrict the output to include only threads from a
particular process -- this helps reduce noise:

# ./cpudist.py -p $(pidof parprimes)
Tracing on-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 3        |                                        |
         2 -> 3          : 17       |                                        |
         4 -> 7          : 39       |                                        |
         8 -> 15         : 52       |*                                       |
        16 -> 31         : 43       |                                        |
        32 -> 63         : 12       |                                        |
        64 -> 127        : 13       |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |
       512 -> 1023       : 11       |                                        |
      1024 -> 2047       : 15       |                                        |
      2048 -> 4095       : 41       |                                        |
      4096 -> 8191       : 1134     |************************                |
      8192 -> 16383      : 1883     |****************************************|
     16384 -> 32767      : 65       |*                                       |

You can also ask for output at predefined intervals, and include timestamps for
easier interpretation. While we're at it, the -P switch will print a histogram
separately for each process:

# ./cpudist.py -TP 5 3
Tracing on-CPU time... Hit Ctrl-C to end.

03:46:51

pid = 0
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |**                                      |
         4 -> 7          : 17       |**********************************      |
         8 -> 15         : 11       |**********************                  |
        16 -> 31         : 20       |****************************************|
        32 -> 63         : 15       |******************************          |
        64 -> 127        : 9        |******************                      |
       128 -> 255        : 6        |************                            |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 1        |**                                      |

pid = 5068
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |*************                           |
         4 -> 7          : 3        |****************************************|
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 1        |*************                           |

03:46:56

pid = 0
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |**                                      |
         4 -> 7          : 19       |****************************************|
         8 -> 15         : 11       |***********************                 |
        16 -> 31         : 9        |******************                      |
        32 -> 63         : 3        |******                                  |
        64 -> 127        : 1        |**                                      |
       128 -> 255        : 3        |******                                  |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 1        |**                                      |

pid = 5068
     usecs               : count     distribution
         0 -> 1          : 1        |********************                    |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 2        |****************************************|

03:47:01

pid = 0
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 12       |********************************        |
         8 -> 15         : 15       |****************************************|
        16 -> 31         : 15       |****************************************|
        32 -> 63         : 0        |                                        |
        64 -> 127        : 3        |********                                |
       128 -> 255        : 1        |**                                      |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 1        |**                                      |

pid = 5068
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |******                                  |
         4 -> 7          : 6        |****************************************|
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 2        |*************                           |

This histogram was obtained while executing `dd if=/dev/zero of=/dev/null` with
fairly large block sizes.
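
The exact dd invocation wasn't captured here, but a large block size would be
passed with dd's bs= option, along these lines:

# dd if=/dev/zero of=/dev/null bs=1M   # bs value is illustrative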

You could also ask for an off-CPU report using the -O switch. Here's a
histogram of task block times while the system is heavily loaded:

# ./cpudist.py -O -p $(pidof parprimes)
Tracing off-CPU time... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 1        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 3        |                                        |
        64 -> 127        : 1        |                                        |
       128 -> 255        : 1        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 2        |                                        |
      1024 -> 2047       : 4        |                                        |
      2048 -> 4095       : 3        |                                        |
      4096 -> 8191       : 70       |***                                     |
      8192 -> 16383      : 867      |****************************************|
     16384 -> 32767      : 141      |******                                  |
     32768 -> 65535      : 8        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 1        |                                        |
    262144 -> 524287     : 2        |                                        |
    524288 -> 1048575    : 3        |                                        |

As you can see, threads are switching out for relatively long intervals, even
though we know the workload doesn't have any significant blocking. This can be
a result of oversubscription -- too many threads contending over too few CPUs.
Indeed, there are four available CPUs and more than four runnable threads:

# nproc
4
# cat /proc/loadavg
0.04 0.11 0.06 9/147 7494

(This shows we have 9 runnable threads out of 147 total, which is more than 4,
the number of available CPUs.)
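
One way to cross-check the runnable count is to list per-thread scheduler
states directly (a state starting with R means running or runnable):

# ps -eLo stat= | grep -c '^R'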

Finally, let's ask for a per-thread report and values in milliseconds instead
of microseconds:

# ./cpudist.py -p $(pidof parprimes) -mL
Tracing on-CPU time... Hit Ctrl-C to end.


tid = 5092
     msecs               : count     distribution
         0 -> 1          : 3        |                                        |
         2 -> 3          : 4        |                                        |
         4 -> 7          : 4        |                                        |
         8 -> 15         : 535      |****************************************|
        16 -> 31         : 14       |*                                       |

tid = 5093
     msecs               : count     distribution
         0 -> 1          : 8        |                                        |
         2 -> 3          : 6        |                                        |
         4 -> 7          : 4        |                                        |
         8 -> 15         : 534      |****************************************|
        16 -> 31         : 12       |                                        |

tid = 5094
     msecs               : count     distribution
         0 -> 1          : 38       |***                                     |
         2 -> 3          : 5        |                                        |
         4 -> 7          : 5        |                                        |
         8 -> 15         : 476      |****************************************|
        16 -> 31         : 25       |**                                      |

tid = 5095
     msecs               : count     distribution
         0 -> 1          : 31       |**                                      |
         2 -> 3          : 6        |                                        |
         4 -> 7          : 10       |                                        |
         8 -> 15         : 478      |****************************************|
        16 -> 31         : 20       |*                                       |

tid = 5096
     msecs               : count     distribution
         0 -> 1          : 21       |*                                       |
         2 -> 3          : 5        |                                        |
         4 -> 7          : 4        |                                        |
         8 -> 15         : 523      |****************************************|
        16 -> 31         : 16       |*                                       |

tid = 5097
     msecs               : count     distribution
         0 -> 1          : 11       |                                        |
         2 -> 3          : 7        |                                        |
         4 -> 7          : 7        |                                        |
         8 -> 15         : 502      |****************************************|
        16 -> 31         : 23       |*                                       |

tid = 5098
     msecs               : count     distribution
         0 -> 1          : 21       |*                                       |
         2 -> 3          : 5        |                                        |
         4 -> 7          : 3        |                                        |
         8 -> 15         : 494      |****************************************|
        16 -> 31         : 28       |**                                      |

tid = 5099
     msecs               : count     distribution
         0 -> 1          : 15       |*                                       |
         2 -> 3          : 4        |                                        |
         4 -> 7          : 6        |                                        |
         8 -> 15         : 521      |****************************************|
        16 -> 31         : 12       |                                        |

It looks like all threads are more or less equally busy, and are typically
switched out after running for 8-15 milliseconds (again, this is likely the
quantum length).


USAGE message:

# ./cpudist.py -h

usage: cpudist.py [-h] [-O] [-T] [-m] [-P] [-L] [-p PID] [-I] [-e]
                  [interval] [count]

Summarize on- and off-CPU time per task as a histogram.

positional arguments:
  interval            output interval, in seconds
  count               number of outputs

optional arguments:
  -h, --help          show this help message and exit
  -O, --offcpu        measure off-CPU time
  -T, --timestamp     include timestamp on output
  -m, --milliseconds  millisecond histogram
  -P, --pids          print a histogram per process ID
  -L, --tids          print a histogram per thread ID
  -p PID, --pid PID   trace this PID only
  -I, --include-idle  include CPU idle time
  -e, --extension     show extension summary (average/total/count)

examples:
    cpudist              # summarize on-CPU time as a histogram
    cpudist -O           # summarize off-CPU time as a histogram
    cpudist 1 10         # print 1 second summaries, 10 times
    cpudist -mT 1        # 1s summaries, milliseconds, and timestamps
    cpudist -P           # show each PID separately
    cpudist -p 185       # trace PID 185 only
    cpudist -I           # include CPU idle time
    cpudist -e           # show extension summary (average/total/count)
