1*387f9dfdSAndroid Build Coastguard Worker# bcc Tutorial 2*387f9dfdSAndroid Build Coastguard Worker 3*387f9dfdSAndroid Build Coastguard WorkerThis tutorial covers how to use [bcc](https://github.com/iovisor/bcc) tools to quickly solve performance, troubleshooting, and networking issues. If you want to develop new bcc tools, see [tutorial_bcc_python_developer.md](tutorial_bcc_python_developer.md) for that tutorial. 4*387f9dfdSAndroid Build Coastguard Worker 5*387f9dfdSAndroid Build Coastguard WorkerIt is assumed for this tutorial that bcc is already installed, and you can run tools like execsnoop successfully. See [INSTALL.md](../INSTALL.md). This uses enhancements added to the Linux 4.x series. 6*387f9dfdSAndroid Build Coastguard Worker 7*387f9dfdSAndroid Build Coastguard Worker## Observability 8*387f9dfdSAndroid Build Coastguard Worker 9*387f9dfdSAndroid Build Coastguard WorkerSome quick wins. 10*387f9dfdSAndroid Build Coastguard Worker 11*387f9dfdSAndroid Build Coastguard Worker### 0. Before bcc 12*387f9dfdSAndroid Build Coastguard Worker 13*387f9dfdSAndroid Build Coastguard WorkerBefore using bcc, you should start with the Linux basics. One reference is the [Linux Performance Analysis in 60,000 Milliseconds](https://netflixtechblog.com/linux-performance-analysis-in-60-000-milliseconds-accc10403c55) post, which covers these commands: 14*387f9dfdSAndroid Build Coastguard Worker 15*387f9dfdSAndroid Build Coastguard Worker1. uptime 16*387f9dfdSAndroid Build Coastguard Worker1. dmesg | tail 17*387f9dfdSAndroid Build Coastguard Worker1. vmstat 1 18*387f9dfdSAndroid Build Coastguard Worker1. mpstat -P ALL 1 19*387f9dfdSAndroid Build Coastguard Worker1. pidstat 1 20*387f9dfdSAndroid Build Coastguard Worker1. iostat -xz 1 21*387f9dfdSAndroid Build Coastguard Worker1. free -m 22*387f9dfdSAndroid Build Coastguard Worker1. sar -n DEV 1 23*387f9dfdSAndroid Build Coastguard Worker1. sar -n TCP,ETCP 1 24*387f9dfdSAndroid Build Coastguard Worker1. top 25*387f9dfdSAndroid Build Coastguard Worker 26*387f9dfdSAndroid Build Coastguard Worker### 1. General Performance 27*387f9dfdSAndroid Build Coastguard Worker 28*387f9dfdSAndroid Build Coastguard WorkerHere is a generic checklist for performance investigations with bcc, first as a list, then in detail: 29*387f9dfdSAndroid Build Coastguard Worker 30*387f9dfdSAndroid Build Coastguard Worker1. execsnoop 31*387f9dfdSAndroid Build Coastguard Worker1. opensnoop 32*387f9dfdSAndroid Build Coastguard Worker1. ext4slower (or btrfs\*, xfs\*, zfs\*) 33*387f9dfdSAndroid Build Coastguard Worker1. biolatency 34*387f9dfdSAndroid Build Coastguard Worker1. biosnoop 35*387f9dfdSAndroid Build Coastguard Worker1. cachestat 36*387f9dfdSAndroid Build Coastguard Worker1. tcpconnect 37*387f9dfdSAndroid Build Coastguard Worker1. tcpaccept 38*387f9dfdSAndroid Build Coastguard Worker1. tcpretrans 39*387f9dfdSAndroid Build Coastguard Worker1. runqlat 40*387f9dfdSAndroid Build Coastguard Worker1. profile 41*387f9dfdSAndroid Build Coastguard Worker 42*387f9dfdSAndroid Build Coastguard WorkerThese tools may be installed on your system under /usr/share/bcc/tools, or you can run them from the bcc github repo under /tools where they have a .py extension. Browse the 50+ tools available for more analysis options. 43*387f9dfdSAndroid Build Coastguard Worker 44*387f9dfdSAndroid Build Coastguard Worker#### 1.1 execsnoop 45*387f9dfdSAndroid Build Coastguard Worker 46*387f9dfdSAndroid Build Coastguard Worker``` 47*387f9dfdSAndroid Build Coastguard Worker# ./execsnoop 48*387f9dfdSAndroid Build Coastguard WorkerPCOMM PID RET ARGS 49*387f9dfdSAndroid Build Coastguard Workersupervise 9660 0 ./run 50*387f9dfdSAndroid Build Coastguard Workersupervise 9661 0 ./run 51*387f9dfdSAndroid Build Coastguard Workermkdir 9662 0 /bin/mkdir -p ./main 52*387f9dfdSAndroid Build Coastguard Workerrun 9663 0 ./run 53*387f9dfdSAndroid Build Coastguard Worker[...] 54*387f9dfdSAndroid Build Coastguard Worker``` 55*387f9dfdSAndroid Build Coastguard Worker 56*387f9dfdSAndroid Build Coastguard Workerexecsnoop prints one line of output for each new process. Check for short-lived processes. These can consume CPU resources, but not show up in most monitoring tools that periodically take snapshots of which processes are running. 57*387f9dfdSAndroid Build Coastguard Worker 58*387f9dfdSAndroid Build Coastguard WorkerIt works by tracing exec(), not the fork(), so it will catch many types of new processes but not all (eg, it won't see an application launching working processes, that doesn't exec() anything else). 59*387f9dfdSAndroid Build Coastguard Worker 60*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/execsnoop_example.txt). 61*387f9dfdSAndroid Build Coastguard Worker 62*387f9dfdSAndroid Build Coastguard Worker#### 1.2. opensnoop 63*387f9dfdSAndroid Build Coastguard Worker 64*387f9dfdSAndroid Build Coastguard Worker``` 65*387f9dfdSAndroid Build Coastguard Worker# ./opensnoop 66*387f9dfdSAndroid Build Coastguard WorkerPID COMM FD ERR PATH 67*387f9dfdSAndroid Build Coastguard Worker1565 redis-server 5 0 /proc/1565/stat 68*387f9dfdSAndroid Build Coastguard Worker1565 redis-server 5 0 /proc/1565/stat 69*387f9dfdSAndroid Build Coastguard Worker1565 redis-server 5 0 /proc/1565/stat 70*387f9dfdSAndroid Build Coastguard Worker1603 snmpd 9 0 /proc/net/dev 71*387f9dfdSAndroid Build Coastguard Worker1603 snmpd 11 0 /proc/net/if_inet6 72*387f9dfdSAndroid Build Coastguard Worker1603 snmpd -1 2 /sys/class/net/eth0/device/vendor 73*387f9dfdSAndroid Build Coastguard Worker1603 snmpd 11 0 /proc/sys/net/ipv4/neigh/eth0/retrans_time_ms 74*387f9dfdSAndroid Build Coastguard Worker1603 snmpd 11 0 /proc/sys/net/ipv6/neigh/eth0/retrans_time_ms 75*387f9dfdSAndroid Build Coastguard Worker1603 snmpd 11 0 /proc/sys/net/ipv6/conf/eth0/forwarding 76*387f9dfdSAndroid Build Coastguard Worker[...] 77*387f9dfdSAndroid Build Coastguard Worker``` 78*387f9dfdSAndroid Build Coastguard Worker 79*387f9dfdSAndroid Build Coastguard Workeropensnoop prints one line of output for each open() syscall, including details. 80*387f9dfdSAndroid Build Coastguard Worker 81*387f9dfdSAndroid Build Coastguard WorkerFiles that are opened can tell you a lot about how applications work: identifying their data files, config files, and log files. Sometimes applications can misbehave, and perform poorly, when they are constantly attempting to read files that do not exist. opensnoop gives you a quick look. 82*387f9dfdSAndroid Build Coastguard Worker 83*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/opensnoop_example.txt). 84*387f9dfdSAndroid Build Coastguard Worker 85*387f9dfdSAndroid Build Coastguard Worker#### 1.3. ext4slower (or btrfs\*, xfs\*, zfs\*) 86*387f9dfdSAndroid Build Coastguard Worker 87*387f9dfdSAndroid Build Coastguard Worker``` 88*387f9dfdSAndroid Build Coastguard Worker# ./ext4slower 89*387f9dfdSAndroid Build Coastguard WorkerTracing ext4 operations slower than 10 ms 90*387f9dfdSAndroid Build Coastguard WorkerTIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME 91*387f9dfdSAndroid Build Coastguard Worker06:35:01 cron 16464 R 1249 0 16.05 common-auth 92*387f9dfdSAndroid Build Coastguard Worker06:35:01 cron 16463 R 1249 0 16.04 common-auth 93*387f9dfdSAndroid Build Coastguard Worker06:35:01 cron 16465 R 1249 0 16.03 common-auth 94*387f9dfdSAndroid Build Coastguard Worker06:35:01 cron 16465 R 4096 0 10.62 login.defs 95*387f9dfdSAndroid Build Coastguard Worker06:35:01 cron 16464 R 4096 0 10.61 login.defs 96*387f9dfdSAndroid Build Coastguard Worker``` 97*387f9dfdSAndroid Build Coastguard Worker 98*387f9dfdSAndroid Build Coastguard Workerext4slower traces the ext4 file system and times common operations, and then only prints those that exceed a threshold. 99*387f9dfdSAndroid Build Coastguard Worker 100*387f9dfdSAndroid Build Coastguard WorkerThis is great for identifying or exonerating one type of performance issue: show individually slow disk i/O via the file system. Disks process I/O asynchronously, and it can be difficult to associate latency at that layer with the latency applications experience. Tracing higher up in the kernel stack, at the VFS -> file system interface, will more closely match what an application suffers. Use this tool to identify if file system latency exceeds a given threshold. 101*387f9dfdSAndroid Build Coastguard Worker 102*387f9dfdSAndroid Build Coastguard WorkerSimilar tools exist in bcc for other file systems: btrfsslower, xfsslower, and zfsslower. There is also fileslower, which works at the VFS layer and traces everything (although at some higher overhead). 103*387f9dfdSAndroid Build Coastguard Worker 104*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/ext4slower_example.txt). 105*387f9dfdSAndroid Build Coastguard Worker 106*387f9dfdSAndroid Build Coastguard Worker#### 1.4. biolatency 107*387f9dfdSAndroid Build Coastguard Worker 108*387f9dfdSAndroid Build Coastguard Worker``` 109*387f9dfdSAndroid Build Coastguard Worker# ./biolatency 110*387f9dfdSAndroid Build Coastguard WorkerTracing block device I/O... Hit Ctrl-C to end. 111*387f9dfdSAndroid Build Coastguard Worker^C 112*387f9dfdSAndroid Build Coastguard Worker usecs : count distribution 113*387f9dfdSAndroid Build Coastguard Worker 0 -> 1 : 0 | | 114*387f9dfdSAndroid Build Coastguard Worker 2 -> 3 : 0 | | 115*387f9dfdSAndroid Build Coastguard Worker 4 -> 7 : 0 | | 116*387f9dfdSAndroid Build Coastguard Worker 8 -> 15 : 0 | | 117*387f9dfdSAndroid Build Coastguard Worker 16 -> 31 : 0 | | 118*387f9dfdSAndroid Build Coastguard Worker 32 -> 63 : 0 | | 119*387f9dfdSAndroid Build Coastguard Worker 64 -> 127 : 1 | | 120*387f9dfdSAndroid Build Coastguard Worker 128 -> 255 : 12 |******** | 121*387f9dfdSAndroid Build Coastguard Worker 256 -> 511 : 15 |********** | 122*387f9dfdSAndroid Build Coastguard Worker 512 -> 1023 : 43 |******************************* | 123*387f9dfdSAndroid Build Coastguard Worker 1024 -> 2047 : 52 |**************************************| 124*387f9dfdSAndroid Build Coastguard Worker 2048 -> 4095 : 47 |********************************** | 125*387f9dfdSAndroid Build Coastguard Worker 4096 -> 8191 : 52 |**************************************| 126*387f9dfdSAndroid Build Coastguard Worker 8192 -> 16383 : 36 |************************** | 127*387f9dfdSAndroid Build Coastguard Worker 16384 -> 32767 : 15 |********** | 128*387f9dfdSAndroid Build Coastguard Worker 32768 -> 65535 : 2 |* | 129*387f9dfdSAndroid Build Coastguard Worker 65536 -> 131071 : 2 |* | 130*387f9dfdSAndroid Build Coastguard Worker``` 131*387f9dfdSAndroid Build Coastguard Worker 132*387f9dfdSAndroid Build Coastguard Workerbiolatency traces disk I/O latency (time from device issue to completion), and when the tool ends (Ctrl-C, or a given interval), it prints a histogram summary of the latency. 133*387f9dfdSAndroid Build Coastguard Worker 134*387f9dfdSAndroid Build Coastguard WorkerThis is great for understanding disk I/O latency beyond the average times given by tools like iostat. I/O latency outliers will be visible at the end of the distribution, as well as multi-mode distributions. 135*387f9dfdSAndroid Build Coastguard Worker 136*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/biolatency_example.txt). 137*387f9dfdSAndroid Build Coastguard Worker 138*387f9dfdSAndroid Build Coastguard Worker#### 1.5. biosnoop 139*387f9dfdSAndroid Build Coastguard Worker 140*387f9dfdSAndroid Build Coastguard Worker``` 141*387f9dfdSAndroid Build Coastguard Worker# ./biosnoop 142*387f9dfdSAndroid Build Coastguard WorkerTIME(s) COMM PID DISK T SECTOR BYTES LAT(ms) 143*387f9dfdSAndroid Build Coastguard Worker0.000004001 supervise 1950 xvda1 W 13092560 4096 0.74 144*387f9dfdSAndroid Build Coastguard Worker0.000178002 supervise 1950 xvda1 W 13092432 4096 0.61 145*387f9dfdSAndroid Build Coastguard Worker0.001469001 supervise 1956 xvda1 W 13092440 4096 1.24 146*387f9dfdSAndroid Build Coastguard Worker0.001588002 supervise 1956 xvda1 W 13115128 4096 1.09 147*387f9dfdSAndroid Build Coastguard Worker1.022346001 supervise 1950 xvda1 W 13115272 4096 0.98 148*387f9dfdSAndroid Build Coastguard Worker1.022568002 supervise 1950 xvda1 W 13188496 4096 0.93 149*387f9dfdSAndroid Build Coastguard Worker[...] 150*387f9dfdSAndroid Build Coastguard Worker``` 151*387f9dfdSAndroid Build Coastguard Worker 152*387f9dfdSAndroid Build Coastguard Workerbiosnoop prints a line of output for each disk I/O, with details including latency (time from device issue to completion). 153*387f9dfdSAndroid Build Coastguard Worker 154*387f9dfdSAndroid Build Coastguard WorkerThis allows you to examine disk I/O in more detail, and look for time-ordered patterns (eg, reads queueing behind writes). Note that the output will be verbose if your system performs disk I/O at a high rate. 155*387f9dfdSAndroid Build Coastguard Worker 156*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/biosnoop_example.txt). 157*387f9dfdSAndroid Build Coastguard Worker 158*387f9dfdSAndroid Build Coastguard Worker#### 1.6. cachestat 159*387f9dfdSAndroid Build Coastguard Worker 160*387f9dfdSAndroid Build Coastguard Worker``` 161*387f9dfdSAndroid Build Coastguard Worker# ./cachestat 162*387f9dfdSAndroid Build Coastguard Worker HITS MISSES DIRTIES READ_HIT% WRITE_HIT% BUFFERS_MB CACHED_MB 163*387f9dfdSAndroid Build Coastguard Worker 1074 44 13 94.9% 2.9% 1 223 164*387f9dfdSAndroid Build Coastguard Worker 2195 170 8 92.5% 6.8% 1 143 165*387f9dfdSAndroid Build Coastguard Worker 182 53 56 53.6% 1.3% 1 143 166*387f9dfdSAndroid Build Coastguard Worker 62480 40960 20480 40.6% 19.8% 1 223 167*387f9dfdSAndroid Build Coastguard Worker 7 2 5 22.2% 22.2% 1 223 168*387f9dfdSAndroid Build Coastguard Worker 348 0 0 100.0% 0.0% 1 223 169*387f9dfdSAndroid Build Coastguard Worker[...] 170*387f9dfdSAndroid Build Coastguard Worker``` 171*387f9dfdSAndroid Build Coastguard Worker 172*387f9dfdSAndroid Build Coastguard Workercachestat prints a one line summary every second (or every custom interval) showing statistics from the file system cache. 173*387f9dfdSAndroid Build Coastguard Worker 174*387f9dfdSAndroid Build Coastguard WorkerUse this to identify a low cache hit ratio, and a high rate of misses: which gives one lead for performance tuning. 175*387f9dfdSAndroid Build Coastguard Worker 176*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/cachestat_example.txt). 177*387f9dfdSAndroid Build Coastguard Worker 178*387f9dfdSAndroid Build Coastguard Worker#### 1.7. tcpconnect 179*387f9dfdSAndroid Build Coastguard Worker 180*387f9dfdSAndroid Build Coastguard Worker``` 181*387f9dfdSAndroid Build Coastguard Worker# ./tcpconnect 182*387f9dfdSAndroid Build Coastguard WorkerPID COMM IP SADDR DADDR DPORT 183*387f9dfdSAndroid Build Coastguard Worker1479 telnet 4 127.0.0.1 127.0.0.1 23 184*387f9dfdSAndroid Build Coastguard Worker1469 curl 4 10.201.219.236 54.245.105.25 80 185*387f9dfdSAndroid Build Coastguard Worker1469 curl 4 10.201.219.236 54.67.101.145 80 186*387f9dfdSAndroid Build Coastguard Worker1991 telnet 6 ::1 ::1 23 187*387f9dfdSAndroid Build Coastguard Worker2015 ssh 6 fe80::2000:bff:fe82:3ac fe80::2000:bff:fe82:3ac 22 188*387f9dfdSAndroid Build Coastguard Worker[...] 189*387f9dfdSAndroid Build Coastguard Worker``` 190*387f9dfdSAndroid Build Coastguard Worker 191*387f9dfdSAndroid Build Coastguard Workertcpconnect prints one line of output for every active TCP connection (eg, via connect()), with details including source and destination addresses. 192*387f9dfdSAndroid Build Coastguard Worker 193*387f9dfdSAndroid Build Coastguard WorkerLook for unexpected connections that may point to inefficiencies in application configuration, or an intruder. 194*387f9dfdSAndroid Build Coastguard Worker 195*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/tcpconnect_example.txt). 196*387f9dfdSAndroid Build Coastguard Worker 197*387f9dfdSAndroid Build Coastguard Worker#### 1.8. tcpaccept 198*387f9dfdSAndroid Build Coastguard Worker 199*387f9dfdSAndroid Build Coastguard Worker``` 200*387f9dfdSAndroid Build Coastguard Worker# ./tcpaccept 201*387f9dfdSAndroid Build Coastguard WorkerPID COMM IP RADDR LADDR LPORT 202*387f9dfdSAndroid Build Coastguard Worker907 sshd 4 192.168.56.1 192.168.56.102 22 203*387f9dfdSAndroid Build Coastguard Worker907 sshd 4 127.0.0.1 127.0.0.1 22 204*387f9dfdSAndroid Build Coastguard Worker5389 perl 6 1234:ab12:2040:5020:2299:0:5:0 1234:ab12:2040:5020:2299:0:5:0 7001 205*387f9dfdSAndroid Build Coastguard Worker[...] 206*387f9dfdSAndroid Build Coastguard Worker``` 207*387f9dfdSAndroid Build Coastguard Worker 208*387f9dfdSAndroid Build Coastguard Workertcpaccept prints one line of output for every passive TCP connection (eg, via accept()), with details including source and destination addresses. 209*387f9dfdSAndroid Build Coastguard Worker 210*387f9dfdSAndroid Build Coastguard WorkerLook for unexpected connections that may point to inefficiencies in application configuration, or an intruder. 211*387f9dfdSAndroid Build Coastguard Worker 212*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/tcpaccept_example.txt). 213*387f9dfdSAndroid Build Coastguard Worker 214*387f9dfdSAndroid Build Coastguard Worker#### 1.9. tcpretrans 215*387f9dfdSAndroid Build Coastguard Worker 216*387f9dfdSAndroid Build Coastguard Worker``` 217*387f9dfdSAndroid Build Coastguard Worker# ./tcpretrans 218*387f9dfdSAndroid Build Coastguard WorkerTIME PID IP LADDR:LPORT T> RADDR:RPORT STATE 219*387f9dfdSAndroid Build Coastguard Worker01:55:05 0 4 10.153.223.157:22 R> 69.53.245.40:34619 ESTABLISHED 220*387f9dfdSAndroid Build Coastguard Worker01:55:05 0 4 10.153.223.157:22 R> 69.53.245.40:34619 ESTABLISHED 221*387f9dfdSAndroid Build Coastguard Worker01:55:17 0 4 10.153.223.157:22 R> 69.53.245.40:22957 ESTABLISHED 222*387f9dfdSAndroid Build Coastguard Worker[...] 223*387f9dfdSAndroid Build Coastguard Worker``` 224*387f9dfdSAndroid Build Coastguard Worker 225*387f9dfdSAndroid Build Coastguard Workertcprerans prints one line of output for every TCP retransmit packet, with details including source and destination addresses, and kernel state of the TCP connection. 226*387f9dfdSAndroid Build Coastguard Worker 227*387f9dfdSAndroid Build Coastguard WorkerTCP retransmissions cause latency and throughput issues. For ESTABLISHED retransmits, look for patterns with networks. For SYN_SENT, this may point to target kernel CPU saturation and kernel packet drops. 228*387f9dfdSAndroid Build Coastguard Worker 229*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/tcpretrans_example.txt). 230*387f9dfdSAndroid Build Coastguard Worker 231*387f9dfdSAndroid Build Coastguard Worker#### 1.10. runqlat 232*387f9dfdSAndroid Build Coastguard Worker 233*387f9dfdSAndroid Build Coastguard Worker``` 234*387f9dfdSAndroid Build Coastguard Worker# ./runqlat 235*387f9dfdSAndroid Build Coastguard WorkerTracing run queue latency... Hit Ctrl-C to end. 236*387f9dfdSAndroid Build Coastguard Worker^C 237*387f9dfdSAndroid Build Coastguard Worker usecs : count distribution 238*387f9dfdSAndroid Build Coastguard Worker 0 -> 1 : 233 |*********** | 239*387f9dfdSAndroid Build Coastguard Worker 2 -> 3 : 742 |************************************ | 240*387f9dfdSAndroid Build Coastguard Worker 4 -> 7 : 203 |********** | 241*387f9dfdSAndroid Build Coastguard Worker 8 -> 15 : 173 |******** | 242*387f9dfdSAndroid Build Coastguard Worker 16 -> 31 : 24 |* | 243*387f9dfdSAndroid Build Coastguard Worker 32 -> 63 : 0 | | 244*387f9dfdSAndroid Build Coastguard Worker 64 -> 127 : 30 |* | 245*387f9dfdSAndroid Build Coastguard Worker 128 -> 255 : 6 | | 246*387f9dfdSAndroid Build Coastguard Worker 256 -> 511 : 3 | | 247*387f9dfdSAndroid Build Coastguard Worker 512 -> 1023 : 5 | | 248*387f9dfdSAndroid Build Coastguard Worker 1024 -> 2047 : 27 |* | 249*387f9dfdSAndroid Build Coastguard Worker 2048 -> 4095 : 30 |* | 250*387f9dfdSAndroid Build Coastguard Worker 4096 -> 8191 : 20 | | 251*387f9dfdSAndroid Build Coastguard Worker 8192 -> 16383 : 29 |* | 252*387f9dfdSAndroid Build Coastguard Worker 16384 -> 32767 : 809 |****************************************| 253*387f9dfdSAndroid Build Coastguard Worker 32768 -> 65535 : 64 |*** | 254*387f9dfdSAndroid Build Coastguard Worker``` 255*387f9dfdSAndroid Build Coastguard Worker 256*387f9dfdSAndroid Build Coastguard Workerrunqlat times how long threads were waiting on the CPU run queues, and prints this as a histogram. 257*387f9dfdSAndroid Build Coastguard Worker 258*387f9dfdSAndroid Build Coastguard WorkerThis can help quantify time lost waiting for a turn on CPU, during periods of CPU saturation. 259*387f9dfdSAndroid Build Coastguard Worker 260*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/runqlat_example.txt). 261*387f9dfdSAndroid Build Coastguard Worker 262*387f9dfdSAndroid Build Coastguard Worker#### 1.11. profile 263*387f9dfdSAndroid Build Coastguard Worker 264*387f9dfdSAndroid Build Coastguard Worker``` 265*387f9dfdSAndroid Build Coastguard Worker# ./profile 266*387f9dfdSAndroid Build Coastguard WorkerSampling at 49 Hertz of all threads by user + kernel stack... Hit Ctrl-C to end. 267*387f9dfdSAndroid Build Coastguard Worker^C 268*387f9dfdSAndroid Build Coastguard Worker 00007f31d76c3251 [unknown] 269*387f9dfdSAndroid Build Coastguard Worker 47a2c1e752bf47f7 [unknown] 270*387f9dfdSAndroid Build Coastguard Worker - sign-file (8877) 271*387f9dfdSAndroid Build Coastguard Worker 1 272*387f9dfdSAndroid Build Coastguard Worker 273*387f9dfdSAndroid Build Coastguard Worker ffffffff813d0af8 __clear_user 274*387f9dfdSAndroid Build Coastguard Worker ffffffff813d5277 iov_iter_zero 275*387f9dfdSAndroid Build Coastguard Worker ffffffff814ec5f2 read_iter_zero 276*387f9dfdSAndroid Build Coastguard Worker ffffffff8120be9d __vfs_read 277*387f9dfdSAndroid Build Coastguard Worker ffffffff8120c385 vfs_read 278*387f9dfdSAndroid Build Coastguard Worker ffffffff8120d786 sys_read 279*387f9dfdSAndroid Build Coastguard Worker ffffffff817cc076 entry_SYSCALL_64_fastpath 280*387f9dfdSAndroid Build Coastguard Worker 00007fc5652ad9b0 read 281*387f9dfdSAndroid Build Coastguard Worker - dd (25036) 282*387f9dfdSAndroid Build Coastguard Worker 4 283*387f9dfdSAndroid Build Coastguard Worker 284*387f9dfdSAndroid Build Coastguard Worker 0000000000400542 func_a 285*387f9dfdSAndroid Build Coastguard Worker 0000000000400598 main 286*387f9dfdSAndroid Build Coastguard Worker 00007f12a133e830 __libc_start_main 287*387f9dfdSAndroid Build Coastguard Worker 083e258d4c544155 [unknown] 288*387f9dfdSAndroid Build Coastguard Worker - func_ab (13549) 289*387f9dfdSAndroid Build Coastguard Worker 5 290*387f9dfdSAndroid Build Coastguard Worker 291*387f9dfdSAndroid Build Coastguard Worker[...] 292*387f9dfdSAndroid Build Coastguard Worker 293*387f9dfdSAndroid Build Coastguard Worker ffffffff8105eb66 native_safe_halt 294*387f9dfdSAndroid Build Coastguard Worker ffffffff8103659e default_idle 295*387f9dfdSAndroid Build Coastguard Worker ffffffff81036d1f arch_cpu_idle 296*387f9dfdSAndroid Build Coastguard Worker ffffffff810bba5a default_idle_call 297*387f9dfdSAndroid Build Coastguard Worker ffffffff810bbd07 cpu_startup_entry 298*387f9dfdSAndroid Build Coastguard Worker ffffffff8104df55 start_secondary 299*387f9dfdSAndroid Build Coastguard Worker - swapper/1 (0) 300*387f9dfdSAndroid Build Coastguard Worker 75 301*387f9dfdSAndroid Build Coastguard Worker``` 302*387f9dfdSAndroid Build Coastguard Worker 303*387f9dfdSAndroid Build Coastguard Workerprofile is a CPU profiler, which takes samples of stack traces at timed intervals, and prints a summary of unique stack traces and a count of their occurrence. 304*387f9dfdSAndroid Build Coastguard Worker 305*387f9dfdSAndroid Build Coastguard WorkerUse this tool to understand the code paths that are consuming CPU resources. 306*387f9dfdSAndroid Build Coastguard Worker 307*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/profile_example.txt). 308*387f9dfdSAndroid Build Coastguard Worker 309*387f9dfdSAndroid Build Coastguard Worker### 2. Observability with Generic Tools 310*387f9dfdSAndroid Build Coastguard Worker 311*387f9dfdSAndroid Build Coastguard WorkerIn addition to the above tools for performance tuning, below is a checklist for bcc generic tools, first as a list, and in detail: 312*387f9dfdSAndroid Build Coastguard Worker 313*387f9dfdSAndroid Build Coastguard Worker1. trace 314*387f9dfdSAndroid Build Coastguard Worker1. argdist 315*387f9dfdSAndroid Build Coastguard Worker1. funccount 316*387f9dfdSAndroid Build Coastguard Worker 317*387f9dfdSAndroid Build Coastguard WorkerThese generic tools may be useful to provide visibility to solve your specific problems. 318*387f9dfdSAndroid Build Coastguard Worker 319*387f9dfdSAndroid Build Coastguard Worker#### 2.1. trace 320*387f9dfdSAndroid Build Coastguard Worker 321*387f9dfdSAndroid Build Coastguard Worker##### Example 1 322*387f9dfdSAndroid Build Coastguard Worker 323*387f9dfdSAndroid Build Coastguard WorkerSuppose you want to track file ownership change. There are three syscalls, `chown`, `fchown` and `lchown` which users can use to change file ownership. The corresponding syscall entry is `SyS_[f|l]chown`. The following command can be used to print out syscall parameters and the calling process user id. You can use `id` command to find the uid of a particular user. 324*387f9dfdSAndroid Build Coastguard Worker 325*387f9dfdSAndroid Build Coastguard Worker``` 326*387f9dfdSAndroid Build Coastguard Worker$ trace.py \ 327*387f9dfdSAndroid Build Coastguard Worker 'p::SyS_chown "file = %s, to_uid = %d, to_gid = %d, from_uid = %d", arg1, arg2, arg3, $uid' \ 328*387f9dfdSAndroid Build Coastguard Worker 'p::SyS_fchown "fd = %d, to_uid = %d, to_gid = %d, from_uid = %d", arg1, arg2, arg3, $uid' \ 329*387f9dfdSAndroid Build Coastguard Worker 'p::SyS_lchown "file = %s, to_uid = %d, to_gid = %d, from_uid = %d", arg1, arg2, arg3, $uid' 330*387f9dfdSAndroid Build Coastguard WorkerPID TID COMM FUNC - 331*387f9dfdSAndroid Build Coastguard Worker1269255 1269255 python3.6 SyS_lchown file = /tmp/dotsync-usisgezu/tmp, to_uid = 128203, to_gid = 100, from_uid = 128203 332*387f9dfdSAndroid Build Coastguard Worker1269441 1269441 zstd SyS_chown file = /tmp/dotsync-vic7ygj0/dotsync-package.zst, to_uid = 128203, to_gid = 100, from_uid = 128203 333*387f9dfdSAndroid Build Coastguard Worker1269255 1269255 python3.6 SyS_lchown file = /tmp/dotsync-a40zd7ev/tmp, to_uid = 128203, to_gid = 100, from_uid = 128203 334*387f9dfdSAndroid Build Coastguard Worker1269442 1269442 zstd SyS_chown file = /tmp/dotsync-gzp413o_/dotsync-package.zst, to_uid = 128203, to_gid = 100, from_uid = 128203 335*387f9dfdSAndroid Build Coastguard Worker1269255 1269255 python3.6 SyS_lchown file = /tmp/dotsync-whx4fivm/tmp/.bash_profile, to_uid = 128203, to_gid = 100, from_uid = 128203 336*387f9dfdSAndroid Build Coastguard Worker``` 337*387f9dfdSAndroid Build Coastguard Worker 338*387f9dfdSAndroid Build Coastguard Worker##### Example 2 339*387f9dfdSAndroid Build Coastguard Worker 340*387f9dfdSAndroid Build Coastguard WorkerSuppose you want to count nonvoluntary context switches (`nvcsw`) in your bpf based performance monitoring tools and you do not know what is the proper method. `/proc/<pid>/status` already tells you the number (`nonvoluntary_ctxt_switches`) for a pid and you can use `trace.py` to do a quick experiment to verify your method. With kernel source code, the `nvcsw` is counted at file `linux/kernel/sched/core.c` function `__schedule` and under condition 341*387f9dfdSAndroid Build Coastguard Worker``` 342*387f9dfdSAndroid Build Coastguard Worker!(!preempt && prev->state) // i.e., preempt || !prev->state 343*387f9dfdSAndroid Build Coastguard Worker``` 344*387f9dfdSAndroid Build Coastguard Worker 345*387f9dfdSAndroid Build Coastguard WorkerThe `__schedule` function is marked as `notrace`, and the best place to evaluate the above condition seems in `sched/sched_switch` tracepoint called inside function `__schedule` and defined in `linux/include/trace/events/sched.h`. `trace.py` already has `args` being the pointer to the tracepoint `TP_STRUCT__entry`. The above condition in function `__schedule` can be represented as 346*387f9dfdSAndroid Build Coastguard Worker``` 347*387f9dfdSAndroid Build Coastguard Workerargs->prev_state == TASK_STATE_MAX || args->prev_state == 0 348*387f9dfdSAndroid Build Coastguard Worker``` 349*387f9dfdSAndroid Build Coastguard Worker 350*387f9dfdSAndroid Build Coastguard WorkerThe below command can be used to count the involuntary context switches (per process or per pid) and compare to `/proc/<pid>/status` or `/proc/<pid>/task/<task_id>/status` for correctness, as in typical cases, involuntary context switches are not very common. 351*387f9dfdSAndroid Build Coastguard Worker``` 352*387f9dfdSAndroid Build Coastguard Worker$ trace.py -p 1134138 't:sched:sched_switch (args->prev_state == TASK_STATE_MAX || args->prev_state == 0)' 353*387f9dfdSAndroid Build Coastguard WorkerPID TID COMM FUNC 354*387f9dfdSAndroid Build Coastguard Worker1134138 1134140 contention_test sched_switch 355*387f9dfdSAndroid Build Coastguard Worker1134138 1134142 contention_test sched_switch 356*387f9dfdSAndroid Build Coastguard Worker... 357*387f9dfdSAndroid Build Coastguard Worker$ trace.py -L 1134140 't:sched:sched_switch (args->prev_state == TASK_STATE_MAX || args->prev_state == 0)' 358*387f9dfdSAndroid Build Coastguard WorkerPID TID COMM FUNC 359*387f9dfdSAndroid Build Coastguard Worker1134138 1134140 contention_test sched_switch 360*387f9dfdSAndroid Build Coastguard Worker1134138 1134140 contention_test sched_switch 361*387f9dfdSAndroid Build Coastguard Worker... 362*387f9dfdSAndroid Build Coastguard Worker``` 363*387f9dfdSAndroid Build Coastguard Worker 364*387f9dfdSAndroid Build Coastguard Worker##### Example 3 365*387f9dfdSAndroid Build Coastguard Worker 366*387f9dfdSAndroid Build Coastguard WorkerThis example is related to issue [1231](https://github.com/iovisor/bcc/issues/1231) and [1516](https://github.com/iovisor/bcc/issues/1516) where uprobe does not work at all in certain cases. First, you can do a `strace` as below 367*387f9dfdSAndroid Build Coastguard Worker 368*387f9dfdSAndroid Build Coastguard Worker``` 369*387f9dfdSAndroid Build Coastguard Worker$ strace trace.py 'r:bash:readline "%s", retval' 370*387f9dfdSAndroid Build Coastguard Worker... 371*387f9dfdSAndroid Build Coastguard Workerperf_event_open(0x7ffd968212f0, -1, 0, -1, 0x8 /* PERF_FLAG_??? */) = -1 EIO (Input/output error) 372*387f9dfdSAndroid Build Coastguard Worker... 373*387f9dfdSAndroid Build Coastguard Worker``` 374*387f9dfdSAndroid Build Coastguard Worker 375*387f9dfdSAndroid Build Coastguard WorkerThe `perf_event_open` syscall returns `-EIO`. Digging into kernel uprobe related codes in `/kernel/trace` and `/kernel/events` directories to search `EIO`, the function `uprobe_register` is the most suspicious. Let us find whether this function is called or not and what is the return value if it is called. In one terminal using the following command to print out the return value of uprobe_register, 376*387f9dfdSAndroid Build Coastguard Worker``` 377*387f9dfdSAndroid Build Coastguard Worker$ trace.py 'r::uprobe_register "ret = %d", retval' 378*387f9dfdSAndroid Build Coastguard Worker``` 379*387f9dfdSAndroid Build Coastguard WorkerIn another terminal run the same bash uretprobe tracing example, and you should get 380*387f9dfdSAndroid Build Coastguard Worker``` 381*387f9dfdSAndroid Build Coastguard Worker$ trace.py 'r::uprobe_register "ret = %d", retval' 382*387f9dfdSAndroid Build Coastguard WorkerPID TID COMM FUNC - 383*387f9dfdSAndroid Build Coastguard Worker1041401 1041401 python2.7 uprobe_register ret = -5 384*387f9dfdSAndroid Build Coastguard Worker``` 385*387f9dfdSAndroid Build Coastguard Worker 386*387f9dfdSAndroid Build Coastguard WorkerThe `-5` error code is EIO. This confirms that the following code in function `uprobe_register` is the most suspicious culprit. 387*387f9dfdSAndroid Build Coastguard Worker``` 388*387f9dfdSAndroid Build Coastguard Worker if (!inode->i_mapping->a_ops->readpage && !shmem_mapping(inode->i_mapping)) 389*387f9dfdSAndroid Build Coastguard Worker return -EIO; 390*387f9dfdSAndroid Build Coastguard Worker``` 391*387f9dfdSAndroid Build Coastguard WorkerThe `shmem_mapping` function is defined as 392*387f9dfdSAndroid Build Coastguard Worker``` 393*387f9dfdSAndroid Build Coastguard Workerbool shmem_mapping(struct address_space *mapping) 394*387f9dfdSAndroid Build Coastguard Worker{ 395*387f9dfdSAndroid Build Coastguard Worker return mapping->a_ops == &shmem_aops; 396*387f9dfdSAndroid Build Coastguard Worker} 397*387f9dfdSAndroid Build Coastguard Worker``` 398*387f9dfdSAndroid Build Coastguard Worker 399*387f9dfdSAndroid Build Coastguard WorkerTo confirm the theory, find what is `inode->i_mapping->a_ops` with the following command 400*387f9dfdSAndroid Build Coastguard Worker``` 401*387f9dfdSAndroid Build Coastguard Worker$ trace.py -I 'linux/fs.h' 'p::uprobe_register(struct inode *inode) "a_ops = %llx", inode->i_mapping->a_ops' 402*387f9dfdSAndroid Build Coastguard WorkerPID TID COMM FUNC - 403*387f9dfdSAndroid Build Coastguard Worker814288 814288 python2.7 uprobe_register a_ops = ffffffff81a2adc0 404*387f9dfdSAndroid Build Coastguard Worker^C$ grep ffffffff81a2adc0 /proc/kallsyms 405*387f9dfdSAndroid Build Coastguard Workerffffffff81a2adc0 R empty_aops 406*387f9dfdSAndroid Build Coastguard Worker``` 407*387f9dfdSAndroid Build Coastguard Worker 408*387f9dfdSAndroid Build Coastguard WorkerThe kernel symbol `empty_aops` does not have `readpage` defined and hence the above suspicious condition is true. Further examining the kernel source code shows that `overlayfs` does not provide its own `a_ops` while some other file systems (e.g., ext4) define their own `a_ops` (e.g., `ext4_da_aops`), and `ext4_da_aops` defines `readpage`. Hence, uprobe works fine on ext4 while not on overlayfs. 409*387f9dfdSAndroid Build Coastguard Worker 410*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/trace_example.txt). 411*387f9dfdSAndroid Build Coastguard Worker 412*387f9dfdSAndroid Build Coastguard Worker#### 2.2. argdist 413*387f9dfdSAndroid Build Coastguard Worker 414*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/argdist_example.txt). 415*387f9dfdSAndroid Build Coastguard Worker 416*387f9dfdSAndroid Build Coastguard Worker#### 2.3. funccount 417*387f9dfdSAndroid Build Coastguard Worker 418*387f9dfdSAndroid Build Coastguard WorkerMore [examples](../tools/funccount_example.txt). 419*387f9dfdSAndroid Build Coastguard Worker 420*387f9dfdSAndroid Build Coastguard Worker## Networking 421*387f9dfdSAndroid Build Coastguard Worker 422*387f9dfdSAndroid Build Coastguard WorkerTo do. 423