Demonstrations of oomkill, the Linux eBPF/bcc version.


oomkill is a simple program that traces the Linux out-of-memory (OOM) killer,
and shows basic details on one line per OOM kill:

# ./oomkill
Tracing oom_kill_process()... Ctrl-C to end.
21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 ("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724
21:03:48 Triggered by PID 22517 ("perl"), OOM kill of PID 22517 ("perl"), 3850642 pages, loadavg: 0.99 0.41 0.30 2/282 22932

The first line shows that PID 22516, with process name "perl", was OOM killed
when it reached 3850642 pages (usually 4 Kbytes per page). This OOM kill
happened to be triggered by PID 3297, process name "ntpd", doing some memory
allocation.

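To make the page count easier to read, here is a minimal Python sketch that
parses one oomkill output line and converts pages to bytes. It is not part of
oomkill: the regular expression is inferred from the sample output above, the
names parse_oomkill_line and PAGE_SIZE are made up for this example, and the
4 Kbyte page size is an assumption that should be checked on the target system
(for example with "getconf PAGESIZE").

```python
import re

# Pattern inferred from the sample oomkill output; not taken from the tool.
LINE_RE = re.compile(
    r'^(?P<time>\d{2}:\d{2}:\d{2}) Triggered by PID (?P<trigger_pid>\d+) '
    r'\("(?P<trigger_comm>[^"]*)"\), OOM kill of PID (?P<victim_pid>\d+) '
    r'\("(?P<victim_comm>[^"]*)"\), (?P<pages>\d+) pages, '
    r'loadavg: (?P<loadavg>.*)$'
)

PAGE_SIZE = 4096  # assumption: 4 Kbyte pages


def parse_oomkill_line(line):
    """Return a dict describing one OOM kill, or None if the line doesn't match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    event = m.groupdict()
    for key in ("trigger_pid", "victim_pid", "pages"):
        event[key] = int(event[key])
    # Convert the page count to bytes using the assumed page size.
    event["bytes"] = event["pages"] * PAGE_SIZE
    return event


sample = ('21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 '
          '("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724')
event = parse_oomkill_line(sample)
print(event["victim_pid"], event["victim_comm"], event["bytes"])
# prints: 22516 perl 15772229632
```

With 4 Kbyte pages, 3850642 pages is 15772229632 bytes, or roughly 14.7 GiB.
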
The system log (dmesg) shows pages of details and system context about an OOM
kill. What it currently lacks, however, is context on how the system had been
changing over time. I've seen OOM kills where I wanted to know if the system
was at steady state at the time, or if there had been a recent increase in
workload that triggered the OOM event. oomkill provides some context: at the
end of the line is the load average information from /proc/loadavg. For both
of the oomkills here, we can see that the system was getting busier at the
time (a higher 1 minute "average" of 0.99, compared to the 15 minute "average"
of 0.30).

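The same comparison can be scripted. The sketch below splits the loadavg
portion of an oomkill line into its /proc/loadavg fields (the 1, 5 and 15
minute load averages, runnable/total scheduling entities, and the most
recently created PID) and applies the simple "1 minute higher than 15 minute"
test used in the paragraph above. The function names are hypothetical, and the
test is only a rough heuristic, not something oomkill itself does.

```python
def parse_loadavg(text):
    """Split a /proc/loadavg style string into named fields."""
    fields = text.split()
    load1, load5, load15 = (float(x) for x in fields[:3])
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "runnable": runnable,   # currently runnable scheduling entities
        "total": total,         # total scheduling entities
        "last_pid": int(fields[4]),
    }


def load_is_rising(loadavg):
    """Heuristic: the system is getting busier if load1 exceeds load15."""
    return loadavg["load1"] > loadavg["load15"]


la = parse_loadavg("0.99 0.39 0.30 3/282 22724")
print(la["load1"], la["load15"], load_is_rising(la))
# prints: 0.99 0.3 True
```
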
oomkill can also be the basis of other tools and customizations. For example,
you can edit it to include other task_struct details from the target PID at
the time of the OOM kill.


The following commands can be used to test this program. They launch a
memory-consuming process that exhausts system memory and is then OOM killed:

sysctl -w vm.overcommit_memory=1              # always overcommit
perl -e 'while (1) { $a .= "A" x 1024; }'     # eat all memory

WARNING: This exhausts system memory after disabling some overcommit checks.
Only test in a lab environment.

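Since the test changes vm.overcommit_memory, it can help to record the current
value first so it can be restored afterwards with sysctl -w. The following is
a small sketch for that, assuming a Linux system where the setting is readable
at /proc/sys/vm/overcommit_memory. The helper names are made up for this
example and are not part of oomkill.

```python
# Meaning of each vm.overcommit_memory value on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "never overcommit (strict accounting)",
}

OVERCOMMIT_PATH = "/proc/sys/vm/overcommit_memory"


def describe_overcommit(raw):
    """Turn the raw file contents into (mode, description)."""
    mode = int(raw.strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit(path=OVERCOMMIT_PATH):
    """Read and describe the current overcommit setting."""
    with open(path) as f:
        return describe_overcommit(f.read())


try:
    mode, text = read_overcommit()
    print("vm.overcommit_memory = %d (%s)" % (mode, text))
except OSError:
    # Not on Linux, or /proc is not available.
    print("could not read " + OVERCOMMIT_PATH)
```

Note the value printed before running the test, and set it back with
"sysctl -w vm.overcommit_memory=<value>" when finished.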