Demonstrations of oomkill, the Linux eBPF/bcc version.


oomkill is a simple program that traces the Linux out-of-memory (OOM) killer,
and shows basic details on one line per OOM kill:

# ./oomkill
Tracing oom_kill_process()... Ctrl-C to end.
21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 ("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724
21:03:48 Triggered by PID 22517 ("perl"), OOM kill of PID 22517 ("perl"), 3850642 pages, loadavg: 0.99 0.41 0.30 2/282 22932

The first line shows that PID 22516, with process name "perl", was OOM killed
when it reached 3850642 pages (usually 4 Kbytes per page). This OOM kill
happened to be triggered by PID 3297, process name "ntpd", doing some memory
allocation.

The system log (dmesg) shows pages of details and system context about an OOM
kill. What it currently lacks, however, is context on how the system had been
changing over time. I've seen OOM kills where I wanted to know if the system
was at steady state at the time, or if there had been a recent increase in
workload that triggered the OOM event.
oomkill provides some context: at the
end of the line is the load average information from /proc/loadavg. For both
of the oomkills here, we can see that the system was getting busier at the
time (a higher 1 minute "average" of 0.99, compared to the 15 minute "average"
of 0.30).

oomkill can also be the basis of other tools and customizations. For example,
you can edit it to include other task_struct details from the target PID at
the time of the OOM kill.


The following commands can be used to test this program, and invoke a memory
consuming process that exhausts system memory and is OOM killed:

sysctl -w vm.overcommit_memory=1              # always overcommit
perl -e 'while (1) { $a .= "A" x 1024; }'     # eat all memory

WARNING: This exhausts system memory after disabling some overcommit checks.
Only test in a lab environment.
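

The oomkill output lines shown earlier can also be post-processed. The
following is a minimal Python sketch, not part of oomkill itself, that parses
one output line, converts the reported page count to bytes, and compares the
1 minute and 15 minute load averages. The regular expression is inferred only
from the sample output in this file, the 4 Kbyte page size is an assumption
(check getconf PAGESIZE on your system), and parse_oomkill_line() is a
hypothetical helper name.

```python
import re

# Pattern inferred from the sample oomkill output in this file; it is an
# assumption, and process names containing a double quote will not match.
LINE_RE = re.compile(
    r'^(?P<time>\d{2}:\d{2}:\d{2}) Triggered by PID (?P<trigger_pid>\d+) '
    r'\("(?P<trigger_comm>[^"]*)"\), OOM kill of PID (?P<victim_pid>\d+) '
    r'\("(?P<victim_comm>[^"]*)"\), (?P<pages>\d+) pages, '
    r'loadavg: (?P<load1>[\d.]+) (?P<load5>[\d.]+) (?P<load15>[\d.]+) '
)

PAGE_SIZE = 4096  # assumption: 4 Kbyte pages


def parse_oomkill_line(line, page_size=PAGE_SIZE):
    """Return a dict of fields from one oomkill line, or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    pages = int(m.group("pages"))
    load1 = float(m.group("load1"))
    load15 = float(m.group("load15"))
    return {
        "time": m.group("time"),
        "trigger_pid": int(m.group("trigger_pid")),
        "trigger_comm": m.group("trigger_comm"),
        "victim_pid": int(m.group("victim_pid")),
        "victim_comm": m.group("victim_comm"),
        "pages": pages,
        "bytes": pages * page_size,
        "load1": load1,
        "load15": load15,
        # A 1 minute average above the 15 minute average suggests the
        # system was getting busier around the time of the OOM kill.
        "getting_busier": load1 > load15,
    }


if __name__ == "__main__":
    sample = ('21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 '
              '("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724')
    event = parse_oomkill_line(sample)
    print("%s killed %s (PID %d), %.2f Gbytes, getting busier: %s" % (
        event["time"], event["victim_comm"], event["victim_pid"],
        event["bytes"] / 2**30, event["getting_busier"]))
```

Lines that do not match, such as the "Tracing oom_kill_process()..." header,
return None, so the helper can be applied to every line of captured output
and the None results skipped.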