Demonstrations of kvm exit reasons, the Linux eBPF/bcc version.


Because frequent virtual machine exits can cause performance problems, this
tool helps locate the most frequent exit reasons so that solutions can be found
to reduce or even avoid those exits. It displays the detailed exit reasons and
the count of each vm exit for all vms running on one physical machine.


Features of this tool
=====================

- There is a patch, [KVM: x86: add full vm-exit reason debug entries]
  (https://patchwork.kernel.org/project/kvm/patch/[email protected]/),
  that tries to add more vm-exit reason debug entries. However, as the comments
  on it point out, the code allocates lots of memory that may never be
  consumed, misses some arch-specific kvm causes, and cannot do kernel
  aggregation. bcc, as a user space tool, can implement all of these functions
  more easily and flexibly.
- The bcc python logic can provide nice kernel aggregation and custom output,
  such as collapsing all tids for one pid (i.e. one vm's qemu process id) with
  exit reasons sorted in descending order.
  For more information, see the "USAGE message" section below.
- The bpf in-kernel percpu_array and percpu_cache further improve performance.
  For more information, see the "Help to understand" section below.


Limitations
===========

Hardware-assisted virtualization differs between architectures. Currently
only Intel VMX is supported; AMD support is planned.


Example output:
===============

# ./kvmexit.py
Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end.
PID      TID      KVM_EXIT_REASON                  COUNT
^C1273551  1273568  EXIT_REASON_HLT                  12
1273551  1273568  EXIT_REASON_MSR_WRITE            6
1274253  1274261  EXIT_REASON_EXTERNAL_INTERRUPT   1
1274253  1274261  EXIT_REASON_HLT                  12
1274253  1274261  EXIT_REASON_MSR_WRITE            4

# ./kvmexit.py 6
Display kvm exit reasons and statistics for all threads after sleeping 6 secs.
PID      TID      KVM_EXIT_REASON                  COUNT
1273903  1273922  EXIT_REASON_EXTERNAL_INTERRUPT   175
1273903  1273922  EXIT_REASON_CPUID                10
1273903  1273922  EXIT_REASON_HLT                  6043
1273903  1273922  EXIT_REASON_IO_INSTRUCTION       24
1273903  1273922  EXIT_REASON_MSR_WRITE            15025
1273903  1273922  EXIT_REASON_PAUSE_INSTRUCTION    11
1273903  1273922  EXIT_REASON_EOI_INDUCED          12
1273903  1273922  EXIT_REASON_EPT_VIOLATION        6
1273903  1273922  EXIT_REASON_EPT_MISCONFIG        380
1273903  1273922  EXIT_REASON_PREEMPTION_TIMER     194
1273551  1273568  EXIT_REASON_EXTERNAL_INTERRUPT   18
1273551  1273568  EXIT_REASON_HLT                  989
1273551  1273568  EXIT_REASON_IO_INSTRUCTION       10
1273551  1273568  EXIT_REASON_MSR_WRITE            2205
1273551  1273568  EXIT_REASON_PAUSE_INSTRUCTION    1
1273551  1273568  EXIT_REASON_EOI_INDUCED          5
1273551  1273568  EXIT_REASON_EPT_MISCONFIG        61
1273551  1273568  EXIT_REASON_PREEMPTION_TIMER     14

# ./kvmexit.py -p 1273795 5
Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs.
KVM_EXIT_REASON     COUNT
MSR_WRITE           13467
HLT                 5060
PREEMPTION_TIMER    345
EPT_MISCONFIG       264
EXTERNAL_INTERRUPT  169
EPT_VIOLATION       18
PAUSE_INSTRUCTION   6
IO_INSTRUCTION      4
EOI_INDUCED         2

# ./kvmexit.py -p 1273795 5 -a
Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs.
TID      KVM_EXIT_REASON     COUNT
1273819  EXTERNAL_INTERRUPT  64
1273819  HLT                 2802
1273819  IO_INSTRUCTION      4
1273819  MSR_WRITE           7196
1273819  PAUSE_INSTRUCTION   2
1273819  EOI_INDUCED         2
1273819  EPT_VIOLATION       6
1273819  EPT_MISCONFIG       162
1273819  PREEMPTION_TIMER    194
1273820  EXTERNAL_INTERRUPT  78
1273820  HLT                 2054
1273820  MSR_WRITE           5199
1273820  EPT_VIOLATION       2
1273820  EPT_MISCONFIG       77
1273820  PREEMPTION_TIMER    102

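To illustrate what collapsing tids means, here is a minimal Python sketch that
sums the per-TID rows from the -a sample above into one total per exit reason
and sorts the totals in descending order, which is the shape of the -p output.
It is an illustration only, not the tool's actual code: the names rows and
collapse_by_reason are made up for this example.

```python
from collections import Counter

# (tid, exit_reason, count) rows copied from the "-p 1273795 5 -a" sample above.
rows = [
    (1273819, "EXTERNAL_INTERRUPT", 64),
    (1273819, "HLT", 2802),
    (1273819, "IO_INSTRUCTION", 4),
    (1273819, "MSR_WRITE", 7196),
    (1273819, "PAUSE_INSTRUCTION", 2),
    (1273819, "EOI_INDUCED", 2),
    (1273819, "EPT_VIOLATION", 6),
    (1273819, "EPT_MISCONFIG", 162),
    (1273819, "PREEMPTION_TIMER", 194),
    (1273820, "EXTERNAL_INTERRUPT", 78),
    (1273820, "HLT", 2054),
    (1273820, "MSR_WRITE", 5199),
    (1273820, "EPT_VIOLATION", 2),
    (1273820, "EPT_MISCONFIG", 77),
    (1273820, "PREEMPTION_TIMER", 102),
]


def collapse_by_reason(rows):
    """Sum counts over all tids and return (reason, total) pairs, largest first."""
    totals = Counter()
    for _tid, reason, count in rows:
        totals[reason] += count
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Print the collapsed table in the same two-column layout as the -p output.
print("%-20s%s" % ("KVM_EXIT_REASON", "COUNT"))
for reason, total in collapse_by_reason(rows):
    print("%-20s%d" % (reason, total))
```

With these rows, MSR_WRITE comes first with 12395 exits (7196 + 5199). The
totals differ from the -p sample earlier because that sample came from a
separate run.
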
# ./kvmexit.py -p 1273795 -v 0
Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end.
KVM_EXIT_REASON     COUNT
^CMSR_WRITE           2076
HLT                 795
PREEMPTION_TIMER    86
EXTERNAL_INTERRUPT  20
EPT_MISCONFIG       10
PAUSE_INSTRUCTION   2
IO_INSTRUCTION      2
EPT_VIOLATION       1
EOI_INDUCED         1

# ./kvmexit.py -p 1273795 -v 0 4
Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs.
KVM_EXIT_REASON     COUNT
MSR_WRITE           4726
HLT                 1827
PREEMPTION_TIMER    78
EPT_MISCONFIG       67
EXTERNAL_INTERRUPT  28
IO_INSTRUCTION      4
EOI_INDUCED         2
PAUSE_INSTRUCTION   2

# ./kvmexit.py -p 1273795 -v 4 4
Traceback (most recent call last):
  File "tools/kvmexit.py", line 306, in <module>
    raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid))
  Exception: There's no vCPU 4 for PID 1273795.

# ./kvmexit.py -t 1273819 10
Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs.
KVM_EXIT_REASON     COUNT
MSR_WRITE           13318
HLT                 5274
EPT_MISCONFIG       263
PREEMPTION_TIMER    171
EXTERNAL_INTERRUPT  109
IO_INSTRUCTION      8
PAUSE_INSTRUCTION   5
EOI_INDUCED         4
EPT_VIOLATION       2

# ./kvmexit.py -T '1273820,1273819'
Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end.
TIDS     KVM_EXIT_REASON     COUNT
^C1273819  EXTERNAL_INTERRUPT  300
1273819  HLT                 13718
1273819  IO_INSTRUCTION      26
1273819  MSR_WRITE           37457
1273819  PAUSE_INSTRUCTION   13
1273819  EOI_INDUCED         13
1273819  EPT_VIOLATION       53
1273819  EPT_MISCONFIG       654
1273819  PREEMPTION_TIMER    958
1273820  EXTERNAL_INTERRUPT  212
1273820  HLT                 9002
1273820  MSR_WRITE           25495
1273820  PAUSE_INSTRUCTION   2
1273820  EPT_VIOLATION       64
1273820  EPT_MISCONFIG       396
1273820  PREEMPTION_TIMER    268


Help to understand
==================

We use a PERCPU_ARRAY (pcpuArrayA) and a percpu_hash (hashA) together to
store each kvm exit reason and its count. The reason is that when a vcpu exits
and re-enters, it tends to keep running on the same physical cpu (pcpu below)
as in the last cycle, which we call a 'cache hit'. We therefore use the
PERCPU_ARRAY to record the 'cache hit' case to speed things up, and fall back
to the percpu_hash for all other cases.

We originally used an ordinary hash for this, with a u64(exit_reason) key and
a struct exit_info {tgid_pid, exit_reason} value. However, because of the big
lock in bpf_hash, each update was expensive.
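
To make the cache-hit idea concrete, the following is a rough user-space model
in plain Python of the counters kept for a single pcpu. It is only a sketch
under simplifying assumptions, not the tool's BPF code: the class PcpuCounters
and its fields cache_owner, cache_counts and saved are hypothetical names, an
ordinary dict stands in for the percpu_hash, and a single slot stands in for
the PERCPU_ARRAY entry.

```python
class PcpuCounters:
    """Toy model of one pcpu: a one-entry cache backed by a table of saved counts."""

    def __init__(self):
        self.cache_owner = None   # pid_tgid whose counts currently sit in the cache
        self.cache_counts = {}    # exit_reason -> count for cache_owner
        self.saved = {}           # pid_tgid -> {exit_reason: count}, the slow path

    def on_exit(self, pid_tgid, reason):
        """Record one exit and report whether it was a cache 'hit' or 'miss'."""
        if self.cache_owner == pid_tgid:
            # Cache hit: bump the cached counter and return; the table is untouched.
            self.cache_counts[reason] = self.cache_counts.get(reason, 0) + 1
            return "hit"

        # Cache miss: load this vcpu's earlier counts, or start from zero.
        counts = dict(self.saved.get(pid_tgid, {}))
        counts[reason] = counts.get(reason, 0) + 1

        # If another vcpu was cached here, write its counts back to the table.
        if self.cache_owner is not None:
            self.saved[self.cache_owner] = self.cache_counts

        # Install the current vcpu as the new owner of the cache slot.
        self.cache_owner = pid_tgid
        self.cache_counts = counts
        return "miss"

    def totals(self, pid_tgid):
        """Return the up-to-date counts for one vcpu, wherever they live."""
        if self.cache_owner == pid_tgid:
            return dict(self.cache_counts)
        return dict(self.saved.get(pid_tgid, {}))


pcpu = PcpuCounters()
events = [(1, "HLT"), (1, "HLT"), (2, "MSR_WRITE"), (1, "HLT")]
outcomes = [pcpu.on_exit(pid_tgid, reason) for pid_tgid, reason in events]
print(outcomes)
print(pcpu.totals(1), pcpu.totals(2))
```

In this model the table is written only when the cached vcpu is displaced, so
a vcpu that keeps returning to the same pcpu pays only for a counter increment.
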

Now imagine that pid_tgidA (vcpu A) exits and is about to run on pcpuArrayA.
The BPF code flow is as follows:

                pid_tgidA keeps running on the same pcpu?
                          //                  \\
                         //                    \\
                        //    Y            N    \\
                       //                        \\
              a. cache_hit                      b. cache_miss
  (cacheA's pid_tgid matches pid_tgidA)               ||
                    |                                 ||
                    |                                 ||
  "increase percpu exit_ct and return"                ||
                [*Note*]                              ||
                                 has pid_tgidA exited on pcpuArrayA before?
                                            //                 \\
                                           //                   \\
                                          //                     \\
                                         //    Y            N     \\
                                        //                         \\
                        b.a load_last_hashA               b.b initialize_hashA_with_zero
                                          \                       /
                                           \                     /
                                            \                   /
                                          "increase percpu exit_ct"
                                                      ||
                                                      ||
                              has another pid_tgid been running on pcpuArrayA?
                                            //                 \\
                                           //   Y          N    \\
                                          //                     \\
                    b.*.a save_theLastHit_hashB               do_nothing
                                           \\                   //
                                            \\                 //
                                             \\               //
                                           b.* save_to_pcpuArrayA


[*Note*] In case "a." we do not update the table, because the vcpu may hit the
same pcpu again on its next exit. The table is updated only once this pcpu is
no longer hit by the same pid_tgid (vcpu), which happens in "b.*.a" and "b.*".


USAGE message:
==============

# ./kvmexit.py -h
usage: kvmexit.py [-h] [-p PID [-v VCPU | -a] ] [-t TID | -T 'TID1,TID2'] [duration]

Display kvm_exit_reason and its statistics at a timed interval

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     display process with this PID only, collapse all tids with exit reasons sorted in descending order
  -v VCPU, --v VCPU     display this VCPU only for this PID
  -a, --alltids         display all TIDS for this PID
  -t TID, --tid TID     display thread with this TID only with exit reasons sorted in descending order
  -T 'TID1,TID2', --tids 'TID1,TID2'
                        display threads for a union like {395490, 395491}
  duration              duration of display, after sleeping several seconds

examples:
  ./kvmexit                       # Display kvm_exit_reason and its statistics in real-time until Ctrl-C
  ./kvmexit 5                     # Display in real-time after sleeping 5s
  ./kvmexit -p 3195281            # Collapse all tids for pid 3195281 with exit reasons sorted in descending order
  ./kvmexit -p 3195281 20         # Collapse all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s
  ./kvmexit -p 3195281 -v 0       # Display only vcpu0 for pid 3195281, descending sort by default
  ./kvmexit -p 3195281 -a         # Display all tids for pid 3195281
  ./kvmexit -t 395490             # Display only for tid 395490 with exit reasons sorted in descending order
  ./kvmexit -t 395490 20          # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s
  ./kvmexit -T '395490,395491'    # Display for a union like {395490, 395491}
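
For readers who want to see how an option layout like the one in the usage
message can be declared, here is a small argparse sketch. It is an assumption
about one possible declaration, not the tool's source: build_parser and parse
are hypothetical helper names, and the rule that -v and -a require -p is
inferred from the usage line.

```python
import argparse


def build_parser():
    """Declare options that mirror the usage message shown above."""
    parser = argparse.ArgumentParser(
        prog="kvmexit.py",
        description="Display kvm_exit_reason and its statistics at a timed interval")
    parser.add_argument("duration", nargs="?", type=int, default=None,
                        help="duration of display, after sleeping several seconds")
    parser.add_argument("-p", "--pid", type=int,
                        help="display process with this PID only")

    # -v and -a refine -p and exclude each other: [-v VCPU | -a]
    pid_opts = parser.add_mutually_exclusive_group()
    pid_opts.add_argument("-v", "--v", dest="vcpu", type=int,
                          help="display this VCPU only for this PID")
    pid_opts.add_argument("-a", "--alltids", action="store_true",
                          help="display all TIDS for this PID")

    # A single tid or a comma-separated list of tids: [-t TID | -T 'TID1,TID2']
    tid_opts = parser.add_mutually_exclusive_group()
    tid_opts.add_argument("-t", "--tid", type=int,
                          help="display thread with this TID only")
    tid_opts.add_argument("-T", "--tids",
                          type=lambda text: [int(tid) for tid in text.split(",")],
                          help="display threads for a union of TIDs")
    return parser


def parse(argv):
    """Parse argv and enforce that -v and -a are only used together with -p."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if (args.vcpu is not None or args.alltids) and args.pid is None:
        parser.error("-v and -a require -p PID")
    return args


args = parse(["-p", "3195281", "-v", "0", "20"])
print(args.pid, args.vcpu, args.duration)
```
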