# Interrupts (x86_64)

Interrupts are how devices request service from the guest drivers. This page explores the details of
interrupt routing from the perspective of CrosVM.

## Critical acronyms

This subject area uses *a lot* of acronyms:

- IRQ: Interrupt ReQuest
- ISR: Interrupt Service Routine
- EOI: End Of Interrupt
- MSI: message signaled interrupts. In this document, synonymous with MSI-X.
- MSI-X: message signaled interrupts - extended
- LAPIC: local APIC
- APIC: Advanced Programmable Interrupt Controller (successor to the legacy PIC)
- IOAPIC: IO APIC (has physical interrupt lines, which it responds to by triggering an MSI directed
  to the LAPIC).
- PIC: Programmable Interrupt Controller (the "legacy PIC" / Intel 8259 chip).

## Interrupts come in two flavors

Interrupts on `x86_64` in CrosVM come in two primary flavors: legacy and MSI-X. In this document,
MSI is used to refer to the concept of message signaled interrupts, but it always refers to
interrupts sent via MSI-X because that is what CrosVM uses.

### Legacy interrupts (INTx)

These interrupts are traditionally delivered via dedicated signal lines to PICs and/or the IOAPIC.
Older devices, especially those that are used during early boot, often rely on these types of
interrupts. These typically are the first 24 GSIs, and are serviced either by the PIC (during very
early boot), or by the IOAPIC (after it is activated & the PIC is switched off).

#### Background on EOI

The purpose of EOI is rooted in how legacy interrupt lines are shared. If two devices `D1` and `D2`
share a line `L`, `D2` has no guarantee that it will be serviced when `L` is asserted. After
receiving EOI, `D2` has to check whether it was serviced, and if it was not, to re-assert `L`. An
example of how this occurs is if `D2` requests service while `D1` is already being serviced. In that
case, the line has to be reasserted otherwise `D2` won't be serviced.
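
The `D1`/`D2` scenario above can be illustrated with a small toy model. This is only a sketch under
simplifying assumptions: `SharedLine`, `Device`, and `shared_line_scenario` are hypothetical names
invented for this example and do not correspond to any CrosVM type. The model treats the line as
"busy" from the moment an interrupt is delivered until EOI, so a request that arrives in between is
not delivered.

```rust
/// Toy model of one shared, level-triggered interrupt line `L`.
/// Hypothetical type for illustration only; not a CrosVM API.
#[derive(Default)]
struct SharedLine {
    in_service: bool, // an ISR is running; new requests are not delivered
    delivered: u32,   // number of interrupts delivered to the CPU so far
}

impl SharedLine {
    /// A device raises the line. Delivery only happens if no ISR is running.
    fn assert(&mut self) {
        if !self.in_service {
            self.in_service = true;
            self.delivered += 1;
        }
    }

    /// The ISR finished and signalled EOI: the line can deliver again.
    fn eoi(&mut self) {
        self.in_service = false;
    }
}

/// Hypothetical device that remembers whether it still needs service.
struct Device {
    pending: bool,
}

impl Device {
    fn request_service(&mut self, line: &mut SharedLine) {
        self.pending = true;
        line.assert();
    }

    /// Called by the ISR when it actually handles this device.
    fn service(&mut self) {
        self.pending = false;
    }

    /// EOI is seen by every device on the line. A device that is still
    /// pending re-asserts the line; returns true if it did so.
    fn on_eoi(&mut self, line: &mut SharedLine) -> bool {
        if self.pending {
            line.assert();
            true
        } else {
            false
        }
    }
}

/// Replays the scenario from the text and returns
/// (D1 re-asserted, D2 re-asserted, interrupts delivered).
fn shared_line_scenario() -> (bool, bool, u32) {
    let mut line = SharedLine::default();
    let mut d1 = Device { pending: false };
    let mut d2 = Device { pending: false };

    d1.request_service(&mut line); // delivered: the ISR starts for D1
    d2.request_service(&mut line); // arrives while D1 is in service: not delivered
    d1.service();                  // the ISR handles only D1
    line.eoi();                    // the ISR completes and EOI is signalled

    let d1_again = d1.on_eoi(&mut line); // D1 was serviced: the EOI is harmless
    let d2_again = d2.on_eoi(&mut line); // D2 was not: it re-asserts L
    (d1_again, d2_again, line.delivered)
}
```

In this model, calling `shared_line_scenario()` yields `(false, true, 2)`: `D1` ignores the EOI,
`D2` re-asserts the line, and that re-assertion produces the second delivered interrupt.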

Because interrupt lines to the IOAPIC can be shared by multiple devices, EOI is critical for devices
to figure out whether they were serviced in response to sending the IRQ, or whether the IRQ needs to
be resent. The operating principles mean that sending extra EOIs to a legacy device is perfectly
safe, because they could be due to another device on the same line receiving service, and so devices
must be tolerant of such "extra" (from their perspective) EOIs.

These "extra" EOIs come from the fact that EOI is often a broadcast message that goes to all legacy
devices. Broadcast is required because interrupt lines can be routed through the two 8259 PICs via
cascade before they reach the CPU, so broadcasting to both PICs (and attached devices) is the only
way to ensure EOI reaches the device that was serviced.

#### EOI in CrosVM

When the guest's ISR completes and signals EOI, the CrosVM irqchip implementation is responsible for
propagating EOI to the device backends. EOI is delivered to the devices via their
[resample event](https://crosvm.dev/doc/devices/struct.IrqLevelEvent.html). Devices are then
responsible for listening to that resample event, and checking their internal state to see if they
received service. If the device wasn't serviced, it must then reassert the IRQ.

### MSIs

MSIs do not use dedicated signal lines; instead, they are "messages" which are sent on the system
bus. The LAPIC(s) receive these messages, and inject the interrupt into the VCPU (where injection
means: jump to ISR).

#### About EOI

EOI is not meaningful for MSIs because lines are *never* shared. No devices using MSI will listen
for the EOI event, and the irqchip will not signal it.

## The fundamental deception on x86_64: there are no legacy interrupts (usually)

After very early boot, the PIC is switched off and legacy interrupts somewhat cease to be legacy.
Instead of being handled by the PIC, legacy interrupts are handled by the IOAPIC, and all the IOAPIC
does is convert them into MSIs; in other words, from the perspective of CrosVM & the guest VCPUs,
after early boot, every interrupt is an MSI.

## Interrupt handling irqchip specifics

Each `IrqChip` can handle interrupts differently. Often these differences are because the underlying
hypervisors will have different interrupt features such as KVM's irqfds. Generally a hypervisor has
three choices for implementing an irqchip:

- Fully in kernel: all of the irqchip (LAPIC & IOAPIC) are implemented in the kernel portion of the
  hypervisor.
- Split: the performance critical part of the irqchip (LAPIC) is implemented in the kernel, but the
  IOAPIC is implemented by the VMM.
- Userspace: here, the entire irqchip is implemented in the VMM. This is generally slower and not
  commonly used.

Below, we describe the rough flow for interrupts in virtio devices for each of the chip types. We
limit ourselves to virtio devices because these are the performance critical devices in CrosVM.

### Kernel mode IRQ chip (w/ irqfd support)

#### MSIs

1. Device wants service, so it signals an `Event` object.
1. The `Event` object is registered with the hypervisor, so the hypervisor immediately routes the
   IRQ to a LAPIC so a VCPU can be interrupted.
1. The LAPIC interrupts the VCPU, which jumps to the kernel's ISR (interrupt service routine).
1. The ISR runs.

#### Legacy interrupts

These are handled similarly to MSIs, except the kernel mode IOAPIC is what initially picks up the
event, rather than the LAPIC.

### Split IRQ chip (w/ irqfd support)

This is the same as the kernel mode case.

### Split IRQ chip (no irqfd kernel support)

#### MSIs

1. Device wants service, so it signals an `Event` object.
1. The `Event` object is attached to the IrqChip in CrosVM.
   An interrupt handling thread wakes up
   from the `Event` signal.
1. The IrqChip resets the `Event`.
1. The IrqChip asserts the interrupt to the LAPIC in the kernel via an ioctl (or equivalent).
1. The LAPIC interrupts the VCPU, which jumps to the kernel's ISR (interrupt service routine).
1. The ISR runs, and on completion sends EOI (end of interrupt). In CrosVM, this is called the
   [resample event](https://crosvm.dev/doc/devices/struct.IrqLevelEvent.html).
1. EOI is sent.

#### Legacy interrupts

This introduces an additional `Event` object in the interrupt path, since the IRQ pin itself is an
`Event`, and the MSI is also an `Event`. These interrupts are processed twice by the IRQ handler:
once as a legacy IOAPIC event, and a second time as an MSI.

### Userspace IRQ chip

This chip is not widely used in production. Contributions to fill in this section are welcome.