# Architecture: Snapshotting

Snapshotting is a **highly experimental**, `x86_64`-only feature currently under development. It is
100% **not supported** and works with only a very limited set of devices. This page roughly
summarizes how the system works, and how device authors should think about it when writing new
devices.

## The snapshot & restore sequence

The data required for a snapshot is stored in several places, including guest memory and the
devices running on the host. An accurate snapshot must capture all of this state at a single point
in time. Since there is no way to fetch this state atomically, we have to freeze the guest (VCPUs)
and the device backends. Similarly, on restore we must freeze in the same way to prevent partially
restored state from being modified.

## Snapshotting a running VM

In code, this is implemented by
[vm_control::do_snapshot](https://crosvm.dev/doc/vm_control/fn.do_snapshot.html). We always freeze
the VCPUs first
([vm_control::VcpuSuspendGuard](https://crosvm.dev/doc/vm_control/struct.VcpuSuspendGuard.html)).
This is done so that we can flush all pending interrupts to the irqchip (LAPIC) without triggering
further activity from the driver (which could in turn trigger more device activity). With the VCPUs
frozen, we freeze devices
([vm_control::DeviceSleepGuard](https://crosvm.dev/doc/vm_control/struct.DeviceSleepGuard.html)).
From here, it's just a matter of serializing VCPU state, guest memory, and device state.

### A word about interrupts

Interrupts come in two primary flavors from the snapshotting perspective: legacy interrupts (e.g.
IOAPIC interrupt lines), and MSIs.

#### Legacy interrupts

These are a little tricky because they are allocated as part of device creation, and device creation
happens **before** we snapshot or restore. To avoid actually having to snapshot or restore the
`Event` object wiring for these interrupts, we rely on the fact that as long as the VM is created
with the right shape (e.g. the same set of devices), the interrupt `Event`s will be wired between
the device & the irqchip correctly. As part of restoring, we set the routing table, which ensures
that those events map to the right GSIs in the hypervisor.
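
The freeze ordering that makes this interrupt handling safe (VCPUs first, then devices, then
serialization) can be sketched with RAII guards. This is a minimal, self-contained model, not
crosvm's code: `VcpuGuard`, `DeviceGuard`, and `snapshot_sequence` are hypothetical names that only
imitate the roles of `VcpuSuspendGuard` and `DeviceSleepGuard`, and the resume order shown is a
property of this sketch (Rust drops locals in reverse declaration order), not a documented crosvm
guarantee.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log so the order of operations can be inspected afterwards.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a VCPU suspend guard: suspends on creation, resumes on drop.
struct VcpuGuard(Log);

impl VcpuGuard {
    fn new(log: &Log) -> Self {
        log.borrow_mut().push("suspend vcpus");
        VcpuGuard(Rc::clone(log))
    }
}

impl Drop for VcpuGuard {
    fn drop(&mut self) {
        self.0.borrow_mut().push("resume vcpus");
    }
}

/// Hypothetical stand-in for a device sleep guard: sleeps on creation, wakes on drop.
struct DeviceGuard(Log);

impl DeviceGuard {
    fn new(log: &Log) -> Self {
        log.borrow_mut().push("sleep devices");
        DeviceGuard(Rc::clone(log))
    }
}

impl Drop for DeviceGuard {
    fn drop(&mut self) {
        self.0.borrow_mut().push("wake devices");
    }
}

/// Runs the modeled sequence and returns the recorded steps in order.
fn snapshot_sequence() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // VCPUs are frozen first so the guest driver cannot react to flushed interrupts.
        let _vcpus = VcpuGuard::new(&log);
        // With the guest quiet, the interrupt sources (devices) are frozen.
        let _devices = DeviceGuard::new(&log);
        log.borrow_mut().push("flush irqs to irqchip");
        log.borrow_mut().push("serialize state");
        // Guards drop here in reverse order: devices wake, then VCPUs resume.
    }
    let steps = log.borrow().clone();
    steps
}
```

Holding both guards for the whole serialization step means an early return or error still resumes
the VM, which is the main reason to model the freeze as guards rather than paired function calls.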

#### MSIs

These are much simpler, because of how MSIs are implemented in crosvm. In `MsixConfig`, we save the
MSI routing information for every IRQ. At restore time, we just register these MSIs with the
hypervisor using the exact same mechanism that would be invoked on device activation (albeit
bypassing GSI allocation since we know from the saved state exactly which GSI must be used).

#### Flushing IRQs to the irqchip

IRQs sometimes pass through multiple host `Event`s before reaching the hypervisor (or VCPU loop) for
injection. Rather than trying to snapshot the `Event` state, we freeze all interrupt sources
(devices) and flush all pending interrupts into the irqchip. This way, snapshotting the irqchip
state is sufficient to capture all pending interrupts.

### Two-step snapshotting

Two-step snapshotting is performed in crosvm to ensure data retention.

Problem definition:

1. VMM Manager requests crosvm to suspend.
1. Crosvm suspends, but host-side processes are still running.
1. VMM Manager requests host-side processes to suspend.
1. VMM Manager requests snapshot from crosvm.
1. VMM Manager snapshots host-side processes.
1. VMM Manager requests host-side processes and crosvm to resume (or stop).

The problem is that data may be lost in steps 4 & 5, because of the time between steps 2 & 3. After
step 2, crosvm is suspended and host-side processes are still running, which means host-side
processes may send data to crosvm that the device in crosvm has not yet read.

When the VM resumes, there are no issues, as the data gets read and processing continues normally.
However, when the VM restores, that data is lost as it was not saved.

The solution is two-step snapshotting. We modify step 4 to read any data coming from the host just
before snapshotting, save that data in crosvm, and then process that data when the VM resumes.

## Restoring a VM in lieu of booting

Restoring on to a running VM is not supported, and may never be. Our preferred approach is to
instead create a new VM from a snapshot. This is why `vm_control::do_restore` can be invoked as part
of the VM creation process.
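
To tie restore back to two-step snapshotting, the toy model below shows why the extra read matters
when a new VM is created from a snapshot. It is a sketch under stated assumptions, not crosvm's
implementation: `HostChannel`, `Device`, `DeviceSnapshot`, and `two_step_demo` are hypothetical
names, and an in-memory queue stands in for whatever host-side channel a real device reads from.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a host-side channel (e.g. a socket) feeding a device.
type HostChannel = VecDeque<String>;

/// Saved device state: only the host data that had not been processed yet.
#[derive(Clone, Debug, Default, PartialEq)]
struct DeviceSnapshot {
    pending: Vec<String>,
}

#[derive(Default)]
struct Device {
    asleep: bool,
    pending: Vec<String>,
    processed: Vec<String>,
}

impl Device {
    /// Normal operation: an awake device reads and processes host data immediately.
    fn poll(&mut self, host: &mut HostChannel) {
        if !self.asleep {
            self.processed.extend(host.drain(..));
        }
    }

    fn sleep(&mut self) {
        self.asleep = true;
    }

    /// Modified step 4: read whatever the host sent while we slept, then save it.
    fn snapshot(&mut self, host: &mut HostChannel) -> DeviceSnapshot {
        self.pending.extend(host.drain(..));
        DeviceSnapshot { pending: self.pending.clone() }
    }

    /// Builds a sleeping device for a new VM from saved state.
    fn restore(snap: &DeviceSnapshot) -> Device {
        Device { asleep: true, pending: snap.pending.clone(), processed: Vec::new() }
    }

    /// On wake, the buffered data is processed as if it had just arrived.
    fn wake(&mut self) {
        self.asleep = false;
        self.processed.append(&mut self.pending);
    }
}

/// Walks the problem scenario and returns what the restored device processed.
fn two_step_demo() -> Vec<String> {
    let mut host = HostChannel::new();
    let mut dev = Device::default();

    dev.sleep(); // step 2: crosvm suspends
    host.push_back("late write".to_string()); // host process is still running
    dev.poll(&mut host); // a sleeping device does not read the data

    let snap = dev.snapshot(&mut host); // step 4, with the extra read
    let mut restored = Device::restore(&snap); // new VM created from the snapshot
    restored.wake();
    restored.processed
}
```

Without the drain in `snapshot`, the write would stay in the host-side queue, which does not
survive into the new VM, and the restored device would wake up with nothing to process.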

## Implications for device authors

New devices SHOULD be compatible with the `devices::Suspendable` trait, but MAY defer actual
implementation to the future. This trait's implementation defines how the device will sleep/wake,
and how its state will be saved & restored as part of snapshotting.

New virtio devices SHOULD implement the virtio device snapshot methods on
[VirtioDevice](https://crosvm.dev/doc/devices/virtio/virtio_device/trait.VirtioDevice.html):
`virtio_sleep`, `virtio_wake`, `virtio_snapshot`, and `virtio_restore`.
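
As a rough illustration of what such an implementation involves, here is a minimal sketch of a
device that supports the four operations. The trait below is a local stand-in written for this
example: `SuspendableSketch`, `Counter`, and `round_trip` are hypothetical, and the method
signatures and byte-based state format are assumptions, not those of `devices::Suspendable` or
`VirtioDevice`. Consult the linked API docs for the real signatures.

```rust
/// Local stand-in for a suspendable device interface.
/// The signatures are assumptions and do NOT match `devices::Suspendable`.
trait SuspendableSketch {
    fn sleep(&mut self) -> Result<(), String>;
    fn wake(&mut self) -> Result<(), String>;
    fn snapshot(&mut self) -> Result<Vec<u8>, String>;
    fn restore(&mut self, data: &[u8]) -> Result<(), String>;
}

/// Hypothetical device whose entire state is one 32-bit register.
struct Counter {
    value: u32,
    asleep: bool,
}

impl SuspendableSketch for Counter {
    fn sleep(&mut self) -> Result<(), String> {
        // A real device would stop its worker threads here.
        self.asleep = true;
        Ok(())
    }

    fn wake(&mut self) -> Result<(), String> {
        // A real device would restart its workers here.
        self.asleep = false;
        Ok(())
    }

    fn snapshot(&mut self) -> Result<Vec<u8>, String> {
        // In this sketch, state may only be read while the device is frozen.
        if !self.asleep {
            return Err("snapshot requires a sleeping device".to_string());
        }
        Ok(self.value.to_le_bytes().to_vec())
    }

    fn restore(&mut self, data: &[u8]) -> Result<(), String> {
        if !self.asleep {
            return Err("restore requires a sleeping device".to_string());
        }
        if data.len() != 4 {
            return Err(format!("expected 4 bytes, got {}", data.len()));
        }
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(data);
        self.value = u32::from_le_bytes(bytes);
        Ok(())
    }
}

/// Snapshots one device and restores the state into a freshly created one.
fn round_trip(value: u32) -> Result<u32, String> {
    let mut src = Counter { value, asleep: false };
    src.sleep()?;
    let saved = src.snapshot()?;
    src.wake()?;

    // Restore targets a new device, mirroring "create a new VM from a snapshot".
    let mut dst = Counter { value: 0, asleep: false };
    dst.sleep()?;
    dst.restore(&saved)?;
    dst.wake()?;
    Ok(dst.value)
}
```

Rejecting `snapshot` and `restore` on an awake device is a choice made for this sketch; it mirrors
the requirement that state is only saved or loaded while the device is frozen.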