# virtio-queue

The `virtio-queue` crate provides the device-side implementation of a virtio
queue, a virtio descriptor and a chain of such descriptors.
Two formats of virtio queues are defined in the specification: split virtqueues
and packed virtqueues. The `virtio-queue` crate supports only the
[split virtqueues](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-240006)
format.
The virtio-queue API is meant to be consumed by virtio device
implementations (such as the block device or vsock device).
The main abstraction is the `Queue`. The crate also defines a state object
for the queue, i.e. `QueueState`.

## Usage

Let’s take a concrete example of how a device would work with a queue, using
the MMIO bus.

First, it is important to mention that the mandatory parts of the virtio
interface are the following:

- the device status field → provides an indication of
  [the completed steps](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-100001)
  of the device initialization routine,
- the feature bits →
  [the features](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-130002)
  the driver/device understand(s),
- [notifications](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-170003),
- one or more
  [virtqueues](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-230005)
  → the mechanism for data transport between the driver and device.

Each virtqueue consists of three parts:

- Descriptor Table,
- Available Ring,
- Used Ring.

Before booting the virtual machine (VM), the VMM does the following setup:

1. initialize an array of `Queue`s using the `Queue` constructor.
2. register the device on the MMIO bus, so that the driver can later send
   read/write requests from/to the MMIO space; some of those requests also set
   up the queues’ state.
3. perform other pre-boot configuration, such as registering a file descriptor
   (fd) for the interrupt assigned to the device; the device later uses this
   fd to inform the driver that it has information to communicate.

After the VM boots, the driver starts sending read/write requests to
configure things like:

* the supported features;
* queue parameters. The following setters are used for the queue setup:
  * `set_size` → sets the size of the queue.
  * `set_ready` → puts the queue in the `ready for processing` state.
  * `set_desc_table_address`, `set_avail_ring_address`,
    `set_used_ring_address` → configure the guest address of the constituent
    parts of the queue.
  * `set_event_idx` → called as part of the feature negotiation in the
    `virtio-device` crate; it enables or disables the VIRTIO_F_EVENT_IDX
    feature.
* the device activation. As part of this activation, the device can also create
  a queue handler, which can later be used to process the queue.

Once the queues are ready, the device can be used.

The steady state operation of a virtio device follows a model where the driver
produces descriptor chains which are consumed by the device, and both parties
need to be notified when new elements have been placed on the associated ring
to avoid busy polling. The precise notification mechanism is left up to the VMM
that incorporates the devices and queues (it usually involves things like MMIO
VM exits and interrupt injection into the guest). The queue implementation is
agnostic to the notification mechanism in use, and it exposes methods and
functionality (such as iterators) that are called from the outside in response
to a notification event.
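
The separation between queue processing and notification delivery described above can be sketched with a small, self-contained model. This is not the `virtio-queue` API: `ToyDevice`, `pending`, `completed` and `signal_driver` are hypothetical names, and a `VecDeque` stands in for the available ring. The point is only that the processing code is invoked from the outside and does not care how the notification arrived.

```rust
use std::collections::VecDeque;

/// Hypothetical device model: `pending` stands in for the descriptor chain
/// heads the driver has made available; `signal_driver` stands in for
/// whatever the VMM uses to notify the guest (e.g. interrupt injection).
struct ToyDevice<F: FnMut()> {
    pending: VecDeque<u16>,
    completed: Vec<u16>,
    signal_driver: F,
}

impl<F: FnMut()> ToyDevice<F> {
    /// Called by the VMM whenever a driver notification arrives, regardless
    /// of how it was delivered (for example, an MMIO exit).
    fn on_driver_notification(&mut self) {
        let before = self.completed.len();
        // Consume everything that is currently available instead of polling.
        while let Some(head) = self.pending.pop_front() {
            self.completed.push(head);
        }
        // Notify the driver only if something was actually consumed.
        if self.completed.len() > before {
            (self.signal_driver)();
        }
    }
}
```

Because the notification path is just a closure here, the same processing code would work unchanged whether the VMM injects an interrupt, writes to an fd, or does something else entirely.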

### Data transmission using virtqueues

The basic principle of how the queues are used by the device/driver is the
following, as shown in the diagram below as well:

1. when the guest driver has a new request (buffer), it allocates free
   descriptor(s) for the buffer in the descriptor table, chaining as necessary.
2. the driver adds a new entry with the head index of the descriptor chain
   describing the request, in the available ring entries.
3. the driver increments the `idx` by the number of new entries; the diagram
   shows the simple use case of only one new entry.
4. the driver sends an available buffer notification to the device if such
   notifications are not suppressed.
5. the device will at some point consume that request, by first reading the
   `idx` field from the available ring. This can be done directly with
   `Queue::avail_idx`, but we do not recommend that consumers of the crate
   call it, because the iterator over all available descriptor chain heads
   already calls it behind the scenes.
6. the device gets the head index of each descriptor chain published in the
   available ring up to the `idx` value it has just read.
7. the device reads the corresponding descriptor(s) from the descriptor table.
8. the device adds a new entry in the used ring by using `Queue::add_used`; the
   entry is defined in the spec as `virtq_used_elem`, and in `virtio-queue` as
   `VirtqUsedElem`. This structure holds both the index of the descriptor
   chain and the number of bytes that were written to the memory as part of
   serving the request.
9. the device increments the `idx` from the used ring; this is done as part of
   the `Queue::add_used` call mentioned above.
10. the device sends a used buffer notification to the driver if such
    notifications are not suppressed.
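
The steps above can be followed in a toy model of a split virtqueue. This is a minimal sketch and not the crate's implementation: the three queue parts live in plain vectors instead of guest memory, and `ToyQueue`, `driver_add_chain`, `device_pop` and `device_add_used` are hypothetical names. Only the descriptor fields, the two flag values and the wrapping 16-bit `idx` counters follow the specification; notifications (steps 4 and 10) are left out.

```rust
const VIRTQ_DESC_F_NEXT: u16 = 1; // another descriptor follows in the chain
const VIRTQ_DESC_F_WRITE: u16 = 2; // buffer is device-writable

/// One descriptor table entry (`virtq_desc` in the spec).
#[derive(Clone, Copy, Default, Debug, PartialEq)]
struct Desc {
    addr: u64,
    len: u32,
    flags: u16,
    next: u16,
}

/// Toy split virtqueue kept in ordinary vectors (assumption: no guest memory).
struct ToyQueue {
    desc: Vec<Desc>,
    avail_ring: Vec<u16>,
    avail_idx: u16,
    used_ring: Vec<(u32, u32)>, // (chain head, bytes written)
    used_idx: u16,
    next_avail: u16, // device-private: next available entry to process
}

impl ToyQueue {
    fn new(size: u16) -> Self {
        let n = size as usize;
        ToyQueue {
            desc: vec![Desc::default(); n],
            avail_ring: vec![0; n],
            avail_idx: 0,
            used_ring: vec![(0, 0); n],
            used_idx: 0,
            next_avail: 0,
        }
    }

    fn size(&self) -> u16 {
        self.desc.len() as u16
    }

    /// Driver, steps 1-3: write a chain into consecutive descriptor slots
    /// starting at `head`, publish `head` in the available ring, bump `idx`.
    /// Each buffer is `(addr, len, device_writable)`.
    fn driver_add_chain(&mut self, head: u16, bufs: &[(u64, u32, bool)]) {
        assert!(!bufs.is_empty());
        for (i, &(addr, len, writable)) in bufs.iter().enumerate() {
            let slot = (head as usize + i) % self.desc.len();
            let last = i + 1 == bufs.len();
            let mut flags = 0;
            if writable {
                flags |= VIRTQ_DESC_F_WRITE;
            }
            if !last {
                flags |= VIRTQ_DESC_F_NEXT;
            }
            let next = if last { 0 } else { ((slot + 1) % self.desc.len()) as u16 };
            self.desc[slot] = Desc { addr, len, flags, next };
        }
        let ring_slot = (self.avail_idx % self.size()) as usize;
        self.avail_ring[ring_slot] = head;
        self.avail_idx = self.avail_idx.wrapping_add(1);
    }

    /// Device, steps 5-7: compare the private position with the available
    /// `idx`, fetch the next chain head and walk the chain.
    fn device_pop(&mut self) -> Option<(u16, Vec<Desc>)> {
        if self.next_avail == self.avail_idx {
            return None; // nothing new in the available ring
        }
        let head = self.avail_ring[(self.next_avail % self.size()) as usize];
        self.next_avail = self.next_avail.wrapping_add(1);
        let mut chain = Vec::new();
        let mut index = head;
        loop {
            let d = self.desc[index as usize];
            chain.push(d);
            // Stop at the end of the chain, or if the chain is longer than
            // the table (which would indicate a loop).
            if (d.flags & VIRTQ_DESC_F_NEXT) == 0 || chain.len() >= self.desc.len() {
                break;
            }
            index = d.next;
        }
        Some((head, chain))
    }

    /// Device, steps 8-9: record the completed chain and bump the used `idx`.
    fn device_add_used(&mut self, head: u16, written: u32) {
        let slot = (self.used_idx % self.size()) as usize;
        self.used_ring[slot] = (head as u32, written);
        self.used_idx = self.used_idx.wrapping_add(1);
    }
}
```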

A descriptor stores four fields, with the first two, `addr` and `len`,
pointing to the data in memory to which the descriptor refers, as shown in the
diagram below. The `flags` field indicates, for example, whether the buffer is
device-readable or device-writable, or whether another descriptor is chained
after this one (VIRTQ_DESC_F_NEXT flag set). The `next` field stores the index
of the next descriptor if VIRTQ_DESC_F_NEXT is set.

**Requirements for device implementation**

* Abstractions from virtio-queue such as `DescriptorChain` can be used to parse
  descriptors provided by the driver, which represent input or output memory
  areas for device I/O. A descriptor is essentially an (address, length) pair,
  which is subsequently used by the device model operation. We do not check the
  validity of the descriptors, and instead expect any validations to happen
  when the device implementation is attempting to access the corresponding
  areas. Early checks can add non-negligible additional costs, and exclusively
  relying upon them may lead to time-of-check-to-time-of-use race conditions.
* Before reading from or writing to a buffer, the device should validate that
  the buffer is device-readable or device-writable, respectively.

## Design

`QueueT` is a trait that allows different implementations for a `Queue`
object for single-threaded context and multi-threaded context. The
implementations provided in `virtio-queue` are:

1. `Queue` → it is used for the single-threaded context.
2. `QueueSync` → it is used for the multi-threaded context, and is simply
   a wrapper over an `Arc<Mutex<Queue>>`.

Besides the above abstractions, the `virtio-queue` crate also provides the
following ones:

* `Descriptor` → which mostly offers accessors for the members of the
  `Descriptor`.
* `DescriptorChain` → provides accessors for the `DescriptorChain`’s members
  and an `Iterator` implementation for iterating over the `DescriptorChain`;
  there is also an abstraction for iterators over just the device-readable or
  just the device-writable descriptors (`DescriptorChainRwIter`).
* `AvailIter` → a consuming iterator over all available descriptor chain
  heads in the queue.

## Save/Restore Queue

The `Queue` allows saving the state through the `state` function which returns
a `QueueState`. `Queue` objects can be created from a previously saved state by
using `Queue::try_from` on a `QueueState`. The VMM should check for errors when
restoring a `Queue` from a previously saved state.

### Notification suppression

A big part of the `virtio-queue` crate consists of the notification suppression
support. As already mentioned, the driver can send an available buffer
notification to the device when there are new entries in the available ring,
and the device can send a used buffer notification to the driver when there are
new entries in the used ring. Sending a notification every time this happens is
not always efficient: for example, while the driver is processing the used
ring, it does not need to receive another used buffer notification. The
mechanism for suppressing the notifications is detailed in the following
sections from the specification:
- [Used Buffer Notification Suppression](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-400007),
- [Available Buffer Notification Suppression](https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-4800010).

The `Queue` abstraction proposes the following sequence of steps for
processing new available ring entries:

1. the device first disables the notifications to make the driver aware it is
   processing the available ring and does not want interruptions, by using
   `Queue::disable_notification`. Notifications are disabled by the device
   either if VIRTIO_F_EVENT_IDX is not negotiated, and VIRTQ_USED_F_NO_NOTIFY
   is set in the `flags` field of the used ring, or if VIRTIO_F_EVENT_IDX is
   negotiated, and the `avail_event` value is not updated, i.e. it remains set
   to the latest `idx` value of the available ring that was already notified by
   the driver.
2. the device processes the new entries by using the `AvailIter` iterator.
3. the device can enable the notifications now, by using
   `Queue::enable_notification`. Notifications are enabled by the device either
   if VIRTIO_F_EVENT_IDX is not negotiated, and 0 is set in the `flags` field
   of the used ring, or if VIRTIO_F_EVENT_IDX is negotiated, and the
   `avail_event` value is set to the smallest `idx` value of the available ring
   that was not already notified by the driver. This way the device makes sure
   that it won’t miss any notification.

The above steps should be done in a loop to also handle the less likely case
where the driver added new entries just before we re-enabled notifications.

For notifications in the opposite direction (device to driver), the `Queue`
provides the `needs_notification` method, which should be called each time the
device adds a new entry to the used ring. Depending on the `used_event` value
and on the last used value (`signalled_used`), `needs_notification` returns
true to let the device know it should send a notification to the guest.

## Assumptions

We assume the users of the `Queue` implementation won’t attempt to use the
queue before checking that the `ready` bit is set. This can be verified by
calling `Queue::is_valid` which, besides this, also checks that the three
queue parts are valid memory regions.

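
The kind of check behind this first assumption can be sketched as follows. This is not the crate's `Queue::is_valid`: `ToyQueueConfig`, `region_ok` and `is_valid_sketch` are hypothetical names, and guest memory is assumed to be a single contiguous range `[0, mem_size)`. The part sizes and alignments are the ones the specification gives for split virtqueues (descriptor table: 16 bytes per entry, 16-byte aligned; available ring: 6 + 2 × size bytes, 2-byte aligned; used ring: 6 + 8 × size bytes, 4-byte aligned), and the queue size must be a power of two no larger than 32768.

```rust
/// Hypothetical snapshot of the parameters configured by the driver.
#[derive(Clone, Copy)]
struct ToyQueueConfig {
    ready: bool,
    size: u16,
    desc_table: u64, // guest address of the descriptor table
    avail_ring: u64, // guest address of the available ring
    used_ring: u64,  // guest address of the used ring
}

/// True if `[start, start + len)` is aligned and fits in `[0, mem_size)`.
fn region_ok(start: u64, len: u64, align: u64, mem_size: u64) -> bool {
    start % align == 0
        && start.checked_add(len).map_or(false, |end| end <= mem_size)
}

/// Sketch of a readiness and layout check (assumption: flat guest memory).
fn is_valid_sketch(cfg: &ToyQueueConfig, mem_size: u64) -> bool {
    let n = cfg.size as u64;
    cfg.ready
        && cfg.size.is_power_of_two() // also rejects a size of 0
        && cfg.size <= 32768
        && region_ok(cfg.desc_table, 16 * n, 16, mem_size)
        && region_ok(cfg.avail_ring, 6 + 2 * n, 2, mem_size)
        && region_ok(cfg.used_ring, 6 + 8 * n, 4, mem_size)
}
```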
We assume consumers will use `AvailIter::go_to_previous_position` only in
single-threaded contexts.

We assume the users will consume the entries from the available ring in the
way recommended in the documentation, i.e. when the device starts processing
the available ring, it disables the notifications, processes the entries, and
then re-enables notifications.

## License

This project is licensed under either of

- [Apache License](http://www.apache.org/licenses/LICENSE-2.0), Version 2.0
- [BSD-3-Clause License](https://opensource.org/licenses/BSD-3-Clause)