
.. SPDX-License-Identifier: GPL-2.0

PCI pass-thru devices
=====================
In a Hyper-V guest VM, PCI pass-thru devices (also called
virtual PCI devices, or vPCI devices) are physical PCI devices
that are mapped directly into the VM's physical address space.
Guest device drivers can interact directly with the hardware
without intermediation by the host hypervisor. This approach
provides higher bandwidth access to the device with lower
latency, compared with devices that are virtualized by the
hypervisor. The device should appear to the guest just as it
would when running on bare metal, so no changes are required
to the Linux device drivers for the device.

Hyper-V terminology for vPCI devices is "Discrete Device
Assignment" (DDA). Public documentation for Hyper-V DDA is
available here: `DDA`_

.. _DDA: https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/plan/plan-for-deploying-devices-using-discrete-device-assignment

DDA is typically used for storage controllers, such as NVMe,
and for GPUs. A similar mechanism for NICs is called SR-IOV
and produces the same benefits by allowing a guest device
driver to interact directly with the hardware. See Hyper-V
public documentation here: `SR-IOV`_

.. _SR-IOV: https://learn.microsoft.com/en-us/windows-hardware/drivers/network/overview-of-single-root-i-o-virtualization--sr-iov-

This discussion of vPCI devices includes DDA and SR-IOV
devices.

Device Presentation
-------------------
Hyper-V provides full PCI functionality for a vPCI device when
it is operating, so the Linux device driver for the device can
be used unchanged, provided it uses the correct Linux kernel
APIs for accessing PCI config space and for other integration
with Linux. But the initial detection of the PCI device and
its integration with the Linux PCI subsystem must use Hyper-V
specific mechanisms. Consequently, vPCI devices on Hyper-V
have a dual identity. They are initially presented to Linux
guests as VMBus devices via the standard VMBus "offer"
mechanism, so they have a VMBus identity and appear under
/sys/bus/vmbus/devices. The VMBus vPCI driver in Linux at
drivers/pci/controller/pci-hyperv.c handles a newly introduced
vPCI device by fabricating a PCI bus topology and creating all
the normal PCI device data structures in Linux that would
exist if the PCI device were discovered via ACPI on a
bare-metal system. Once those data structures are set up, the
device also has a normal PCI identity in Linux, and the normal
Linux device driver for the device can function as if it
were running in Linux on bare metal. Because vPCI devices are
presented dynamically through the VMBus offer mechanism, they
do not appear in the Linux guest's ACPI tables. vPCI devices
may be added to a VM or removed from a VM at any time during
the life of the VM, and not just during initial boot.

With this approach, the vPCI device is a VMBus device and a
PCI device at the same time. In response to the VMBus offer
message, the hv_pci_probe() function runs and establishes a
VMBus connection to the vPCI VSP on the Hyper-V host. That
connection has a single VMBus channel. The channel is used to
exchange messages with the vPCI VSP for the purpose of setting
up and configuring the vPCI device in Linux. Once the device
is fully configured in Linux as a PCI device, the VMBus
channel is used only if Linux changes the vCPU to be interrupted
in the guest, or if the vPCI device is removed from the VM
while the VM is running. The ongoing operation of the device
happens directly between the Linux device driver for the
device and the hardware, with VMBus and the VMBus channel
playing no role.
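
Because the initial identity is a VMBus one, pci-hyperv.c is registered
as an ordinary VMBus driver whose probe callback is hv_pci_probe(). The
sketch below shows only that registration pattern, under stated
assumptions: the driver and callback names are placeholders, and the
device ID table (which would match the host's PCI pass-through device
class GUID) is elided.

.. code-block:: c

   #include <linux/hyperv.h>
   #include <linux/module.h>

   /* Placeholder probe callback standing in for hv_pci_probe(), which
    * opens the single VMBus channel and then builds the PCI identity. */
   static int example_vpci_probe(struct hv_device *hdev,
                                 const struct hv_vmbus_device_id *dev_id)
   {
           return 0;
   }

   static struct hv_driver example_vpci_drv = {
           .name  = "example-vpci",
           /* .id_table would match the vPCI device class GUID offered
            * by the Hyper-V host (elided here). */
           .probe = example_vpci_probe,
   };

   static int __init example_vpci_init(void)
   {
           return vmbus_driver_register(&example_vpci_drv);
   }
   module_init(example_vpci_init);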

PCI Device Setup
----------------
PCI device setup follows a sequence that Hyper-V originally
created for Windows guests, and that can be ill-suited for
Linux guests due to differences in the overall structure of
the Linux PCI subsystem compared with Windows. Nonetheless,
with a bit of hackery in the Hyper-V virtual PCI driver for
Linux, the virtual PCI device is set up in Linux so that
generic Linux PCI subsystem code and the Linux driver for the
device "just work".

Each vPCI device is set up in Linux to be in its own PCI
domain with a host bridge. The PCI domainID is derived from
bytes 4 and 5 of the instance GUID assigned to the VMBus vPCI
device. The Hyper-V host does not guarantee that these bytes
are unique, so hv_pci_probe() has an algorithm to resolve
collisions. The collision resolution is intended to be stable
across reboots of the same VM so that the PCI domainIDs don't
change, as the domainID appears in the user space
configuration of some NIC related entities.
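
The following is a minimal sketch of that derivation and collision
handling, under stated assumptions: domain_in_use() is a placeholder
for whatever lookup the real driver performs, and the fallback order is
illustrative rather than the exact algorithm in hv_pci_probe().

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Placeholder for the "is this domainID already taken?" check. */
   extern bool domain_in_use(uint16_t domain);

   static uint16_t choose_pci_domain(const uint8_t guid[16])
   {
           /* First choice: bytes 4 and 5 of the instance GUID. */
           uint16_t domain = (uint16_t)guid[5] << 8 | guid[4];
           int i;

           if (!domain_in_use(domain))
                   return domain;

           /* Collision: try other byte pairs of the same GUID, so the
            * result is still a deterministic function of the GUID and
            * therefore stable across reboots of the same VM. */
           for (i = 0; i < 16; i += 2) {
                   domain = (uint16_t)guid[i + 1] << 8 | guid[i];
                   if (!domain_in_use(domain))
                           return domain;
           }

           return 0;   /* the real driver has further fallback handling */
   }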

hv_pci_probe() allocates a guest MMIO range to be used as PCI
config space for the device. This MMIO range is communicated
to the Hyper-V host over the VMBus channel as part of telling
the host that the device is ready to enter d0. See
hv_pci_enter_d0(). When the guest subsequently accesses this
MMIO range, the Hyper-V host intercepts the accesses and maps
them to the physical device PCI config space.

hv_pci_probe() also gets BAR information for the device from
the Hyper-V host, and uses this information to allocate MMIO
space for the BARs. That MMIO space is then set up to be
associated with the host bridge so that it works when generic
PCI subsystem code in Linux processes the BARs.

Finally, hv_pci_probe() creates the root PCI bus. At this
point the Hyper-V virtual PCI driver hackery is done, and the
normal Linux PCI machinery for scanning the root bus works to
detect the device, to perform driver matching, and to
initialize the driver and device.
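
Putting these steps together, the probe-time sequence can be condensed
into the outline below. Every helper name is a descriptive placeholder
(the comment notes where hv_pci_enter_d0() fits); arguments and error
handling are omitted.

.. code-block:: c

   /* Condensed outline of the setup sequence described above. */
   void choose_pci_domain_id(void);       /* from instance GUID bytes 4 and 5 */
   void allocate_config_space_mmio(void); /* guest MMIO used as config space  */
   void enter_d0(void);                   /* see hv_pci_enter_d0()            */
   void query_bar_info_from_host(void);   /* BAR sizes reported by the host   */
   void allocate_bar_mmio(void);          /* MMIO associated with host bridge */
   void create_root_pci_bus(void);        /* generic PCI scanning takes over  */

   void vpci_setup_outline(void)
   {
           choose_pci_domain_id();
           allocate_config_space_mmio();
           enter_d0();
           query_bar_info_from_host();
           allocate_bar_mmio();
           create_root_pci_bus();
   }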

PCI Device Removal
------------------
A Hyper-V host may initiate removal of a vPCI device from a
guest VM at any time during the life of the VM. The removal
is instigated by an admin action taken on the Hyper-V host and
is not under the control of the guest OS.

A guest VM is notified of the removal by an unsolicited
"Eject" message sent from the host to the guest over the VMBus
channel associated with the vPCI device. Upon receipt of such
a message, the Hyper-V virtual PCI driver in Linux
asynchronously invokes Linux kernel PCI subsystem calls to
shut down and remove the device. When those calls are
complete, an "Ejection Complete" message is sent back to
Hyper-V over the VMBus channel indicating that the device has
been removed. At this point, Hyper-V sends a VMBus rescind
message to the Linux guest, which the VMBus driver in Linux
processes by removing the VMBus identity for the device. Once
that processing is complete, all vestiges of the device having
been present are gone from the Linux kernel. The rescind
message also indicates to the guest that Hyper-V has stopped
providing support for the vPCI device in the guest. If the
guest were to attempt to access that device's MMIO space, it
would be an invalid reference.

After sending the Eject message, Hyper-V allows the guest VM
60 seconds to cleanly shut down the device and respond with
Ejection Complete before sending the VMBus rescind message.
If for any reason the Eject steps don't complete
within the allowed 60 seconds, the Hyper-V host forcibly
performs the rescind steps, which will likely result in
cascading errors in the guest because the device is no
longer present from the guest standpoint and accessing the
device MMIO space will fail.

Because ejection is asynchronous and can happen at any point
during the guest VM lifecycle, proper synchronization in the
Hyper-V virtual PCI driver is very tricky. Ejection has been
observed even before a newly offered vPCI device has been
fully set up. The Hyper-V virtual PCI driver has been updated
several times over the years to fix race conditions when
ejections happen at inopportune times. Care must be taken when
modifying this code to prevent re-introducing such problems.
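
The guest-side view of this removal sequence can be summarized by the
sketch below. The message names come from the description above; the
enum values and handler functions are illustrative placeholders, not
the wire protocol or the functions used by the real driver.

.. code-block:: c

   /* Guest-side summary of the removal sequence described above. */
   enum vpci_removal_msg {
           VPCI_MSG_EJECT,           /* host -> guest: start removal        */
           VPCI_MSG_EJECT_COMPLETE,  /* guest -> host: PCI teardown is done */
           VPCI_MSG_RESCIND,         /* host -> guest: VMBus offer revoked  */
   };

   void pci_teardown_device(void);   /* stands in for PCI subsystem calls */
   void send_to_host(enum vpci_removal_msg msg);
   void remove_vmbus_identity(void);

   void handle_removal_msg(enum vpci_removal_msg msg)
   {
           switch (msg) {
           case VPCI_MSG_EJECT:
                   /* Runs asynchronously in the real driver; must finish
                    * within the host's 60 second window. */
                   pci_teardown_device();
                   send_to_host(VPCI_MSG_EJECT_COMPLETE);
                   break;
           case VPCI_MSG_RESCIND:
                   /* Handled by the VMBus driver: the device's VMBus
                    * identity, and any remaining state, goes away. */
                   remove_vmbus_identity();
                   break;
           default:
                   break;
           }
   }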

Interrupt Assignment
--------------------
The Hyper-V virtual PCI driver supports vPCI devices using
MSI, multi-MSI, or MSI-X. Assigning the guest vCPU that will
receive the interrupt for a particular MSI or MSI-X message is
complex because of the way the Linux setup of IRQs maps onto
the Hyper-V interfaces. For the single-MSI and MSI-X cases,
Linux calls hv_compose_msi_msg() twice, with the first call
containing a dummy vCPU and the second call containing the
real vCPU. Furthermore, hv_irq_unmask() is finally called
(on x86) or the GICD registers are set (on arm64) to specify
the real vCPU again. Each of these three calls interacts
with Hyper-V, which must decide which physical CPU should
receive the interrupt before it is forwarded to the guest VM.
Unfortunately, the Hyper-V decision-making process is a bit
limited, and can result in concentrating the physical
interrupts on a single CPU, causing a performance bottleneck.
See details about how this is resolved in the extensive
comment above the function hv_compose_msi_req_get_cpu().

The Hyper-V virtual PCI driver implements the
irq_chip.irq_compose_msi_msg function as hv_compose_msi_msg().
Unfortunately, on Hyper-V the implementation requires sending
a VMBus message to the Hyper-V host and awaiting an interrupt
indicating receipt of a reply message. Since
irq_chip.irq_compose_msi_msg can be called with IRQ locks
held, it doesn't work to do the normal sleep until awakened by
the interrupt. Instead, hv_compose_msi_msg() must send the
VMBus message and then poll for the completion message. As
further complexity, the vPCI device could be ejected/rescinded
while the polling is in progress, so this scenario must be
detected as well. See comments in the code regarding this
very tricky area.
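
The send-then-poll pattern, and the need to notice a rescind while
polling, can be sketched as follows. All names are placeholders for
logic that lives in hv_compose_msi_msg(); the flags would be set from
the VMBus channel callback in the real driver.

.. code-block:: c

   #include <stdbool.h>

   struct msi_request {
           volatile bool reply_received;    /* set when the host's reply arrives */
           volatile bool device_rescinded;  /* set if the vPCI device goes away  */
   };

   bool send_vmbus_message(struct msi_request *req);
   void process_pending_channel_messages(void);

   /* Cannot sleep: irq_chip.irq_compose_msi_msg() may be called with IRQ
    * locks held, so completion is detected by polling. */
   int compose_msi_poll(struct msi_request *req)
   {
           if (!send_vmbus_message(req))
                   return -1;

           while (!req->reply_received) {
                   /* The device can be ejected/rescinded while polling;
                    * detect that, or this loop would never terminate. */
                   if (req->device_rescinded)
                           return -1;
                   process_pending_channel_messages();
           }
           return 0;
   }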

Most of the code in the Hyper-V virtual PCI driver
(pci-hyperv.c) applies to Hyper-V and Linux guests running on
x86 and on arm64 architectures. But there are differences in
how interrupt assignments are managed. On x86, the Hyper-V
virtual PCI driver in the guest must make a hypercall to tell
Hyper-V which guest vCPU should be interrupted by each
MSI/MSI-X interrupt, and the x86 interrupt vector number that
the x86_vector IRQ domain has picked for the interrupt. This
hypercall is made by hv_arch_irq_unmask(). On arm64, the
Hyper-V virtual PCI driver manages the allocation of an SPI
for each MSI/MSI-X interrupt. The Hyper-V virtual PCI driver
stores the allocated SPI in the architectural GICD registers,
which Hyper-V emulates, so no hypercall is necessary as with
x86. Hyper-V does not support using LPIs for vPCI devices in
arm64 guests, as they would require the use of an ITS, which
Hyper-V does not emulate.

The Hyper-V virtual PCI driver in Linux supports vPCI devices
whose drivers create managed or unmanaged Linux IRQs. If the
smp_affinity for an unmanaged IRQ is updated via the /proc/irq
interface, the Hyper-V virtual PCI driver is called to tell
the Hyper-V host to change the interrupt targeting and
everything works properly. However, on x86 if the x86_vector
IRQ domain needs to reassign an interrupt vector due to
running out of vectors on a CPU, there's no path to inform the
Hyper-V host of the change, and things break. Fortunately,
guest VMs operate in a constrained device environment where
using all the vectors on a CPU doesn't happen. Since such a
problem is only theoretical, it is left unaddressed.

DMA
---
By default, Hyper-V pins all guest VM memory in the host
when the VM is created, and programs the physical IOMMU to
allow the VM to have DMA access to all its memory. Hence
it is safe to assign PCI devices to the VM, and allow the
guest operating system to program the DMA transfers. The
physical IOMMU prevents a malicious guest from initiating
DMA to memory belonging to the host or to other VMs on the
host. From the Linux guest standpoint, such DMA transfers
are in "direct" mode since Hyper-V does not provide a virtual
IOMMU in the guest.
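
Because the mapping is "direct", a device driver in the guest simply
uses the standard kernel DMA API, as in the generic sketch below. This
is not code from pci-hyperv.c, just the usual pattern whose mappings
resolve to dma-direct in these VMs.

.. code-block:: c

   #include <linux/dma-mapping.h>
   #include <linux/pci.h>

   /* Ordinary DMA API usage; with no virtual IOMMU in the guest, the
    * mapping is handled by dma-direct. */
   static int example_map_buffer(struct pci_dev *pdev, void *buf,
                                 size_t len, dma_addr_t *out)
   {
           dma_addr_t addr = dma_map_single(&pdev->dev, buf, len,
                                            DMA_TO_DEVICE);

           if (dma_mapping_error(&pdev->dev, addr))
                   return -ENOMEM;

           *out = addr;   /* programmed into the device for the transfer */
           return 0;
   }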

Hyper-V assumes that physical PCI devices always perform
cache-coherent DMA. When running on x86, this behavior is
required by the architecture. When running on arm64, the
architecture allows for both cache-coherent and
non-cache-coherent devices, with the behavior of each device
specified in the ACPI DSDT. But when a PCI device is assigned
to a guest VM, that device does not appear in the DSDT, so the
Hyper-V VMBus driver propagates cache-coherency information
from the VMBus node in the ACPI DSDT to all VMBus devices,
including vPCI devices (since they have a dual identity as a
VMBus device and as a PCI device). See vmbus_dma_configure().
Current Hyper-V versions always indicate that the VMBus is
cache coherent, so vPCI devices on arm64 always get marked as
cache coherent and the CPU does not perform any sync
operations as part of dma_map/unmap_*() calls.

vPCI protocol versions
----------------------
As previously described, during vPCI device setup and teardown,
messages are passed over a VMBus channel between the Hyper-V
host and the Hyper-V vPCI driver in the Linux guest. Some
messages have been revised in newer versions of Hyper-V, so
the guest and host must agree on the vPCI protocol version to
be used. The version is negotiated when communication over
the VMBus channel is first established. See
hv_pci_protocol_negotiation(). Newer versions of the protocol
extend support to VMs with more than 64 vCPUs, and provide
additional information about the vPCI device, such as the
guest virtual NUMA node to which it is most closely affined in
the underlying hardware.
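
The negotiation follows a "offer newest first, fall back" pattern,
sketched below. The version values and helper function are
illustrative placeholders, not the constants or messages used by
hv_pci_protocol_negotiation().

.. code-block:: c

   #include <stdint.h>

   /* Placeholder: sends a version query to the host over the VMBus
    * channel and reports whether the host accepted it. */
   int host_accepts_version(uint32_t version);

   static const uint32_t versions[] = { 4, 3, 2, 1 };   /* newest first */

   int negotiate_vpci_version(uint32_t *negotiated)
   {
           unsigned int i;

           for (i = 0; i < sizeof(versions) / sizeof(versions[0]); i++) {
                   if (host_accepts_version(versions[i])) {
                           *negotiated = versions[i];
                           return 0;
                   }
           }
           return -1;   /* no mutually supported protocol version */
   }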

Guest NUMA node affinity
------------------------
When the vPCI protocol version provides it, the guest NUMA
node affinity of the vPCI device is stored as part of the
Linux device information for subsequent use by the Linux
driver. See hv_pci_assign_numa_node(). If the negotiated
protocol version does not support the host providing NUMA
affinity information, the Linux guest defaults the device NUMA
node to 0. But even when the negotiated protocol version
includes NUMA affinity information, the ability of the host to
provide such information depends on certain host configuration
options. If the guest receives NUMA node value "0", it could
mean NUMA node 0, or it could mean "no information is
available". Unfortunately, it is not possible to distinguish
the two cases from the guest side.

PCI config space access in a CoCo VM
------------------------------------
Linux PCI device drivers access PCI config space using a
standard set of functions provided by the Linux PCI subsystem.
In Hyper-V guests these standard functions map to functions
hv_pcifront_read_config() and hv_pcifront_write_config()
in the Hyper-V virtual PCI driver. In normal VMs,
these hv_pcifront_*() functions directly access the PCI config
space, and the accesses trap to Hyper-V to be handled.
But in CoCo VMs, memory encryption prevents Hyper-V
from reading the guest instruction stream to emulate the
access, so the hv_pcifront_*() functions must invoke
hypercalls with explicit arguments describing the access to be
made.
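
The contrast between the two paths can be sketched as below. The type,
field, and function names are placeholders, and the hypercall wrapper
is an assumption made for illustration; the real logic is in the
hv_pcifront_*() functions.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   struct vpci_dev {
           volatile void *cfg_mmio;  /* per-device config space MMIO window */
           bool is_coco_vm;          /* guest memory is encrypted            */
   };

   /* Placeholder for a hypercall that carries the offset and size as
    * explicit arguments, so no instruction emulation is needed. */
   uint32_t hv_hypercall_read_cfg(struct vpci_dev *dev, unsigned int offset,
                                  unsigned int size);

   uint32_t read_cfg_dword(struct vpci_dev *dev, unsigned int offset)
   {
           if (dev->is_coco_vm)
                   return hv_hypercall_read_cfg(dev, offset, 4);

           /* Normal VM: a plain MMIO read that traps to Hyper-V, which
            * emulates it against the physical device's config space. */
           return *(volatile const uint32_t *)
                  ((volatile const uint8_t *)dev->cfg_mmio + offset);
   }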

Config Block back-channel
-------------------------
The Hyper-V host and Hyper-V virtual PCI driver in Linux
together implement a non-standard back-channel communication
path between the host and guest. The back-channel path uses
messages sent over the VMBus channel associated with the vPCI
device. The functions hyperv_read_cfg_blk() and
hyperv_write_cfg_blk() are the primary interfaces provided to
other parts of the Linux kernel. As of this writing, these
interfaces are used only by the Mellanox mlx5 driver to pass
diagnostic data to a Hyper-V host running in the Azure public
cloud. The functions hyperv_read_cfg_blk() and
hyperv_write_cfg_blk() are implemented in a separate module
(pci-hyperv-intf.c, under CONFIG_PCI_HYPERV_INTERFACE) that
effectively stubs them out when running in non-Hyper-V
environments.
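
A caller of the read side might look like the sketch below, in the
style of the mlx5 usage mentioned above. The argument order shown is an
assumption made for illustration; see include/linux/hyperv.h for the
actual prototypes. Because of the stub module, callers need no
platform checks of their own.

.. code-block:: c

   #include <linux/hyperv.h>
   #include <linux/pci.h>

   /* Fetch one diagnostic config block from the host back-channel. */
   static int example_fetch_cfg_block(struct pci_dev *pdev, void *buf,
                                      unsigned int buf_len,
                                      unsigned int block_id)
   {
           unsigned int bytes_returned = 0;
           int ret;

           ret = hyperv_read_cfg_blk(pdev, buf, buf_len, block_id,
                                     &bytes_returned);
           if (ret)
                   return ret;            /* back-channel not available */

           return bytes_returned;         /* valid bytes now in buf */
   }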