2024-09-23, Hall A+B
Compute Express Link (CXL) is an open-standard interconnect built on the industry-standard PCI Express (PCIe) layers. It improves data center performance and efficiency by enabling high-speed, low-latency communication between CPUs and devices such as accelerators and memory. It defines three protocols: CXL.io as the control protocol, CXL.cache as the host-device cache-coherent protocol, and CXL.mem as the load/store memory access protocol. CXL Type 2 devices use all three protocols to integrate with host CPUs, providing a unified interface for high-speed data transfer and memory sharing. This integration is crucial for heterogeneous computing environments, where accelerators such as GPUs and other specialized processors handle intensive workloads.
VFIO is the standard interface the Linux kernel uses to pass a host device, such as a PCI device, through to a virtual machine (VM). For PCI passthrough, VFIO provides several modules: vfio-pci (the generic PCI stub driver), VFIO variant drivers (vendor-specific PCI stub drivers), and vfio-pci-core (the core functions shared by vfio-pci and the VFIO variant drivers). Through the VFIO UABIs, a user-space device model such as QEMU can map the device registers and memory regions into the VM, allowing the VM to access the device directly. With a VFIO variant driver from the hardware vendor, VFIO can also support mediated passthrough and live migration for use cases such as vGPU. Although CXL is built upon the PCI layers, passing through a CXL Type 2 device differs from passing through a PCI device because of requirements in the CXL specification, e.g. emulating CXL DVSECs, handling CXL-defined register regions in the BAR, and exposing CXL HDM regions. Thus, a new set of VFIO CXL modules needs to be introduced.
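To make the DVSEC point concrete, here is a minimal sketch of how a CXL DVSEC can be located by walking the PCIe extended capability list in a device's configuration space. It is an illustration only, not the code of the proposed VFIO CXL modules. The function names and the synthetic configuration space are hypothetical; the header layout (capability ID 0x0023 for DVSEC, extended capabilities starting at offset 0x100) and the CXL DVSEC vendor ID 0x1E98 follow the PCIe and CXL specifications. The revision and length values in the demo are arbitrary placeholders.

```python
import struct

PCI_EXT_CAP_START = 0x100       # extended capabilities start at this offset
PCI_EXT_CAP_ID_DVSEC = 0x0023   # Designated Vendor-Specific Extended Capability
CXL_DVSEC_VENDOR_ID = 0x1E98    # vendor ID used in CXL-defined DVSEC headers


def read32(cfg, off):
    """Read a little-endian 32-bit register from a config space buffer."""
    return struct.unpack_from("<I", cfg, off)[0]


def find_cxl_dvsecs(cfg):
    """Walk the extended capability list and return the CXL DVSECs found."""
    found = []
    visited = set()              # guards against a looping capability chain
    off = PCI_EXT_CAP_START
    while off >= PCI_EXT_CAP_START and off + 4 <= len(cfg) and off not in visited:
        visited.add(off)
        hdr = read32(cfg, off)
        if hdr in (0, 0xFFFFFFFF):           # empty list or absent device
            break
        cap_id = hdr & 0xFFFF                # bits 15:0
        next_off = (hdr >> 20) & 0xFFC       # bits 31:20, dword aligned
        if cap_id == PCI_EXT_CAP_ID_DVSEC and off + 12 <= len(cfg):
            hdr1 = read32(cfg, off + 4)      # vendor ID, revision, length
            hdr2 = read32(cfg, off + 8)      # DVSEC ID in bits 15:0
            if hdr1 & 0xFFFF == CXL_DVSEC_VENDOR_ID:
                found.append({
                    "offset": off,
                    "dvsec_id": hdr2 & 0xFFFF,
                    "revision": (hdr1 >> 16) & 0xF,
                    "length": (hdr1 >> 20) & 0xFFF,
                })
        off = next_off
    return found


def build_demo_config_space():
    """Build a synthetic 4 KiB config space (hypothetical demo data)."""
    cfg = bytearray(4096)

    def put_cap(off, cap_id, next_off):
        struct.pack_into("<I", cfg, off, cap_id | (1 << 16) | (next_off << 20))

    def put_dvsec(off, next_off, vendor, dvsec_id, rev=1, length=0x10):
        put_cap(off, PCI_EXT_CAP_ID_DVSEC, next_off)
        struct.pack_into("<I", cfg, off + 4, vendor | (rev << 16) | (length << 20))
        struct.pack_into("<I", cfg, off + 8, dvsec_id)

    put_cap(0x100, 0x0001, 0x200)                      # some non-DVSEC capability
    put_dvsec(0x200, 0x300, CXL_DVSEC_VENDOR_ID, 0x0)  # PCIe DVSEC for CXL Devices
    put_dvsec(0x300, 0x400, 0x8086, 0x1)               # DVSEC from another vendor
    put_dvsec(0x400, 0x000, CXL_DVSEC_VENDOR_ID, 0x8)  # Register Locator DVSEC
    return cfg


if __name__ == "__main__":
    for dvsec in find_cxl_dvsecs(build_demo_config_space()):
        print(f"CXL DVSEC id={dvsec['dvsec_id']:#x} at offset {dvsec['offset']:#x}")
```

In a real passthrough setup the bytes would come from the device's configuration space rather than a synthetic buffer, and a passthrough layer could plausibly use a walk like this to decide which DVSEC ranges to emulate rather than expose unmodified.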
In this talk, we review the requirements of a CXL Type 2 device and discuss the architecture design of the VFIO CXL modules, their UABIs, and the changes required in the kernel CXL core and QEMU in addition to VFIO.
Zhi is an open-source developer working on vGPU, confidential computing, and virtualization. He currently works on NVIDIA vGPU. In confidential computing, he is interested in Intel TDX and AMD SEV-SNP, and he worked on TDX Connect enablement at Intel.