.. SPDX-License-Identifier: GPL-2.0

===========================================================
POWER9 eXternal Interrupt Virtualization Engine (XIVE Gen1)
===========================================================

Device types supported:
  - KVM_DEV_TYPE_XIVE     POWER9 XIVE Interrupt Controller generation 1

This device acts as a VM interrupt controller. It provides the KVM
interface to configure the interrupt sources of a VM in the underlying
POWER9 XIVE interrupt controller.

Only one XIVE instance may be instantiated. A guest XIVE device
requires a POWER9 host and the guest OS should have support for the
XIVE native exploitation interrupt mode. If not, it should run using
the legacy interrupt mode, referred to as XICS (POWER7/8).

* Device Mappings

  The KVM device exposes different MMIO ranges of the XIVE HW which
  are required for interrupt management. These are exposed to the
  guest in VMAs populated with a custom VM fault handler.

  1. Thread Interrupt Management Area (TIMA)

     Each thread has an associated Thread Interrupt Management context
     composed of a set of registers. These registers let the thread
     handle priority management and interrupt acknowledgment.
     The most important are:

      - Interrupt Pending Buffer     (IPB)
      - Current Processor Priority   (CPPR)
      - Notification Source Register (NSR)

     They are exposed to software in four different pages, each offering
     a view with a different privilege. The first page is for the
     physical thread context and the second for the hypervisor. Only the
     third (operating system) and the fourth (user level) are exposed to
     the guest.

  2. Event State Buffer (ESB)

     Each source is associated with an Event State Buffer (ESB) with
     a pair of even/odd pages which provide commands to manage the
     source: to trigger, to EOI, or to turn off the source, for
     instance.

  3. Device pass-through

     When a device is passed-through into the guest, the source
     interrupts are from a different HW controller (PHB4) and the ESB
     pages exposed to the guest should accommodate this change.

     The passthru_irq helpers, kvmppc_xive_set_mapped() and
     kvmppc_xive_clr_mapped() are called when the device HW irqs are
     mapped into or unmapped from the guest IRQ number space. The KVM
     device extends these helpers to clear the ESB pages of the guest IRQ
     number being mapped and then lets the VM fault handler repopulate.
     The handler will insert the ESB page corresponding to the HW
     interrupt of the device being passed-through or the initial IPI ESB
     page if the device has been removed.

     The ESB remapping is fully transparent to the guest and the OS
     device driver. All handling is done within VFIO and the above
     helpers in KVM-PPC.

* Groups:

1. KVM_DEV_XIVE_GRP_CTRL
     Provides global controls on the device

  Attributes:
    1.1 KVM_DEV_XIVE_RESET (write only)
    Resets the interrupt controller configuration for sources and event
    queues. To be used by kexec and kdump.

    Errors: none

    1.2 KVM_DEV_XIVE_EQ_SYNC (write only)
    Sync all the sources and queues and mark the EQ pages dirty. This
    is to make sure that a consistent memory state is captured when
    migrating the VM.

    Errors: none

    1.3 KVM_DEV_XIVE_NR_SERVERS (write only)
    The kvm_device_attr.addr points to a __u32 value which is the number of
    interrupt server numbers (i.e., highest possible vcpu id plus one).

    Errors:

      =======  ==========================================
      -EINVAL  Value greater than KVM_MAX_VCPU_IDS.
      -EFAULT  Invalid user pointer for attr->addr.
      -EBUSY   A vCPU is already connected to the device.
      =======  ==========================================

2. KVM_DEV_XIVE_GRP_SOURCE (write only)
     Initializes a new source in the XIVE device and masks it.

  Attributes:
    Interrupt source number  (64-bit)

  The kvm_device_attr.addr points to a __u64 value::

    bits:     | 63   ....  2 |   1   |   0
    values:   |    unused    | level | type

  - type:  0:MSI 1:LSI
  - level: assertion level in case of an LSI.

  Errors:

    =======  ==========================================
    -E2BIG   Interrupt source number is out of range
    -ENOMEM  Could not create a new source block
    -EFAULT  Invalid user pointer for attr->addr.
    -ENXIO   Could not allocate underlying HW interrupt
    =======  ==========================================

3. KVM_DEV_XIVE_GRP_SOURCE_CONFIG (write only)
     Configures source targeting

  Attributes:
    Interrupt source number  (64-bit)

  The kvm_device_attr.addr points to a __u64 value::

    bits:     | 63   ....  33 |  32  | 31 .. 3 |  2 .. 0
    values:   |    eisn       | mask |  server | priority

  - priority: 0-7 interrupt priority level
  - server: CPU number chosen to handle the interrupt
  - mask: mask flag (unused)
  - eisn: Effective Interrupt Source Number

  Errors:

    =======  =======================================================
    -ENOENT  Unknown source number
    -EINVAL  Not initialized source number
    -EINVAL  Invalid priority
    -EINVAL  Invalid CPU number.
    -EFAULT  Invalid user pointer for attr->addr.
    -ENXIO   CPU event queues not configured or configuration of the
             underlying HW interrupt failed
    -EBUSY   No CPU available to serve interrupt
    =======  =======================================================

4. KVM_DEV_XIVE_GRP_EQ_CONFIG (read-write)
     Configures an event queue of a CPU

  Attributes:
    EQ descriptor identifier (64-bit)

  The EQ descriptor identifier is a tuple (server, priority)::

    bits:     | 63   ....  32 | 31 .. 3 |  2 .. 0
    values:   |    unused     |  server | priority

  The kvm_device_attr.addr points to::

    struct kvm_ppc_xive_eq {
	__u32 flags;
	__u32 qshift;
	__u64 qaddr;
	__u32 qtoggle;
	__u32 qindex;
	__u8  pad[40];
    };

  - flags: queue flags
      KVM_XIVE_EQ_ALWAYS_NOTIFY (required)
	forces notification without using the coalescing mechanism
	provided by the XIVE END ESBs.
  - qshift: queue size (power of 2)
  - qaddr: real address of queue
  - qtoggle: current queue toggle bit
  - qindex: current queue index
  - pad: reserved for future use

  Errors:

    =======  =========================================
    -ENOENT  Invalid CPU number
    -EINVAL  Invalid priority
    -EINVAL  Invalid flags
    -EINVAL  Invalid queue size
    -EINVAL  Invalid queue address
    -EFAULT  Invalid user pointer for attr->addr.
    -EIO     Configuration of the underlying HW failed
    =======  =========================================

5. KVM_DEV_XIVE_GRP_SOURCE_SYNC (write only)
     Synchronize the source to flush event notifications

  Attributes:
    Interrupt source number  (64-bit)

  Errors:

    =======  =============================
    -ENOENT  Unknown source number
    -EINVAL  Not initialized source number
    =======  =============================

* VCPU state

  The XIVE IC maintains VP interrupt state in an internal structure
  called the NVT. When a VP is not dispatched on a HW processor
  thread, this structure can be updated by HW if the VP is the target
  of an event notification.

  It is important for migration to capture the cached IPB from the NVT
  as it synthesizes the priorities of the pending interrupts. We
  capture a bit more to report debug information.

  KVM_REG_PPC_VP_STATE (2 * 64bits)::

    bits:     |  63  ....  32  |  31  ....  0  |
    values:   |   TIMA word0   |   TIMA word1  |
    bits:     | 127       ..........       64  |
    values:   |            unused              |

* Migration:

  Saving the state of a VM using the XIVE native exploitation mode
  should follow a specific sequence. When the VM is stopped:

  1. Mask all sources (PQ=01) to stop the flow of events.

  2. Sync the XIVE device with the KVM control KVM_DEV_XIVE_EQ_SYNC to
     flush any in-flight event notification and to stabilize the EQs. At
     this stage, the EQ pages are marked dirty to make sure they are
     transferred in the migration sequence.

  3. Capture the state of the source targeting, the EQs configuration
     and the state of thread interrupt context registers.

  Restore is similar:

  1. Restore the EQ configuration, as targeting depends on it.
  2. Restore targeting
  3. Restore the thread interrupt contexts
  4. Restore the source states
  5. Let the vCPU run
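
* Example: packing attribute values

  The bit layouts given under Groups for KVM_DEV_XIVE_GRP_SOURCE,
  KVM_DEV_XIVE_GRP_SOURCE_CONFIG and the EQ descriptor identifier can be
  illustrated with a few packing helpers. This is only a sketch: the
  helper names and ``struct xive_eq_example`` are hypothetical, are not
  part of the KVM UAPI, and merely mirror the layouts documented above.
  A VMM would presumably hand the resulting values to the
  KVM_SET_DEVICE_ATTR ioctl through kvm_device_attr.attr and
  kvm_device_attr.addr.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the EQ structure layout shown above (hypothetical name). */
struct xive_eq_example {
	uint32_t flags;
	uint32_t qshift;
	uint64_t qaddr;
	uint32_t qtoggle;
	uint32_t qindex;
	uint8_t  pad[40];
};

/* The documented fields add up to 64 bytes. */
_Static_assert(sizeof(struct xive_eq_example) == 64, "unexpected EQ layout");

/* KVM_DEV_XIVE_GRP_SOURCE value: bit 0 is the type, bit 1 the level. */
static inline uint64_t xive_source_value(int is_lsi, int level)
{
	return (uint64_t)(is_lsi ? 1 : 0) | ((uint64_t)(level ? 1 : 0) << 1);
}

/* EQ descriptor identifier: priority in bits 2..0, server in bits 31..3. */
static inline uint64_t xive_eq_id(uint32_t server, uint32_t priority)
{
	assert(priority <= 7);          /* 3-bit field */
	assert(server <= 0x1fffffffu);  /* 29-bit field */
	return ((uint64_t)server << 3) | priority;
}

/*
 * KVM_DEV_XIVE_GRP_SOURCE_CONFIG value: same low 32 bits as the EQ
 * identifier, plus the mask flag in bit 32 and the eisn in bits 63..33.
 */
static inline uint64_t xive_source_config(uint32_t eisn, int masked,
					  uint32_t server, uint32_t priority)
{
	assert(eisn <= 0x7fffffffu);    /* 31-bit field */
	return ((uint64_t)eisn << 33) |
	       ((uint64_t)(masked ? 1 : 0) << 32) |
	       xive_eq_id(server, priority);
}
```

  For instance, xive_source_config(0x10, 0, 2, 5) yields 0x2000000015
  (EISN 0x10, server 2, priority 5), and xive_eq_id(2, 5) yields 0x15.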