.. SPDX-License-Identifier: GPL-2.0
.. iommu:

=====================================
IOMMU Userspace API
=====================================

IOMMU UAPI is used for virtualization cases where communications are
needed between physical and virtual IOMMU drivers. For baremetal
usage, the IOMMU is a system device which does not need to communicate
with userspace directly.

The primary use cases are guest Shared Virtual Address (SVA) and
guest IO virtual address (IOVA), wherein the vIOMMU implementation
relies on the physical IOMMU and for this reason requires interactions
with the host driver.

.. contents:: :local:

Functionalities
===============
Communication between user and kernel goes in both directions. The
supported user-kernel APIs are as follows:

1. Bind/Unbind guest PASID (e.g. Intel VT-d)
2. Bind/Unbind guest PASID table (e.g. ARM SMMU)
3. Invalidate IOMMU caches upon guest requests
4. Report errors to the guest and serve page requests

Requirements
============
The IOMMU UAPIs are generic and extensible to meet the following
requirements:

1. Emulated and para-virtualised vIOMMUs
2. Multiple vendors (Intel VT-d, ARM SMMU, etc.)
3. Extensions to the UAPI shall not break existing userspace

Interfaces
==========
Although the data structures defined in IOMMU UAPI are self-contained,
there are no user API functions introduced. Instead, IOMMU UAPI is
designed to work with existing user driver frameworks such as VFIO.

Extension Rules & Precautions
-----------------------------
When IOMMU UAPI gets extended, the data structures can *only* be
modified in two ways:

1. Adding new fields by re-purposing the padding[] field. No size change.
2. Adding new union members at the end. May increase the structure sizes.

No new fields can be added *after* the variable-sized union, because
doing so would break backward compatibility when the offset moves. A new
flag must be introduced whenever a change affects the structure using
either method. The IOMMU driver processes the data based on flags, which
ensures backward compatibility.

The version field is reserved only for the unlikely event of a UAPI
upgrade in its entirety.

It's *always* the caller's responsibility to indicate the size of the
structure passed by setting argsz appropriately.
At the same time, argsz is user-provided data and is not trusted.
The argsz field allows the user app to indicate how much data
it is providing; it's still the kernel's responsibility to validate
whether it's correct and sufficient for the requested operation.

Compatibility Checking
----------------------
When an IOMMU UAPI extension results in a structure size increase,
IOMMU UAPI code shall handle the following cases:

1. User and kernel have an exact size match
2. An older user with older kernel header (smaller UAPI size) running on a
   newer kernel (larger UAPI size)
3. A newer user with newer kernel header (larger UAPI size) running
   on an older kernel.
4. A malicious/misbehaving user passing illegal/invalid size but within
   range. The data may contain garbage.

Feature Checking
----------------
While launching a guest with vIOMMU, it is strongly advised to check
the compatibility upfront, as some subsequent errors happening during
vIOMMU operation, such as cache invalidation failures, cannot be nicely
escalated to the guest due to IOMMU specifications. This can lead to
catastrophic failures for the users.

User applications such as QEMU are expected to import kernel UAPI
headers. Backward compatibility is supported per feature flags.
For example, an older QEMU (with older kernel header) can run on a newer
kernel.
Newer QEMU (with new kernel header) may refuse to initialize
on an older kernel if new feature flags are not supported by the older
kernel. Simply recompiling existing code with a newer kernel header should
not be an issue, in that only existing flags are used.

The IOMMU vendor driver should report the features below to IOMMU UAPI
consumers (e.g. via VFIO).

1. IOMMU_NESTING_FEAT_SYSWIDE_PASID
2. IOMMU_NESTING_FEAT_BIND_PGTBL
3. IOMMU_NESTING_FEAT_BIND_PASID_TABLE
4. IOMMU_NESTING_FEAT_CACHE_INVLD
5. IOMMU_NESTING_FEAT_PAGE_REQUEST

Taking VFIO as an example: upon request from VFIO userspace (e.g. QEMU),
VFIO kernel code shall query the IOMMU vendor driver for the support of
the above features. The query result can then be reported back to the
userspace caller. Details can be found in
Documentation/driver-api/vfio.rst.


Data Passing Example with VFIO
------------------------------
As the ubiquitous userspace driver framework, VFIO is already IOMMU
aware and shares many key concepts such as device model, group, and
protection domain. Other user driver frameworks can also be extended
to support IOMMU UAPI but it is outside the scope of this document.

In this tight-knit VFIO-IOMMU interface, the ultimate consumer of the
IOMMU UAPI data is the host IOMMU driver. VFIO facilitates user-kernel
transport, capability checking, security, and life cycle management of
process address space ID (PASID).

VFIO layer conveys the data structures down to the IOMMU driver. It
follows the pattern below::

    struct {
        __u32 argsz;
        __u32 flags;
        __u8  data[];
    };

Here data[] contains the IOMMU UAPI data structures. VFIO has the
freedom to bundle the data as well as parse data size based on its own flags.

In order to determine the size and feature set of the user data, argsz
and flags (or the equivalent) are also embedded in the IOMMU UAPI data
structures.

A "__u32 argsz" field is *always* at the beginning of each structure.
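
Because argsz comes first, a receiver can read the fixed part of a
structure before it knows how much more the caller supplied. The block
below is a minimal userspace sketch of that idea, written against the
rules in this document rather than copied from the kernel.
``struct demo_uapi_data``, ``DEMO_FLAG_PAYLOAD`` and ``demo_copy_uapi()``
are hypothetical names introduced for illustration, and memcpy() stands
in for copy_from_user(), which is only available inside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure; only "argsz comes first" is taken from the text. */
struct demo_uapi_data {
	uint32_t argsz;      /* size the caller claims to provide */
	uint32_t flags;      /* selects which optional fields are valid */
	uint8_t  padding[8]; /* must be zero until re-purposed */
	uint64_t payload;    /* stands in for the trailing variable-sized union */
};

#define DEMO_FLAG_PAYLOAD (1u << 0) /* hypothetical: payload is present */

/*
 * Copy caller data into *out without trusting argsz.
 * Assumption of this sketch: ubuf really holds at least
 * min(argsz, sizeof(*out)) bytes. In the kernel, copy_from_user()
 * would fail safely instead of reading past the user buffer.
 */
int demo_copy_uapi(struct demo_uapi_data *out, const void *ubuf)
{
	const size_t minsz = offsetof(struct demo_uapi_data, payload);
	size_t copysz, i;

	memset(out, 0, sizeof(*out));

	/* Fixed header first: it carries argsz and flags. */
	memcpy(out, ubuf, minsz);

	/* Fields before the variable-sized part are mandatory. */
	if (out->argsz < minsz)
		return -EINVAL;

	/* Unknown flag bits and padding must be zero. */
	if (out->flags & ~DEMO_FLAG_PAYLOAD)
		return -EINVAL;
	for (i = 0; i < sizeof(out->padding); i++)
		if (out->padding[i])
			return -EINVAL;

	/* A flag that needs the payload requires argsz to cover it. */
	if ((out->flags & DEMO_FLAG_PAYLOAD) && out->argsz < sizeof(*out))
		return -EINVAL;

	/* Copy the rest, but never more than this build understands. */
	copysz = out->argsz < sizeof(*out) ? out->argsz : sizeof(*out);
	memcpy((char *)out + minsz, (const char *)ubuf + minsz, copysz - minsz);
	return 0;
}
```

In this sketch a caller that sets argsz to the header size alone is
accepted as long as it sets no flag that needs the payload, and a caller
built against a larger structure is accepted too, since bytes beyond
``sizeof(struct demo_uapi_data)`` are never read.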

For example, the cache invalidation structure starts with argsz::

    struct iommu_cache_invalidate_info {
        __u32 argsz;
        #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
        __u32 version;
        /* IOMMU paging structure cache */
        #define IOMMU_CACHE_INV_TYPE_IOTLB      (1 << 0) /* IOMMU IOTLB */
        #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB  (1 << 1) /* Device IOTLB */
        #define IOMMU_CACHE_INV_TYPE_PASID      (1 << 2) /* PASID cache */
        #define IOMMU_CACHE_INV_TYPE_NR         (3)
        __u8 cache;
        __u8 granularity;
        __u8 padding[6];
        union {
            struct iommu_inv_pasid_info pasid_info;
            struct iommu_inv_addr_info addr_info;
        } granu;
    };

VFIO is responsible for checking its own argsz and flags. It then
invokes appropriate IOMMU UAPI functions. The user pointers are passed
to the IOMMU layer for further processing. The responsibilities are
divided as follows:

- Generic IOMMU layer checks argsz range based on UAPI data in the
  current kernel version.

- Generic IOMMU layer checks content of the UAPI data for non-zero
  reserved bits in flags, padding fields, and unsupported version.
  This is to ensure not breaking userspace in the future when these
  fields or flags are used.

- Vendor IOMMU driver checks argsz based on vendor flags. UAPI data
  is consumed based on flags. Vendor driver has access to
  unadulterated argsz value in case of vendor specific future
  extensions. Currently, it does not perform the copy_from_user()
  itself. A __user pointer can be provided in some future scenarios
  where there's vendor data outside of the structure definition.

IOMMU code treats UAPI data in two categories:

- structure contains vendor data
  (Example: iommu_uapi_cache_invalidate())

- structure contains only generic data
  (Example: iommu_uapi_sva_bind_gpasid())



Sharing UAPI with in-kernel users
---------------------------------
For UAPIs that are shared with in-kernel users, a wrapper function is
provided to distinguish the callers. For example:

Userspace caller ::

    int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
                                     struct device *dev,
                                     void __user *udata);

In-kernel caller ::

    int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
                                struct device *dev, ioasid_t ioasid);
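
The split between the two entry points can be sketched in plain C as
follows. This is an illustration under stated assumptions, not the
kernel code: every ``demo_`` name is hypothetical, the layout of
``struct demo_gpasid_data`` is invented because the real bind data
structure is not shown in this document, and memcpy() stands in for
copy_from_user(). The userspace-facing wrapper is the only place that
touches untrusted data; once the data is copied and validated it hands
a plain PASID value to the same function that in-kernel callers use.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t demo_pasid_t; /* stands in for ioasid_t */

/* Hypothetical user data layout, invented for this sketch. */
struct demo_gpasid_data {
	uint32_t argsz;  /* size provided by the caller */
	uint32_t flags;  /* no flags defined in this sketch, must be zero */
	uint64_t hpasid; /* PASID to unbind */
};

/* Hypothetical stand-in for the domain/device pair. */
struct demo_domain {
	demo_pasid_t bound;
	int has_binding;
};

/* In-kernel entry point: arguments are already trusted kernel values. */
int demo_sva_unbind_gpasid(struct demo_domain *domain, demo_pasid_t pasid)
{
	if (!domain->has_binding || domain->bound != pasid)
		return -ENOENT;
	domain->has_binding = 0;
	return 0;
}

/* Userspace-facing wrapper: copy, validate, then reuse the in-kernel path. */
int demo_uapi_sva_unbind_gpasid(struct demo_domain *domain, const void *udata)
{
	struct demo_gpasid_data data;

	memcpy(&data, udata, sizeof(data)); /* copy_from_user() in a kernel */

	if (data.argsz < sizeof(data) || data.flags)
		return -EINVAL;
	if (data.hpasid > UINT32_MAX)
		return -EINVAL;

	return demo_sva_unbind_gpasid(domain, (demo_pasid_t)data.hpasid);
}
```

Keeping validation in the wrapper means an in-kernel caller never pays
for a copy it does not need, and the shared function never has to ask
where its arguments came from.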