.. SPDX-License-Identifier: GPL-2.0
.. iommu:

=====================================
IOMMU Userspace API
=====================================

IOMMU UAPI is used for virtualization cases where communications are
needed between physical and virtual IOMMU drivers. For baremetal
usage, the IOMMU is a system device which does not need to communicate
with userspace directly.

The primary use cases are guest Shared Virtual Address (SVA) and
guest IO virtual address (IOVA), wherein the vIOMMU implementation
relies on the physical IOMMU and for this reason requires interactions
with the host driver.

.. contents:: :local:

Functionalities
===============
Communication between user and kernel goes in both directions. The
supported user-kernel APIs are as follows:

1. Bind/Unbind guest PASID (e.g. Intel VT-d)
2. Bind/Unbind guest PASID table (e.g. ARM SMMU)
3. Invalidate IOMMU caches upon guest requests
4. Report errors to the guest and serve page requests

Requirements
============
The IOMMU UAPIs are generic and extensible to meet the following
requirements:

1. Emulated and para-virtualised vIOMMUs
2. Multiple vendors (Intel VT-d, ARM SMMU, etc.)
3. Extensions to the UAPI shall not break existing userspace

Interfaces
==========
Although the data structures defined in IOMMU UAPI are self-contained,
there are no user API functions introduced. Instead, IOMMU UAPI is
designed to work with existing user driver frameworks such as VFIO.

Extension Rules & Precautions
-----------------------------
When IOMMU UAPI gets extended, the data structures can *only* be
modified in two ways:

1. Adding new fields by re-purposing the padding[] field. No size change.
2. Adding new union members at the end. May increase the structure sizes.

No new fields can be added *after* the variable sized union, because
doing so would break backward compatibility when the offset moves. A
new flag must be introduced whenever a change affects the structure
using either method. The IOMMU driver processes the data based on
flags, which ensures backward compatibility.
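
The first extension method can be illustrated with a minimal userspace
sketch. The structure and flag names below (demo_bind_v1, demo_bind_v2,
DEMO_FLAG_HINT) are hypothetical and are not part of the IOMMU UAPI, and
fixed-width types from <stdint.h> stand in for __u32 and friends. The
sketch only shows that taking bytes from padding[] moves neither the
union nor the structure size, and that the new field is consumed only
when its flag is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical original layout. */
struct demo_bind_v1 {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  padding[8];
	union {
		uint64_t pgtbl;
	} vendor;
};

/* Hypothetical flag announcing that 'hint' carries valid data. */
#define DEMO_FLAG_HINT	(1u << 0)

/* Method 1: 'hint' re-purposes four of the padding bytes. */
struct demo_bind_v2 {
	uint32_t argsz;
	uint32_t flags;
	uint32_t hint;
	uint8_t  padding[4];
	union {
		uint64_t pgtbl;
		/* Method 2 would append new union members here. */
	} vendor;
};

static_assert(sizeof(struct demo_bind_v1) == sizeof(struct demo_bind_v2),
	      "re-purposing padding must not change the size");
static_assert(offsetof(struct demo_bind_v1, vendor) ==
	      offsetof(struct demo_bind_v2, vendor),
	      "the union offset must not move");

/* Consume 'hint' only when the flag is set; otherwise report 0. */
static uint32_t demo_get_hint(const struct demo_bind_v2 *b)
{
	return (b->flags & DEMO_FLAG_HINT) ? b->hint : 0;
}
```

An older caller that never sets DEMO_FLAG_HINT is unaffected, since the
bytes now named hint were zeroed padding from its point of view.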

The version field is reserved only for the unlikely event of a UAPI
upgrade in its entirety.

It's *always* the caller's responsibility to indicate the size of the
structure passed by setting argsz appropriately. At the same time,
argsz is user-provided data and is not trusted. The argsz field allows
the user app to indicate how much data it is providing; it's still the
kernel's responsibility to validate whether it's correct and
sufficient for the requested operation.
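
A minimal sketch of both sides, using a hypothetical structure
(demo_info is not a real UAPI type): the caller zeroes the structure so
that padding and unused fields are clean, then records in argsz how
many bytes it is providing. The receiver-side check is an assumption
about what "sufficient for the requested operation" could mean in the
simplest case, namely that an operation using the union needs argsz to
cover it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical UAPI-style structure; argsz comes first. */
struct demo_info {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  padding[8];
	union {
		uint64_t addr;
	} u;
};

static_assert(offsetof(struct demo_info, argsz) == 0,
	      "argsz must be the first field");

/* Caller side: zero everything, then state the size being passed. */
static void demo_info_init(struct demo_info *info)
{
	memset(info, 0, sizeof(*info));
	info->argsz = sizeof(*info);
}

/*
 * Receiver side: argsz is untrusted. An operation that reads the union
 * needs the whole structure; any other operation needs at least the
 * mandatory fields that precede the union.
 */
static bool demo_argsz_sufficient(uint32_t argsz, bool needs_addr)
{
	size_t need = needs_addr ? sizeof(struct demo_info)
				 : offsetof(struct demo_info, u);

	return argsz >= need;
}
```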

Compatibility Checking
----------------------
When IOMMU UAPI extension results in some structure size increase,
IOMMU UAPI code shall handle the following cases:

1. User and kernel have an exact size match
2. An older user with older kernel header (smaller UAPI size) running on a
   newer kernel (larger UAPI size)
3. A newer user with newer kernel header (larger UAPI size) running
   on an older kernel.
4. A malicious/misbehaving user passing illegal/invalid size but within
   range. The data may contain garbage.
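
One way the cases above might be handled is sketched below as a
userspace model. Here memcpy() stands in for copy_from_user(), and
demo_inv is a hypothetical structure rather than a real UAPI type. The
mandatory part before the union is read first, argsz is validated
against it, and the remainder is copied only up to the size this side
knows about. Content validation, which is what would catch garbage
data in case 4, is a separate step and is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure as known to the receiving side. */
struct demo_inv {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  padding[8];
	union {
		uint64_t pasid;
		uint64_t addr;
	} granu;
};

static_assert(offsetof(struct demo_inv, argsz) == 0,
	      "argsz must be the first field");

/*
 * Copy a user buffer into 'out'. 'ubuf' models user memory that holds
 * the mandatory header and at least as many bytes as its argsz claims.
 * Returns 0 on success or -EINVAL when argsz is too small.
 */
static int demo_copy_inv(struct demo_inv *out, const void *ubuf)
{
	const size_t minsz = offsetof(struct demo_inv, granu);
	size_t copysz;

	memset(out, 0, sizeof(*out));

	/* Read the mandatory part first to learn argsz and flags. */
	memcpy(out, ubuf, minsz);

	/* A size that cannot hold the mandatory fields is rejected. */
	if (out->argsz < minsz)
		return -EINVAL;

	/*
	 * Cases 1 and 2: argsz <= sizeof(*out), and bytes the user did
	 * not provide stay zero. Case 3: argsz > sizeof(*out), so only
	 * the part this side understands is copied.
	 */
	copysz = out->argsz < sizeof(*out) ? out->argsz : sizeof(*out);
	memcpy((char *)out + minsz, (const char *)ubuf + minsz,
	       copysz - minsz);

	return 0;
}
```

Clamping the copy rather than rejecting a larger argsz is a design
choice made for this sketch: it lets a newer user keep working as long
as it only relies on flags the receiving side already understands.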

Feature Checking
----------------
While launching a guest with vIOMMU, it is strongly advised to check
the compatibility upfront, as some subsequent errors happening during
vIOMMU operation, such as cache invalidation failures, cannot be nicely
escalated to the guest due to IOMMU specifications. This can lead to
catastrophic failures for the users.

User applications such as QEMU are expected to import kernel UAPI
headers. Backward compatibility is supported per feature flags.
For example, an older QEMU (with older kernel header) can run on a newer
kernel. A newer QEMU (with new kernel header) may refuse to initialize
on an older kernel if new feature flags are not supported by the older
kernel. Simply recompiling existing code with a newer kernel header should
not be an issue in that only existing flags are used.

The IOMMU vendor driver should report the below features to IOMMU UAPI
consumers (e.g. via VFIO).

1. IOMMU_NESTING_FEAT_SYSWIDE_PASID
2. IOMMU_NESTING_FEAT_BIND_PGTBL
3. IOMMU_NESTING_FEAT_BIND_PASID_TABLE
4. IOMMU_NESTING_FEAT_CACHE_INVLD
5. IOMMU_NESTING_FEAT_PAGE_REQUEST

Taking VFIO as an example, upon request from VFIO userspace (e.g. QEMU),
VFIO kernel code shall query the IOMMU vendor driver for the support of
the above features. The query result can then be reported back to the
userspace caller. Details can be found in
Documentation/driver-api/vfio.rst.
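
The upfront check can be sketched from the userspace side as a simple
bitmask comparison. The DEMO_FEAT_* names and their bit positions are
assumptions made for this example; they mirror the feature list above
but are not taken from any kernel header, and how the reported mask is
obtained (e.g. through VFIO) is outside the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments mirroring the feature list above. */
#define DEMO_FEAT_SYSWIDE_PASID		(1u << 0)
#define DEMO_FEAT_BIND_PGTBL		(1u << 1)
#define DEMO_FEAT_BIND_PASID_TABLE	(1u << 2)
#define DEMO_FEAT_CACHE_INVLD		(1u << 3)
#define DEMO_FEAT_PAGE_REQUEST		(1u << 4)
#define DEMO_FEAT_ALL			0x1fu

/* Return the required features that the host did not report. */
static uint32_t demo_missing_features(uint32_t reported, uint32_t required)
{
	return required & ~reported;
}

/* Launching is safe only when nothing required is missing. */
static bool demo_can_launch(uint32_t reported, uint32_t required)
{
	/* The caller may only require features this sketch defines. */
	assert((required & ~DEMO_FEAT_ALL) == 0);

	return demo_missing_features(reported, required) == 0;
}
```

A user application would typically run such a check once, before the
guest starts, and refuse to enable the vIOMMU when
demo_missing_features() returns a non-zero mask.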

Data Passing Example with VFIO
------------------------------
As the ubiquitous userspace driver framework, VFIO is already IOMMU
aware and shares many key concepts such as device model, group, and
protection domain. Other user driver frameworks can also be extended
to support IOMMU UAPI but it is outside the scope of this document.

In this tight-knit VFIO-IOMMU interface, the ultimate consumer of the
IOMMU UAPI data is the host IOMMU driver. VFIO facilitates user-kernel
transport, capability checking, security, and life cycle management of
process address space ID (PASID).

The VFIO layer conveys the data structures down to the IOMMU driver. It
follows the pattern below::

   struct {
	__u32 argsz;
	__u32 flags;
	__u8  data[];
   };

Here data[] contains the IOMMU UAPI data structures. VFIO has the
freedom to bundle the data as well as parse data size based on its own flags.

In order to determine the size and feature set of the user data, argsz
and flags (or the equivalent) are also embedded in the IOMMU UAPI data
structures.
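
The nesting of the two argsz fields can be sketched in userspace as
follows. Both structure names are hypothetical: demo_wrapper follows
the transport pattern above, and demo_payload stands in for an IOMMU
UAPI structure. The outer argsz is assumed to cover the header plus
the payload, while the inner argsz covers the payload only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Transport wrapper following the pattern above (hypothetical name). */
struct demo_wrapper {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  data[];
};

/* Hypothetical payload; it starts with its own argsz. */
struct demo_payload {
	uint32_t argsz;
	uint32_t flags;
	uint64_t value;
};

static_assert(offsetof(struct demo_payload, argsz) == 0,
	      "argsz must be the first field");

/*
 * Allocate a wrapper that carries one payload.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct demo_wrapper *demo_wrap(const struct demo_payload *p,
				      uint32_t wrapper_flags)
{
	size_t total = sizeof(struct demo_wrapper) + sizeof(*p);
	struct demo_wrapper *w = calloc(1, total);

	if (!w)
		return NULL;

	w->argsz = (uint32_t)total;	/* header plus payload */
	w->flags = wrapper_flags;
	memcpy(w->data, p, sizeof(*p));	/* inner argsz travels with it */
	return w;
}
```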

A "__u32 argsz" field is *always* at the beginning of each structure.

For example::

   struct iommu_cache_invalidate_info {
	__u32	argsz;
	#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
	__u32	version;
	/* IOMMU paging structure cache */
	#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0) /* IOMMU IOTLB */
	#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1) /* Device IOTLB */
	#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2) /* PASID cache */
	#define IOMMU_CACHE_INV_TYPE_NR		(3)
	__u8	cache;
	__u8	granularity;
	__u8	padding[6];
	union {
		struct iommu_inv_pasid_info pasid_info;
		struct iommu_inv_addr_info addr_info;
	} granu;
   };

VFIO is responsible for checking its own argsz and flags. It then
invokes appropriate IOMMU UAPI functions. The user pointers are passed
to the IOMMU layer for further processing. The responsibilities are
divided as follows:

- Generic IOMMU layer checks argsz range based on UAPI data in the
  current kernel version.

- Generic IOMMU layer checks content of the UAPI data for non-zero
  reserved bits in flags, padding fields, and unsupported version.
  This ensures that userspace is not broken in the future when these
  fields or flags are used.

- Vendor IOMMU driver checks argsz based on vendor flags. UAPI data
  is consumed based on flags. Vendor driver has access to the
  unadulterated argsz value in case of vendor specific future
  extensions. Currently, it does not perform the copy_from_user()
  itself. A __user pointer can be provided in some future scenarios
  where there's vendor data outside of the structure definition.
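
The content check attributed to the generic layer can be sketched as
below. The structure, the version constant and the set of known flag
bits are all hypothetical; the sketch only illustrates the idea of
rejecting a non-zero reserved bit, a non-zero padding byte, or an
unknown version, so that those bits can be given a meaning later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VERSION_1		1
/* Hypothetical set of flag bits that are currently defined. */
#define DEMO_FLAGS_KNOWN	((1u << 0) | (1u << 1))

struct demo_data {
	uint32_t argsz;
	uint32_t version;
	uint32_t flags;
	uint8_t  padding[4];
};

static_assert(offsetof(struct demo_data, argsz) == 0,
	      "argsz must be the first field");

/*
 * Reject anything that would become ambiguous once reserved bits or
 * padding bytes are assigned a meaning. Returns 0 or -EINVAL.
 */
static int demo_check_content(const struct demo_data *d)
{
	size_t i;

	if (d->version != DEMO_VERSION_1)
		return -EINVAL;

	if (d->flags & ~DEMO_FLAGS_KNOWN)
		return -EINVAL;

	for (i = 0; i < sizeof(d->padding); i++) {
		if (d->padding[i])
			return -EINVAL;
	}

	return 0;
}
```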

IOMMU code treats UAPI data in two categories:

- structure contains vendor data
  (Example: iommu_uapi_cache_invalidate())

- structure contains only generic data
  (Example: iommu_uapi_sva_bind_gpasid())

Sharing UAPI with in-kernel users
---------------------------------
For UAPIs that are shared with in-kernel users, a wrapper function is
provided to distinguish the callers. For example:

Userspace caller ::

  int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
                                   struct device *dev,
                                   void __user *udata);

In-kernel caller ::

  int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
                              struct device *dev, ioasid_t ioasid);
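
The split between the two entry points can be modelled in userspace as
follows. All names are hypothetical, the domain and device arguments
are omitted for brevity, and memcpy() stands in for copy_from_user().
The sketch is not the kernel implementation; it only shows the wrapper
validating untrusted data before handing a plain value to the function
that in-kernel callers use directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t demo_ioasid_t;	/* stand-in for ioasid_t */

/* Hypothetical user-provided structure. */
struct demo_unbind_data {
	uint32_t argsz;
	uint32_t flags;
	uint64_t hpasid;
};

static_assert(offsetof(struct demo_unbind_data, argsz) == 0,
	      "argsz must be the first field");

/* Records the last unbound ID so the effect can be observed. */
static demo_ioasid_t demo_last_unbound;

/* In-kernel entry point: the argument is already a trusted value. */
static int demo_sva_unbind(demo_ioasid_t ioasid)
{
	demo_last_unbound = ioasid;
	return 0;
}

/*
 * Userspace entry point: copy and validate the untrusted data, then
 * pass the extracted value to the in-kernel function.
 */
static int demo_uapi_sva_unbind(const void *udata)
{
	struct demo_unbind_data data;

	memcpy(&data, udata, sizeof(data));	/* models copy_from_user() */

	if (data.argsz < sizeof(data) || data.flags)
		return -EINVAL;

	return demo_sva_unbind((demo_ioasid_t)data.hpasid);
}
```

Keeping the validation in the wrapper means in-kernel callers do not
pay for checks that only make sense for user-provided data.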