Buffer Sharing and Synchronization
==================================

The dma-buf subsystem provides the framework for sharing buffers for
hardware (DMA) access across multiple device drivers and subsystems, and
for synchronizing asynchronous hardware access.

This is used, for example, by drm "prime" multi-GPU support, but is of
course not limited to GPU use cases.

The three main components of this are: (1) dma-buf, representing a
sg_table and exposed to userspace as a file descriptor to allow passing
between devices, (2) fence, which provides a mechanism to signal when
one device has finished access, and (3) reservation, which manages the
shared or exclusive fence(s) associated with the buffer.

Shared DMA Buffers
------------------

This document serves as a guide for device-driver writers on what the dma-buf
buffer sharing API is and how to use it for exporting and using shared buffers.

Any device driver which wishes to be a part of DMA buffer sharing can do so as
either the 'exporter' of buffers, or the 'user' or 'importer' of buffers.

If a driver A wants to use buffers created by a driver B, we call B the
exporter and A the buffer-user/importer.

The exporter

 - implements and manages operations in :c:type:`struct dma_buf_ops
   <dma_buf_ops>` for the buffer,
 - allows other users to share the buffer by using dma_buf sharing APIs,
 - manages the details of buffer allocation, wrapped in a :c:type:`struct
   dma_buf <dma_buf>`,
 - decides about the actual backing storage where this allocation happens,
 - and takes care of any migration of scatterlist - for all (shared) users of
   this buffer.

The buffer-user

 - is one of (many) sharing users of the buffer,
 - doesn't need to worry about how the buffer is allocated, or where,
 - and needs a mechanism to get access to the scatterlist that makes up this
   buffer in memory, mapped into its own address space, so it can access the
   same area of memory. This interface is provided by :c:type:`struct
   dma_buf_attachment <dma_buf_attachment>`.

Any exporters or users of the dma-buf buffer sharing framework must have a
'select DMA_SHARED_BUFFER' in their respective Kconfigs.

Userspace Interface Notes
~~~~~~~~~~~~~~~~~~~~~~~~~

Mostly a DMA buffer file descriptor is simply an opaque object for userspace,
and hence the generic interface exposed is very minimal.
There are a few things to
consider though:

- Since kernel 3.12 the dma-buf FD supports the llseek system call, but only
  with offset=0 and whence=SEEK_END|SEEK_SET. SEEK_SET is supported to allow
  the usual size discovery pattern size = SEEK_END(0); SEEK_SET(0). Every other
  llseek operation will report -EINVAL.

  If llseek on dma-buf FDs isn't supported, the kernel will report -ESPIPE for
  all cases. Userspace can use this to detect support for discovering the
  dma-buf size using llseek.

- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
  on the file descriptor. This is not just a resource leak, but a
  potential security hole. It could give the newly exec'd application
  access to buffers, via the leaked fd, to which it should otherwise
  not be permitted access.

  The problem with doing this via a separate fcntl() call, versus doing it
  atomically when the fd is created, is that this is inherently racy in a
  multi-threaded app[3]. The issue is made worse when it is library code
  opening/creating the file descriptor, as the application may not even be
  aware of the fds.

  To avoid this problem, userspace must have a way to request that the
  O_CLOEXEC flag be set when the dma-buf fd is created. So any API provided by
  the exporting driver to create a dmabuf fd must provide a way to let
  userspace control setting of the O_CLOEXEC flag passed in to dma_buf_fd().

- Memory mapping the contents of the DMA buffer is also supported. See the
  discussion below on `CPU Access to DMA Buffer Objects`_ for the full details.

- The DMA buffer FD is also pollable; see `Implicit Fence Poll Support`_ below
  for details.

Basic Operation and Device DMA Access
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: dma buf device access

CPU Access to DMA Buffer Objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: cpu access

Implicit Fence Poll Support
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: implicit fence polling

DMA-BUF statistics
~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf-sysfs-stats.c
   :doc: overview

Kernel Functions and Structures Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :export:

.. kernel-doc:: include/linux/dma-buf.h
   :internal:

Reservation Objects
-------------------

.. kernel-doc:: drivers/dma-buf/dma-resv.c
   :doc: Reservation Object Overview

.. kernel-doc:: drivers/dma-buf/dma-resv.c
   :export:

.. kernel-doc:: include/linux/dma-resv.h
   :internal:

DMA Fences
----------

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: DMA fences overview

DMA Fence Cross-Driver Contract
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: fence cross-driver contract

DMA Fence Signalling Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: fence signalling annotation

DMA Fences Functions Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :export:

.. kernel-doc:: include/linux/dma-fence.h
   :internal:

Seqno Hardware Fences
~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: include/linux/seqno-fence.h
   :internal:

DMA Fence Array
~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence-array.c
   :export:

.. kernel-doc:: include/linux/dma-fence-array.h
   :internal:

DMA Fence uABI/Sync File
~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/sync_file.c
   :export:

.. kernel-doc:: include/linux/sync_file.h
   :internal:

Indefinite DMA Fences
~~~~~~~~~~~~~~~~~~~~~

At various times &dma_fence with an indefinite time until dma_fence_wait()
finishes have been proposed. Examples include:

* Future fences, used in HWC1 to signal when a buffer isn't used by the display
  any longer, and created with the screen update that makes the buffer visible.
  The time this fence completes is entirely under userspace's control.

* Proxy fences, proposed to handle &drm_syncobj for which the fence has not yet
  been set. Used to asynchronously delay command submission.

* Userspace fences or gpu futexes, fine-grained locking within a command buffer
  that userspace uses for synchronization across engines or with the CPU, which
  are then imported as a DMA fence for integration into existing winsys
  protocols.

* Long-running compute command buffers, while still using traditional end of
  batch DMA fences for memory management instead of context preemption DMA
  fences which get reattached when the compute job is rescheduled.

Common to all these schemes is that userspace controls the dependencies of these
fences and controls when they fire. Mixing indefinite fences with normal
in-kernel DMA fences does not work, even when a fallback timeout is included to
protect against malicious userspace:

* Only the kernel knows about all DMA fence dependencies; userspace is not aware
  of dependencies injected due to memory management or scheduler decisions.

* Only userspace knows about all dependencies in indefinite fences and when
  exactly they will complete; the kernel has no visibility.

Furthermore the kernel has to be able to hold up userspace command submission
for memory management needs, which means we must support indefinite fences being
dependent upon DMA fences.
If the kernel also supports indefinite fences in the
same way as a DMA fence, as any of the above proposals would, there is the
potential for deadlocks.

.. kernel-render:: DOT
   :alt: Indefinite Fencing Dependency Cycle
   :caption: Indefinite Fencing Dependency Cycle

   digraph "Fencing Cycle" {
      node [shape=box bgcolor=grey style=filled]
      kernel [label="Kernel DMA Fences"]
      userspace [label="userspace controlled fences"]
      kernel -> userspace [label="memory management"]
      userspace -> kernel [label="Future fence, fence proxy, ..."]

      { rank=same; kernel userspace }
   }

This means that the kernel might accidentally create deadlocks
through memory management dependencies which userspace is unaware of, which
randomly hangs workloads until the timeout kicks in, even though these
workloads, from userspace's perspective, do not contain a deadlock. In such a
mixed fencing architecture there is no single entity with knowledge of all
dependencies. Therefore preventing such deadlocks from within the kernel is not
possible.

The only way to avoid dependency loops is to not allow indefinite
fences in the kernel. This means:

* No future fences, proxy fences or userspace fences imported as DMA fences,
  with or without a timeout.

* No DMA fences that signal end of batchbuffer for command submission where
  userspace is allowed to use userspace fencing or long running compute
  workloads. This also means no implicit fencing for shared buffers in these
  cases.
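
The dependency cycle discussed in this section can be illustrated with a toy
wait-for graph. This is only a sketch in plain userspace C, not kernel code:
the names ``waits_on``, ``add_wait()``, ``has_cycle()`` and ``can_deadlock()``
are hypothetical and exist purely for illustration, and the two nodes stand in
for whole classes of fences rather than individual &dma_fence objects.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
enum { KERNEL_FENCE, USERSPACE_FENCE, NUM_FENCES };

/* waits_on[a][b] is true when fence a cannot signal before fence b does. */
static bool waits_on[NUM_FENCES][NUM_FENCES];

/* Record that 'waiter' depends on 'dependency'. */
static void add_wait(int waiter, int dependency)
{
    assert(waiter >= 0 && waiter < NUM_FENCES);
    assert(dependency >= 0 && dependency < NUM_FENCES);
    waits_on[waiter][dependency] = true;
}

/*
 * Depth-first search: returns true if a chain of waits starting at 'node'
 * leads back to a fence that is already on the current path.
 */
static bool has_cycle(int node, bool on_path[NUM_FENCES], bool done[NUM_FENCES])
{
    if (on_path[node])
        return true;
    if (done[node])
        return false;

    on_path[node] = true;
    for (int next = 0; next < NUM_FENCES; next++) {
        if (waits_on[node][next] && has_cycle(next, on_path, done))
            return true;
    }
    on_path[node] = false;
    done[node] = true;
    return false;
}

/* True if any fence can end up (transitively) waiting on itself. */
static bool can_deadlock(void)
{
    bool done[NUM_FENCES] = { false };

    for (int start = 0; start < NUM_FENCES; start++) {
        bool on_path[NUM_FENCES] = { false };

        if (has_cycle(start, on_path, done))
            return true;
    }
    return false;
}
```

In this model, memory management making userspace-controlled fences depend on
kernel DMA fences corresponds to ``add_wait(USERSPACE_FENCE, KERNEL_FENCE)``,
which on its own leaves the graph acyclic. Adding the reverse edge,
``add_wait(KERNEL_FENCE, USERSPACE_FENCE)``, models importing an indefinite
fence as a DMA fence and closes the loop, so ``can_deadlock()`` then returns
true. Unlike this sketch, neither the kernel nor userspace sees the whole graph
in a mixed fencing architecture, which is why such a check cannot be performed
in practice.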