.. _userfaultfd:

===========
Userfaultfd
===========

Objective
=========

Userfaults allow the implementation of on-demand paging from userland
and more generally they allow userland to take control of various
memory page faults, something otherwise only the kernel code could do.

For example, userfaults allow a proper and more optimal implementation
of the ``PROT_NONE+SIGSEGV`` trick.

Design
======

Userfaults are delivered and resolved through the ``userfaultfd`` syscall.

The ``userfaultfd`` (aside from registering and unregistering virtual
memory ranges) provides two primary functionalities:

1) ``read/POLLIN`` protocol to notify a userland thread of the faults
   happening

2) various ``UFFDIO_*`` ioctls that can manage the virtual memory regions
   registered in the ``userfaultfd`` and that allow userland to efficiently
   resolve the userfaults it receives via 1) or to manage the virtual
   memory in the background

The real advantage of userfaults compared to regular virtual memory
management with mremap/mprotect is that userfaults never involve
heavyweight structures like vmas in any of their operations (in fact the
``userfaultfd`` runtime load never takes the mmap_lock for writing).

Vmas are not suitable for page- (or hugepage) granular fault tracking
when dealing with virtual address spaces that could span
terabytes. Too many vmas would be needed for that.

The ``userfaultfd``, once opened by invoking the syscall, can also be
passed using unix domain sockets to a manager process, so the same
manager process could handle the userfaults of a multitude of
different processes without them being aware of what is going on
(unless, of course, they later try to use the ``userfaultfd``
themselves on the same region the manager is already tracking, which
is a corner case that would currently return ``-EBUSY``).
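Handing a ``userfaultfd`` to a manager process uses the ordinary
``SCM_RIGHTS`` mechanism of unix domain sockets. The following is a
minimal sketch of that mechanism only; the helper names ``send_fd`` and
``recv_fd`` are hypothetical and not part of any kernel or libc API, and
the descriptor being passed can be any open file descriptor.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer large enough for exactly one file descriptor. */
union fd_cmsg {
	struct cmsghdr align;	/* forces correct alignment of buf */
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one descriptor (e.g. a userfaultfd) over a unix domain socket. */
static int send_fd(int sock, int fd)
{
	char byte = 'U';	/* one byte of ordinary data carries the cmsg */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The monitored process would call ``send_fd()`` once after opening its
``userfaultfd``; the manager then polls the descriptor returned by
``recv_fd()`` alongside those of its other clients.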

API
===

When first opened, the ``userfaultfd`` must be enabled by invoking the
``UFFDIO_API`` ioctl with ``uffdio_api.api`` set to ``UFFD_API`` (or
a later API version), which specifies the ``read/POLLIN`` protocol
userland intends to speak on the ``UFFD``, and with the
``uffdio_api.features`` userland requires. If successful (i.e. if the
requested ``uffdio_api.api`` is also spoken by the running kernel and the
requested features are going to be enabled), the ``UFFDIO_API`` ioctl
returns two 64-bit bitmasks in ``uffdio_api.features`` and
``uffdio_api.ioctls``: respectively, all the available features of the
read(2) protocol and the generic ioctls available.

The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
defines what memory types are supported by the ``userfaultfd`` and what
events, other than page fault notifications, may be generated.

If the kernel supports registering ``userfaultfd`` ranges on hugetlbfs
virtual memory areas, ``UFFD_FEATURE_MISSING_HUGETLBFS`` will be set in
``uffdio_api.features``. Similarly, ``UFFD_FEATURE_MISSING_SHMEM`` will be
set if the kernel supports registering ``userfaultfd`` ranges on shared
memory (covering all shmem APIs, i.e. tmpfs, ``IPCSHM``, ``/dev/zero``,
``MAP_SHARED``, ``memfd_create``, etc).

A userland application that wants to use ``userfaultfd`` with hugetlbfs
or shared memory needs to set the corresponding flag in
``uffdio_api.features`` to enable those features.

If userland wants to receive notifications for events other than
page faults, it has to verify that ``uffdio_api.features`` has the
appropriate ``UFFD_FEATURE_EVENT_*`` bits set. These events are described
in more detail in the `Non-cooperative userfaultfd`_ section below.
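The handshake described above could look roughly like the sketch below.
It assumes the ``<linux/userfaultfd.h>`` uapi header is installed and
that the caller is permitted to use the syscall at all; ``uffd_open``
and ``uffd_has_ioctl`` are hypothetical helper names chosen for this
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open a userfaultfd and run the UFFDIO_API handshake.
 * Returns the descriptor, or -1 with errno set. On success *api holds
 * the features and ioctls bitmasks reported by the kernel.
 */
static int uffd_open(uint64_t wanted_features, struct uffdio_api *api)
{
	int fd, saved;

	assert(api != NULL);
	fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0)
		return -1;

	memset(api, 0, sizeof(*api));
	api->api = UFFD_API;
	api->features = wanted_features;	/* e.g. UFFD_FEATURE_MISSING_SHMEM */
	if (ioctl(fd, UFFDIO_API, api) < 0) {
		saved = errno;		/* keep the ioctl error across close() */
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Test one bit of uffdio_api.ioctls; nr is e.g. _UFFDIO_REGISTER. */
static int uffd_has_ioctl(const struct uffdio_api *api, unsigned int nr)
{
	return (api->ioctls & ((uint64_t)1 << nr)) != 0;
}
```

A caller would typically check ``uffd_has_ioctl(&api, _UFFDIO_REGISTER)``
and the ``UFFD_FEATURE_*`` bits in ``api.features`` before going further.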

Once the ``userfaultfd`` has been enabled, the ``UFFDIO_REGISTER`` ioctl
should be invoked (if present in the returned ``uffdio_api.ioctls``
bitmask) to register a memory range in the ``userfaultfd`` by setting the
``uffdio_register`` structure accordingly. The ``uffdio_register.mode``
bitmask specifies to the kernel which kind of faults to track for
the range (``UFFDIO_REGISTER_MODE_MISSING`` would track missing
pages). The ``UFFDIO_REGISTER`` ioctl returns the
``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
userfaults on the registered range. Not all ioctls will necessarily be
supported for all memory types; this depends on the underlying virtual
memory backend (anonymous memory vs tmpfs vs real file-backed
mappings).
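As an illustration of the registration step, the sketch below registers a
range for missing-page tracking and hands back the per-range ioctl
bitmask. The helper names are hypothetical, and the sketch assumes the
caller passes a page-aligned address and length.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Register [addr, addr + len) for missing-page faults.
 * Returns 0 and stores uffdio_register.ioctls in *ioctls, or -1 with
 * errno set by the ioctl.
 */
static int uffd_register_missing(int uffd, void *addr, size_t len,
				 uint64_t *ioctls)
{
	struct uffdio_register reg;

	assert(ioctls != NULL);
	memset(&reg, 0, sizeof(reg));
	reg.range.start = (uint64_t)(uintptr_t)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
	if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
		return -1;
	*ioctls = reg.ioctls;
	return 0;
}

/* Check whether the registered range supports an ioctl, e.g. _UFFDIO_COPY. */
static int range_supports(uint64_t ioctls, unsigned int nr)
{
	return (ioctls & ((uint64_t)1 << nr)) != 0;
}
```

Checking ``range_supports(ioctls, _UFFDIO_ZEROPAGE)`` before relying on
``UFFDIO_ZEROPAGE`` is one way to cope with backends that do not offer
every ioctl.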

Userland can use the ``uffdio_register.ioctls`` to manage the virtual
address space in the background (to add or potentially also remove
memory from the ``userfaultfd`` registered range). This means a userfault
could trigger just before userland maps the faulting page in the
background.

The primary ioctl to resolve userfaults is ``UFFDIO_COPY``. It
atomically copies a page into the userfault registered range and wakes
up the blocked userfaults
(unless ``uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE`` is set).
The other ioctls work similarly to ``UFFDIO_COPY``. They are atomic in
the sense that nothing can see a half-copied page, because accesses
keep userfaulting until the copy has finished.

Notes:

- If you requested ``UFFDIO_REGISTER_MODE_MISSING`` when registering then
  you must provide some kind of page in your thread after reading from
  the uffd.  You must provide either ``UFFDIO_COPY`` or ``UFFDIO_ZEROPAGE``.
  The normal behavior of the OS automatically providing a zero page on
  an anonymous mapping is not in place.

- None of the page-delivering ioctls default to the range that you
  registered with.  You must fill in all fields for the appropriate
  ioctl struct including the range.

- You get the address of the access that triggered the missing page
  event out of a struct uffd_msg that you read in the thread from the
  uffd.  You can supply as many pages as you want with ``UFFDIO_COPY`` or
  ``UFFDIO_ZEROPAGE``.  Keep in mind that unless you used DONTWAKE,
  the first of any of those ioctls wakes up the faulting thread.

- Be sure to test for all errors including
  (``pollfd[0].revents & POLLERR``).  This can happen, e.g. when the
  ranges supplied were incorrect.
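The notes above can be condensed into one iteration of a fault-handling
loop. This is only a sketch under stated assumptions: the uffd was opened
non-blocking, the range was registered with
``UFFDIO_REGISTER_MODE_MISSING``, and ``src_page`` points to one page of
data to install. The names ``page_base`` and ``uffd_handle_one`` are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a fault address down to the start of its page. */
static uint64_t page_base(uint64_t addr, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/*
 * Wait for one message and resolve it if it is a page fault.
 * Returns 1 if a page was supplied, 0 if there was nothing to do
 * (timeout, fault already resolved, non-fault event), -1 on error.
 */
static int uffd_handle_one(int uffd, const void *src_page,
			   size_t page_size, int timeout_ms)
{
	struct pollfd pfd = { .fd = uffd, .events = POLLIN };
	struct uffd_msg msg;
	struct uffdio_copy copy;
	ssize_t n;
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret <= 0)
		return ret;		/* 0: timeout, -1: poll() failed */
	if (pfd.revents & (POLLERR | POLLNVAL)) {
		errno = EIO;		/* e.g. incorrect ranges were supplied */
		return -1;
	}

	n = read(uffd, &msg, sizeof(msg));
	if (n < 0)
		return errno == EAGAIN ? 0 : -1;
	if ((size_t)n != sizeof(msg)) {
		errno = EIO;
		return -1;
	}
	if (msg.event != UFFD_EVENT_PAGEFAULT)
		return 0;

	/* Every field must be filled in; nothing defaults to the range. */
	memset(&copy, 0, sizeof(copy));
	copy.dst = page_base(msg.arg.pagefault.address, page_size);
	copy.src = (uint64_t)(uintptr_t)src_page;
	copy.len = page_size;
	copy.mode = 0;			/* no DONTWAKE: wake the faulting thread */
	if (ioctl(uffd, UFFDIO_COPY, &copy) < 0 && errno != EEXIST)
		return -1;
	return 1;
}
```

Treating ``EEXIST`` as success reflects the case where another thread has
already mapped the page in the background; a real handler may want to
issue an explicit wake in that situation.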

Write Protect Notifications
---------------------------

This is equivalent to (but faster than) using mprotect and a SIGSEGV
signal handler.

First you need to register a range with ``UFFDIO_REGISTER_MODE_WP``.
Instead of using mprotect(2) you use
``ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect)``
with ``mode = UFFDIO_WRITEPROTECT_MODE_WP``
in the struct passed in.  The range does not default to and does not
have to be identical to the range you registered with.  You can write
protect as many ranges as you like (inside the registered range).
Then, in the thread reading from uffd, the message will have
``msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP`` set. Now you send
``ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect)``
again, this time with ``UFFDIO_WRITEPROTECT_MODE_WP`` cleared in
``uffdio_writeprotect.mode``. This wakes up the thread, which will
continue to run with writes allowed. This lets you do the bookkeeping
about the write in the uffd reading thread before the ioctl.

If you registered with both ``UFFDIO_REGISTER_MODE_MISSING`` and
``UFFDIO_REGISTER_MODE_WP`` then you need to think about the sequence in
which you supply a page and undo write protect.  Note that there is a
difference between writes into a WP area and into a !WP area.  The
former will have ``UFFD_PAGEFAULT_FLAG_WP`` set, the latter
``UFFD_PAGEFAULT_FLAG_WRITE``.  The latter did not fail on protection but
you still need to supply a page when ``UFFDIO_REGISTER_MODE_MISSING`` was
used.
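A minimal sketch of the two halves of this protocol follows: one helper
that sets or clears write protection on a range, and one that tells the
fault kinds apart. It assumes a uapi header recent enough to define
``UFFDIO_WRITEPROTECT``; the names ``uffd_set_wp``, ``classify_fault``
and ``enum fault_kind`` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* protect != 0: write-protect the range; protect == 0: undo and wake. */
static int uffd_set_wp(int uffd, void *addr, size_t len, int protect)
{
	struct uffdio_writeprotect wp;

	memset(&wp, 0, sizeof(wp));
	wp.range.start = (uint64_t)(uintptr_t)addr;
	wp.range.len = len;
	wp.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}

enum fault_kind {
	FAULT_WP,		/* write hit a write-protected page */
	FAULT_MISSING_WRITE,	/* write to a missing page: supply a page */
	FAULT_MISSING_READ,	/* read of a missing page: supply a page */
};

/* Classify a page fault message by its flags. */
static enum fault_kind classify_fault(const struct uffd_msg *msg)
{
	uint64_t flags = msg->arg.pagefault.flags;

	assert(msg->event == UFFD_EVENT_PAGEFAULT);
	if (flags & UFFD_PAGEFAULT_FLAG_WP)	/* checked first on purpose */
		return FAULT_WP;
	if (flags & UFFD_PAGEFAULT_FLAG_WRITE)
		return FAULT_MISSING_WRITE;
	return FAULT_MISSING_READ;
}
```

On ``FAULT_WP`` the handler would do its bookkeeping and then call
``uffd_set_wp(uffd, page, page_size, 0)``; on the other two kinds it
would supply a page with ``UFFDIO_COPY`` or ``UFFDIO_ZEROPAGE``.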

QEMU/KVM
========

QEMU/KVM is using the ``userfaultfd`` syscall to implement postcopy live
migration. Postcopy live migration is one form of memory
externalization consisting of a virtual machine running with part or
all of its memory residing on a different node in the cloud. The
``userfaultfd`` abstraction is generic enough that not a single line of
KVM kernel code had to be modified in order to add postcopy live
migration to QEMU.

Guest async page faults, ``FOLL_NOWAIT`` and all other ``GUP*`` features work
just fine in combination with userfaults. Userfaults trigger async
page faults in the guest scheduler so those guest processes that
aren't waiting for userfaults (e.g. network-bound ones) can keep running
in the guest vcpus.

It is generally beneficial to run one pass of precopy live migration
just before starting postcopy live migration, in order to avoid
generating userfaults for readonly guest regions.

The implementation of postcopy live migration currently uses one
single bidirectional socket but in the future two different sockets
will be used (to reduce the latency of the userfaults to the minimum
possible without having to decrease ``/proc/sys/net/ipv4/tcp_wmem``).

The QEMU in the source node writes all pages that it knows are missing
in the destination node into the socket, and the migration thread of
the QEMU running in the destination node runs ``UFFDIO_COPY|ZEROPAGE``
ioctls on the ``userfaultfd`` in order to map the received pages into the
guest (``UFFDIO_ZEROPAGE`` is used if the source page was a zero page).

A different postcopy thread in the destination node listens with
poll() to the ``userfaultfd`` in parallel. When a ``POLLIN`` event is
generated after a userfault triggers, the postcopy thread calls read() on
the ``userfaultfd`` and receives the fault address (or ``-EAGAIN`` in case the
userfault was already resolved and woken by a ``UFFDIO_COPY|ZEROPAGE`` run
by the parallel QEMU migration thread).

After the QEMU postcopy thread (running in the destination node) gets
the userfault address it writes the information about the missing page
into the socket. The QEMU source node receives the information and
roughly "seeks" to that page address and continues sending all
remaining missing pages from that new page offset. Soon after that
(just the time to flush the tcp_wmem queue through the network) the
migration thread in the QEMU running in the destination node will
receive the page that triggered the userfault and it'll map it as
usual with the ``UFFDIO_COPY|ZEROPAGE`` (without actually knowing if it
was spontaneously sent by the source or if it was an urgent page
requested through a userfault).

By the time the userfaults start, the QEMU in the destination node
doesn't need to keep any per-page state bitmap relative to the live
migration around and a single per-page bitmap has to be maintained in
the QEMU running in the source node to know which pages are still
missing in the destination node. The bitmap in the source node is
checked to find which missing pages to send in round robin and we seek
over it when receiving incoming userfaults. After sending each page
the bitmap is of course updated accordingly. The bitmap is also useful
to avoid sending the same page twice (in case the userfault is read by
the postcopy thread just before ``UFFDIO_COPY|ZEROPAGE`` runs in the
migration thread).
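The source-side bookkeeping described above can be illustrated with a
small, self-contained model. This is not QEMU code: ``struct send_state``,
``next_page`` and ``seek_to_fault`` are hypothetical names, and for
readability the sketch uses one byte per page rather than a packed bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct send_state {
	uint8_t *missing;	/* missing[i] != 0: page i not yet sent */
	size_t npages;
	size_t cursor;		/* where the round-robin scan resumes */
};

/*
 * Pick the next missing page in round-robin order starting at the
 * cursor, mark it as sent and advance the cursor past it.
 * Returns npages when every page has been sent.
 */
static size_t next_page(struct send_state *s)
{
	size_t i;

	for (i = 0; i < s->npages; i++) {
		size_t page = (s->cursor + i) % s->npages;

		if (s->missing[page]) {
			s->missing[page] = 0;
			s->cursor = (page + 1) % s->npages;
			return page;
		}
	}
	return s->npages;
}

/* A userfault for this page arrived from the destination: seek to it. */
static void seek_to_fault(struct send_state *s, size_t page)
{
	assert(page < s->npages);
	s->cursor = page;
}
```

Because ``next_page()`` clears the entry before returning, a late
userfault for a page that was already sent only moves the cursor; the
page itself is not transmitted a second time.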

Non-cooperative userfaultfd
===========================

When the ``userfaultfd`` is monitored by an external manager, the manager
must be able to track changes in the process virtual memory
layout. Userfaultfd can notify the manager about such changes using
the same read(2) protocol as for the page fault notifications. The
manager has to explicitly enable these events by setting the appropriate
bits in ``uffdio_api.features`` passed to the ``UFFDIO_API`` ioctl:

``UFFD_FEATURE_EVENT_FORK``
	enable ``userfaultfd`` hooks for fork(). When this feature is
	enabled, the ``userfaultfd`` context of the parent process is
	duplicated into the newly created process. The manager
	receives ``UFFD_EVENT_FORK`` with the file descriptor of the new
	``userfaultfd`` context in the ``uffd_msg.fork``.

``UFFD_FEATURE_EVENT_REMAP``
	enable notifications about mremap() calls. When the
	non-cooperative process moves a virtual memory area to a
	different location, the manager will receive
	``UFFD_EVENT_REMAP``. The ``uffd_msg.remap`` will contain the old and
	new addresses of the area and its original length.

``UFFD_FEATURE_EVENT_REMOVE``
	enable notifications about madvise(MADV_REMOVE) and
	madvise(MADV_DONTNEED) calls. The event ``UFFD_EVENT_REMOVE`` will
	be generated upon these calls to madvise(). The ``uffd_msg.remove``
	will contain start and end addresses of the removed area.

``UFFD_FEATURE_EVENT_UNMAP``
	enable notifications about memory unmapping. The manager will
	get ``UFFD_EVENT_UNMAP`` with ``uffd_msg.remove`` containing start and
	end addresses of the unmapped area.

Although ``UFFD_FEATURE_EVENT_REMOVE`` and ``UFFD_FEATURE_EVENT_UNMAP``
are quite similar, they differ considerably in the action expected from
the ``userfaultfd`` manager. In the former case, the virtual memory is
removed, but the area is not: the area remains monitored by the
``userfaultfd``, and if a page fault occurs in that area it will be
delivered to the manager. The proper resolution for such a page fault is
to zeromap the faulting address. In the latter case, however, when an
area is unmapped, either explicitly (with the munmap() system call) or
implicitly (e.g. during mremap()), the area is removed and in turn the
``userfaultfd`` context for that area disappears too, so the manager will
not get further userland page faults from the removed area. Still, the
notification is required in order to prevent the manager from using
``UFFDIO_COPY`` on the unmapped area.
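The difference between the two events could be captured in a manager as
sketched below. ``struct vm_change`` and ``uffd_event_range`` are
hypothetical names for this illustration; only the ``uffd_msg`` fields
and event constants come from ``<linux/userfaultfd.h>``.

```c
#include <assert.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vm_change {
	uint64_t start;
	uint64_t end;
	int still_registered;	/* 1: faults in the range keep arriving */
};

/*
 * Decode a REMOVE or UNMAP event into the affected range.
 * Returns 0 on success, -1 if the message is some other event.
 */
static int uffd_event_range(const struct uffd_msg *msg,
			    struct vm_change *out)
{
	assert(msg != NULL && out != NULL);
	switch (msg->event) {
	case UFFD_EVENT_REMOVE:
		/* Pages are gone but the area stays monitored: later
		 * faults there should be resolved with a zero page. */
		out->still_registered = 1;
		break;
	case UFFD_EVENT_UNMAP:
		/* The area itself is gone: stop issuing UFFDIO_COPY. */
		out->still_registered = 0;
		break;
	default:
		return -1;
	}
	/* Both events report their range in uffd_msg.remove. */
	out->start = msg->arg.remove.start;
	out->end = msg->arg.remove.end;
	return 0;
}
```

A manager might keep its own list of tracked ranges and drop or trim an
entry whenever ``still_registered`` comes back as 0.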

Unlike userland page faults, which have to be synchronous and require
an explicit or implicit wakeup, all the events are delivered
asynchronously and the non-cooperative process resumes execution as
soon as the manager executes read(). The ``userfaultfd`` manager should
carefully synchronize calls to ``UFFDIO_COPY`` with the processing of
events. To aid the synchronization, the ``UFFDIO_COPY`` ioctl will
return ``-ENOSPC`` when the monitored process exits at the time of
``UFFDIO_COPY``, and ``-ENOENT`` when the non-cooperative process has
changed its virtual memory layout simultaneously with an outstanding
``UFFDIO_COPY`` operation.
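One way a manager might act on these return values is sketched below.
The enum and function names are hypothetical. The ``ENOSPC`` and
``ENOENT`` cases follow the text above; the ``ESRCH`` case is an
assumption added here because some kernel versions are understood to
report process exit that way, and it should be checked against the
running kernel.

```c
#include <assert.h>
#include <errno.h>

enum copy_result {
	COPY_OK,		/* page installed */
	COPY_LAYOUT_CHANGED,	/* drain pending events, then decide again */
	COPY_PROCESS_GONE,	/* monitored process exited: stop serving it */
	COPY_FAILED,		/* any other error */
};

/* ret and err are the return value and errno of ioctl(UFFDIO_COPY). */
static enum copy_result classify_copy(int ret, int err)
{
	assert(ret <= 0);
	if (ret == 0)
		return COPY_OK;
	switch (err) {
	case ENOENT:
		return COPY_LAYOUT_CHANGED;
	case ENOSPC:
	case ESRCH:	/* assumption: alternative "process exited" code */
		return COPY_PROCESS_GONE;
	default:
		return COPY_FAILED;
	}
}
```

On ``COPY_LAYOUT_CHANGED`` the manager would normally read and process
the queued events first, since the destination range may have been
removed, moved or unmapped in the meantime.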

The current asynchronous model of the event delivery is optimal for
single-threaded non-cooperative ``userfaultfd`` manager implementations. A
synchronous event delivery model can be added later as a new
``userfaultfd`` feature to facilitate multithreading enhancements of the
non-cooperative manager, for example to allow ``UFFDIO_COPY`` ioctls to
run in parallel to the event reception. Single-threaded
implementations should continue to use the current async event
delivery model instead.