==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches.  That means the devices signal I/O completion to the
operating system before the data has actually reached non-volatile storage.
This behavior speeds up many workloads, but it means the operating system
needs to force data out to non-volatile storage when it performs a data
integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device.  These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.  This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts.  In addition, the REQ_PREFLUSH flag can
be set on an otherwise empty bio structure, which causes only an explicit
cache flush without any dependent I/O.  It is recommended to use the
blkdev_issue_flush() helper for a pure cache flush.
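
The ordering guarantee described above can be illustrated with a small
userspace model.  This is only a sketch and not kernel code: struct toy_disk,
TOY_PREFLUSH and the toy_* functions are hypothetical names invented for the
example, and a negative block number stands in for an empty bio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for this sketch; not the kernel's definition. */
#define TOY_PREFLUSH (1u << 0)
#define TOY_BLOCKS 8

/* A toy disk: 'cache' holds acknowledged but volatile data, 'media' is stable. */
struct toy_disk {
	int cache[TOY_BLOCKS];
	bool dirty[TOY_BLOCKS];
	int media[TOY_BLOCKS];
};

/* Move every dirty cached block to stable media. */
static void toy_flush(struct toy_disk *d)
{
	for (size_t i = 0; i < TOY_BLOCKS; i++) {
		if (d->dirty[i]) {
			d->media[i] = d->cache[i];
			d->dirty[i] = false;
		}
	}
}

/*
 * Submit a write.  With TOY_PREFLUSH the cache is flushed first, so all
 * previously completed writes are stable before this write starts.
 * A negative block number models an empty bio: flush only, no data.
 */
static void toy_submit_write(struct toy_disk *d, int blk, int val, unsigned flags)
{
	assert(blk < TOY_BLOCKS);
	if (flags & TOY_PREFLUSH)
		toy_flush(d);
	if (blk < 0)
		return;
	/* The write itself is acknowledged from the volatile cache. */
	d->cache[blk] = val;
	d->dirty[blk] = true;
}

/* Simulate power loss: whatever was only in the cache is gone. */
static void toy_power_loss(struct toy_disk *d)
{
	for (size_t i = 0; i < TOY_BLOCKS; i++)
		d->dirty[i] = false;
}
```

In this model the flagged write itself still lands in the volatile cache.
The flag only orders it after the flush of earlier writes, which matches the
wording above: the guarantee covers previously completed requests, not the
flagged one.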


Force Unit Access
-----------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.
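
A minimal userspace sketch of this behavior follows.  It is an illustration
under stated assumptions, not kernel code: struct toy_disk, TOY_FUA and the
toy_* functions are hypothetical, and returning from toy_write() stands in
for signaling I/O completion.

```c
#include <assert.h>

/* Hypothetical flag value for this sketch; not the kernel's definition. */
#define TOY_FUA (1u << 1)
#define TOY_BLOCKS 4

struct toy_disk {
	int cache[TOY_BLOCKS];	/* volatile: lost on power failure */
	int media[TOY_BLOCKS];	/* non-volatile */
};

/*
 * Complete a write.  Every write lands in the cache; a TOY_FUA write is
 * also committed to media before the function returns, which stands in
 * for "before completion is signaled".
 */
static void toy_write(struct toy_disk *d, int blk, int val, unsigned flags)
{
	assert(blk >= 0 && blk < TOY_BLOCKS);
	d->cache[blk] = val;
	if (flags & TOY_FUA)
		d->media[blk] = val;
}

/* What a reader would find in a block after a power failure. */
static int toy_read_after_power_loss(const struct toy_disk *d, int blk)
{
	assert(blk >= 0 && blk < TOY_BLOCKS);
	return d->media[blk];
}
```

Note that in this model the flag applies only to the request that carries it.
An earlier write without the flag may still be sitting in the cache after the
flagged write has completed.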


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing or
how Force Unit Access is implemented.  The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.
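
As an illustration of setting both flags on a single bio, consider a write
that must not become stable before the writes that preceded it, and that must
itself be stable when it completes (a commit record is one plausible case).
The userspace model below is a sketch only; all toy_* names and flag values
are hypothetical and do not correspond to kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define TOY_PREFLUSH (1u << 0)
#define TOY_FUA      (1u << 1)
#define TOY_BLOCKS   4

struct toy_disk {
	int cache[TOY_BLOCKS];
	bool dirty[TOY_BLOCKS];	/* true: value exists only in the cache */
	int media[TOY_BLOCKS];
};

/* Honor both flags on one write: flush first, then write through. */
static void toy_write(struct toy_disk *d, int blk, int val, unsigned flags)
{
	assert(blk >= 0 && blk < TOY_BLOCKS);
	if (flags & TOY_PREFLUSH) {
		for (int i = 0; i < TOY_BLOCKS; i++) {
			if (d->dirty[i]) {
				d->media[i] = d->cache[i];
				d->dirty[i] = false;
			}
		}
	}
	d->cache[blk] = val;
	d->dirty[blk] = !(flags & TOY_FUA);
	if (flags & TOY_FUA)
		d->media[blk] = val;
}

/* True if no acknowledged write would be lost by a power failure now. */
static bool toy_all_stable(const struct toy_disk *d)
{
	for (int i = 0; i < TOY_BLOCKS; i++)
		if (d->dirty[i])
			return false;
	return true;
}
```

With two plain writes followed by one write carrying both flags, the model
reports every block as stable once the third write returns, without the
caller knowing how the device achieves that.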


Implementation details for bio based block drivers
--------------------------------------------------

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface.  For remapping drivers the REQ_FUA
bit needs to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set.  For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work.  Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.
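
The rule for remapping drivers can be sketched with a toy two-device stripe.
This is a userspace illustration under assumptions, not a real driver:
struct toy_stripe, toy_remap() and the flag values are hypothetical, and the
counters merely record what would have been sent to each underlying device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define TOY_PREFLUSH (1u << 0)
#define TOY_FUA      (1u << 1)
#define TOY_LEGS     2

/* Per-device counters recording what the remapping layer sent down. */
struct toy_leg {
	int flushes;
	int fua_writes;
	int plain_writes;
};

struct toy_stripe {
	struct toy_leg leg[TOY_LEGS];
};

/*
 * Remap one bio.  has_data == false models an empty flush bio.
 * Returns the leg that received the data, or -1 if there was none.
 */
static int toy_remap(struct toy_stripe *s, int blk, bool has_data, unsigned flags)
{
	int target;

	assert(blk >= 0);
	/* Earlier writes may sit in any leg's cache, so flush them all. */
	if (flags & TOY_PREFLUSH)
		for (int i = 0; i < TOY_LEGS; i++)
			s->leg[i].flushes++;
	if (!has_data)
		return -1;

	target = blk % TOY_LEGS;
	/* FUA concerns only this write, so only its target leg sees it. */
	if (flags & TOY_FUA)
		s->leg[target].fua_writes++;
	else
		s->leg[target].plain_writes++;
	return target;
}
```

The asymmetry is the point of the sketch: a flush has to reach every
underlying device because the remapping driver cannot know where earlier
writes are cached, while FUA follows the data to the one device that
receives it.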


Implementation details for request_fn based block drivers
---------------------------------------------------------

For devices that do not support volatile write caches no driver support is
required: the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.  For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing::

	blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer.  For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using::

	blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn.  If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
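
The rules in this section amount to a small decision table, which the
following userspace sketch encodes.  It is an illustration of the text above
rather than the block layer's actual code: the toy_* names and all flag
values are hypothetical, and dev_wc and dev_fua stand for the two boolean
arguments passed to blk_queue_write_cache().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define TOY_PREFLUSH (1u << 0)
#define TOY_FUA      (1u << 1)

/* Steps the driver may be asked to perform for one request. */
#define TOY_SEQ_PREFLUSH  (1u << 0)	/* empty flush before the data */
#define TOY_SEQ_DATA      (1u << 1)	/* the actual write */
#define TOY_SEQ_POSTFLUSH (1u << 2)	/* empty flush after the data */

/* Which steps reach the driver for a request with the given flags. */
static unsigned toy_flush_sequence(unsigned flags, bool has_data,
				   bool dev_wc, bool dev_fua)
{
	unsigned seq = 0;

	assert(dev_wc || !dev_fua);	/* FUA support implies a write cache */
	if (has_data)
		seq |= TOY_SEQ_DATA;
	if (dev_wc) {
		if (flags & TOY_PREFLUSH)
			seq |= TOY_SEQ_PREFLUSH;
		/* No native FUA: emulate it with a flush after the write. */
		if ((flags & TOY_FUA) && !dev_fua)
			seq |= TOY_SEQ_POSTFLUSH;
	}
	/* Without a volatile cache both flags are stripped. */
	return seq;
}

/* True if the data write reaches the driver with the FUA bit still set. */
static bool toy_fua_passed_down(unsigned flags, bool dev_wc, bool dev_fua)
{
	assert(dev_wc || !dev_fua);
	return (flags & TOY_FUA) && dev_wc && dev_fua;
}
```

A result of zero corresponds to an empty flush request on a device without a
volatile cache, which is completed without entering the driver.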