==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches.  That means the devices signal I/O completion to the
operating system before data actually has hit the non-volatile storage.  This
behavior obviously speeds up various workloads, but it means the operating
system needs to force data out to the non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device.  These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.  This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts.
In addition, the REQ_PREFLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O.  It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing or
how the Forced Unit Access is implemented.  The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.


Implementation details for bio based block drivers
--------------------------------------------------

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface.  For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set.
For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work.  Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.


Implementation details for request_fn based block drivers
---------------------------------------------------------

For devices that do not support volatile write caches there is no driver
support required; the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.  For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing::

    blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer.
For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using::

    blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn.  If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
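
The block layer behavior described in this section can be summarized as a
small decision function.  The following is a user-space sketch for
illustration only, not kernel code: the ``SKETCH_`` and ``STEP_`` names, the
flag values and the fields of ``struct sketch_queue`` are hypothetical
stand-ins for the real request flags and queue state.  Under those
assumptions it computes which steps a driver would see for one request::

    #include <stdbool.h>

    /* Hypothetical stand-ins for the request flags; values are arbitrary. */
    enum {
        SKETCH_REQ_PREFLUSH = 1 << 0,
        SKETCH_REQ_FUA      = 1 << 1,
    };

    /* Steps that reach the driver, combined as a bit mask. */
    enum {
        STEP_PREFLUSH  = 1 << 0,  /* empty flush issued before the write */
        STEP_DATA      = 1 << 1,  /* the write itself */
        STEP_POSTFLUSH = 1 << 2,  /* empty flush issued after the write */
        STEP_FUA_WRITE = 1 << 3,  /* the write carries FUA to the device */
    };

    /* What the driver declared, mirroring the two boolean arguments above. */
    struct sketch_queue {
        bool write_cache;  /* device has a volatile write cache */
        bool fua;          /* device natively supports FUA */
    };

    unsigned int sketch_flush_steps(const struct sketch_queue *q,
                                    unsigned int flags, bool has_data)
    {
        unsigned int steps = 0;

        if (has_data)
            steps |= STEP_DATA;

        /*
         * No volatile cache: both flags are stripped, and an empty
         * flush completes without any work (result 0).
         */
        if (!q->write_cache)
            return steps;

        if (flags & SKETCH_REQ_PREFLUSH)
            steps |= STEP_PREFLUSH;

        if (has_data && (flags & SKETCH_REQ_FUA)) {
            if (q->fua)
                steps |= STEP_FUA_WRITE;  /* passed through */
            else
                steps |= STEP_POSTFLUSH;  /* emulated by a flush */
        }

        return steps;
    }

In this sketch, a write with both flags set on a device that has a volatile
cache but no native FUA support yields a preflush, the data write and a
postflush, while the same write on a device without a volatile cache yields
only the data write.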