.. SPDX-License-Identifier: GPL-2.0

======
NILFS2
======

NILFS2 is a log-structured file system (LFS) supporting continuous
snapshotting.  In addition to versioning the entire file system, users
can even restore files mistakenly overwritten or destroyed just a few
seconds ago.  Since NILFS2 can keep consistency like a conventional
LFS, it achieves quick recovery after system crashes.

NILFS2 creates a number of checkpoints every few seconds or on a
per-synchronous-write basis (unless there is no change).  Users can
select significant versions among continuously created checkpoints, and
can change them into snapshots which will be preserved until they are
changed back to checkpoints.

There is no limit on the number of snapshots until the volume gets
full.  Each snapshot is mountable as a read-only file system
concurrently with its writable mount, and this feature is convenient
for online backup.

The userland tools are included in the nilfs-utils package, which is
available from the following download page.  At least "mkfs.nilfs2",
"mount.nilfs2", "umount.nilfs2", and "nilfs_cleanerd" (the so-called
cleaner or garbage collector) are required.  Details on the tools are
described in the man pages included in the package.

:Project web page: https://nilfs.sourceforge.io/
:Download page:    https://nilfs.sourceforge.io/en/download.html
:List info:        http://vger.kernel.org/vger-lists.html#linux-nilfs

Caveats
=======

Features which NILFS2 does not support yet:

        - atime
        - extended attributes
        - POSIX ACLs
        - quotas
        - fsck
        - defragmentation

Mount options
=============

NILFS2 supports the following mount options:
(*) == default

======================= =======================================================
barrier(*)              This enables/disables the use of write barriers.  This
nobarrier               requires an IO stack which can support barriers, and
                        if nilfs gets an error on a barrier write, it will
                        disable barriers again with a warning.
errors=continue         Keep going on a filesystem error.
errors=remount-ro(*)    Remount the filesystem read-only on an error.
errors=panic            Panic and halt the machine if an error occurs.
cp=n                    Specify the checkpoint number of the snapshot to be
                        mounted.  Checkpoints and snapshots are listed by the
                        lscp user command.  Only the checkpoints marked as
                        snapshots are mountable with this option.  A snapshot
                        is read-only, so a read-only mount option must be
                        specified together with it.
order=relaxed(*)        Apply relaxed order semantics that allow modified data
                        blocks to be written to disk without making a
                        checkpoint if no metadata update is in progress.  This
                        mode is equivalent to the ordered data mode of the
                        ext3 filesystem except that the updates on data blocks
                        still conserve atomicity.  This will improve
                        synchronous write performance for overwriting.
order=strict            Apply strict in-order semantics that preserve the
                        sequence of all file operations including overwriting
                        of data blocks.  That means it is guaranteed that no
                        overtaking of events occurs in the recovered file
                        system after a crash.
norecovery              Disable recovery of the filesystem on mount.
                        This disables every write access on the device for
                        read-only mounts or snapshots.  This option will fail
                        for r/w mounts on an unclean volume.
discard                 This enables/disables the use of discard/TRIM commands.
nodiscard(*)            The discard/TRIM commands are sent to the underlying
                        block device when blocks are freed.  This is useful
                        for SSD devices and sparse/thinly-provisioned LUNs.
======================= =======================================================

Ioctls
======

There is some NILFS2-specific functionality which can be accessed by
applications through the system call interfaces.  The list of all
NILFS2-specific ioctls is shown in the table below.

Table of NILFS2-specific ioctls:

 ============================== ===============================================
 Ioctl                          Description
 ============================== ===============================================
 NILFS_IOCTL_CHANGE_CPMODE      Change mode of given checkpoint between
                                checkpoint and snapshot state. This ioctl is
                                used in chcp and mkcp utilities.

 NILFS_IOCTL_DELETE_CHECKPOINT  Remove checkpoint from NILFS2 file system.
                                This ioctl is used in rmcp utility.

 NILFS_IOCTL_GET_CPINFO         Return info about requested checkpoints. This
                                ioctl is used in lscp utility and by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_CPSTAT         Return checkpoints statistics. This ioctl is
                                used by lscp, rmcp utilities and by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_SUINFO         Return segment usage info about requested
                                segments. This ioctl is used in lssu,
                                nilfs_resize utilities and by nilfs_cleanerd
                                daemon.

 NILFS_IOCTL_SET_SUINFO         Modify segment usage info of requested
                                segments. This ioctl is used by
                                nilfs_cleanerd daemon to skip unnecessary
                                cleaning operation of segments and reduce
                                performance penalty or wear of flash device
                                due to redundant move of in-use blocks.

 NILFS_IOCTL_GET_SUSTAT         Return segment usage statistics. This ioctl
                                is used in lssu, nilfs_resize utilities and
                                by nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_VINFO          Return information on virtual block addresses.
                                This ioctl is used by nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_BDESCS         Return information about descriptors of disk
                                block numbers. This ioctl is used by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_CLEAN_SEGMENTS     Do garbage collection operation in the
                                environment of requested parameters from
                                userspace. This ioctl is used by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_SYNC               Make a checkpoint. This ioctl is used in
                                mkcp utility.

 NILFS_IOCTL_RESIZE             Resize NILFS2 volume. This ioctl is used
                                by nilfs_resize utility.

 NILFS_IOCTL_SET_ALLOC_RANGE    Define lower limit of segments in bytes and
                                upper limit of segments in bytes. This ioctl
                                is used by nilfs_resize utility.
 ============================== ===============================================

NILFS2 usage
============

To use nilfs2 as a local file system, simply::

 # mkfs -t nilfs2 /dev/block_device
 # mount -t nilfs2 /dev/block_device /dir

This will also invoke the cleaner through the mount helper program
(mount.nilfs2).

Checkpoints and snapshots are managed by the following commands.
Their manpages are included in the nilfs-utils package above.

  ====   ===========================================================
  lscp   list checkpoints or snapshots.
  mkcp   make a checkpoint or a snapshot.
  chcp   change an existing checkpoint to a snapshot or vice versa.
  rmcp   invalidate specified checkpoint(s).
  ====   ===========================================================

To mount a snapshot::

 # mount -t nilfs2 -r -o cp=<cno> /dev/block_device /snap_dir

where <cno> is the checkpoint number of the snapshot.
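
When snapshot mounts are scripted (for example, for online backup), the
command line for mounting a snapshot can be assembled programmatically.
The following is a minimal Python sketch, not part of nilfs-utils: the
function name ``snapshot_mount_argv`` is hypothetical, and the check that
checkpoint numbers are positive integers is an assumption of this sketch.
It only builds the argument vector and does not run mount itself.

```python
import shlex


def snapshot_mount_argv(device, mount_point, cno):
    """Build the argv for mounting snapshot `cno` of `device` read-only.

    Hypothetical helper for illustration; it does not execute anything.
    """
    cno = int(cno)
    # Assumption of this sketch: checkpoint numbers are positive integers.
    if cno < 1:
        raise ValueError("checkpoint number must be a positive integer")
    # "-r" is mandatory here: a snapshot is read-only, so the cp= option
    # has to be combined with a read-only mount.
    return ["mount", "-t", "nilfs2", "-r", "-o", f"cp={cno}",
            device, mount_point]


if __name__ == "__main__":
    argv = snapshot_mount_argv("/dev/block_device", "/snap_dir", 42)
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(argv))
```

Passing the resulting list to ``subprocess.run(argv, check=True)`` as root
would perform the mount; keeping the arguments as a list avoids shell
quoting problems with unusual device or directory names.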

To unmount the NILFS2 mount point or snapshot, simply::

 # umount /dir

Then, the cleaner daemon is automatically shut down by the umount
helper program (umount.nilfs2).

Disk format
===========

A nilfs2 volume is equally divided into a number of segments except
for the super block (SB) and segment #0.  A segment is the container
of logs.  Each log is composed of summary information blocks, payload
blocks, and an optional super root block (SR)::

   ______________________________________________________
  | |SB| | Segment | Segment | Segment | ... | Segment | |
  |_|__|_|____0____|____1____|____2____|_____|____N____|_|
  0 +1K +4K       +8M       +16M      +24M  +(8MB x N)
       .             .            (Typical offsets for 4KB-block)
    .                  .
  .______________________.
  | log | log |... | log |
  |__1__|__2__|____|__m__|
        .       .
        .               .
        .                       .
        .______________________________.
        | Summary | Payload blocks  |SR|
        |_blocks__|_________________|__|

The payload blocks are organized per file, and each file consists of
data blocks and B-tree node blocks::

    |<---       File-A        --->|<---       File-B        --->|
   _______________________________________________________________
    | Data blocks | B-tree blocks | Data blocks | B-tree blocks | ...
   _|_____________|_______________|_____________|_______________|_


Since only the modified blocks are written in the log, it may have
files without data blocks or B-tree node blocks.

The organization of the blocks is recorded in the summary information
blocks, which contain a header structure (nilfs_segment_summary),
per-file structures (nilfs_finfo), and per-block structures
(nilfs_binfo)::

  _________________________________________________________________________
 | Summary | finfo | binfo | ... | binfo | finfo | binfo | ... | binfo |...
 |_blocks__|___A___|_(A,1)_|_____|(A,Na)_|___B___|_(B,1)_|_____|(B,Nb)_|___


The logs include regular files, directory files, symbolic link files
and several metadata files.  The metadata files are the files used
to maintain file system metadata.  The current version of NILFS2 uses
the following metadata files::

 1) Inode file (ifile)             -- Stores on-disk inodes
 2) Checkpoint file (cpfile)       -- Stores checkpoints
 3) Segment usage file (sufile)    -- Stores allocation state of segments
 4) Data address translation file  -- Maps virtual block numbers to usual
    (DAT)                             block numbers.  This file serves to
                                      make on-disk blocks relocatable.

The following figure shows a typical organization of the logs::

  _________________________________________________________________________
 | Summary | regular file | file  | ... | ifile | cpfile | sufile | DAT |SR|
 |_blocks__|_or_directory_|_______|_____|_______|________|________|_____|__|


To stride over segment boundaries, this sequence of files may be split
into multiple logs.  The sequence of logs that should be treated as
logically one log is delimited with flags marked in the segment
summary.  The recovery code of nilfs2 looks at this boundary
information to ensure atomicity of updates.

The super root block is inserted for every checkpoint.  It includes
three special inodes: the inodes for the DAT, cpfile, and sufile.
Inodes of regular files, directories, symlinks and other special files
are included in the ifile.  The inode of the ifile itself is included
in the corresponding checkpoint entry in the cpfile.  Thus, the
hierarchy among NILFS2 files can be depicted as follows::

  Super block (SB)
       |
       v
  Super root block (the latest cno=xx)
       |-- DAT
       |-- sufile
       `-- cpfile
              |-- ifile (cno=c1)
              |-- ifile (cno=c2) ---- file (ino=i1)
              :        :          |-- file (ino=i2)
              `-- ifile (cno=xx)  |-- file (ino=i3)
                                  :        :
                                  `-- file (ino=yy)
                                    ( regular file, directory, or symlink )

For details on the format of each file, please see nilfs2_ondisk.h
located in the include/uapi/linux directory.

There are no patents or other intellectual property that we protect
with regard to the design of NILFS2.  It is allowed to replicate the
design in hopes that other operating systems could share (mount, read,
write, etc.) data stored in this format.
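
As a worked example of the volume layout figure in the disk format
section, the typical segment offsets can be computed as below.  This is
a minimal Python sketch that assumes the values shown in that figure
(4KB blocks, 8MB segments, segment #0 starting at +4K); actual volumes
may be formatted with other parameters, and the function names are
hypothetical rather than taken from the kernel or nilfs-utils.

```python
BLOCK_SIZE = 4 * 1024            # 4KB blocks, as assumed in the figure
SEGMENT_SIZE = 8 * 1024 * 1024   # 8MB segments, as assumed in the figure
SEGMENT0_START = 4 * 1024        # segment #0 begins after the SB area (+4K)


def segment_start(segnum):
    """Return the byte offset at which segment `segnum` starts."""
    if segnum < 0:
        raise ValueError("segment number must be non-negative")
    # Segment #0 is the exception: it is shortened by the super block area.
    if segnum == 0:
        return SEGMENT0_START
    return segnum * SEGMENT_SIZE


def segment_blocks(segnum):
    """Return how many blocks fit in segment `segnum` under these assumptions."""
    end = (segnum + 1) * SEGMENT_SIZE
    return (end - segment_start(segnum)) // BLOCK_SIZE


if __name__ == "__main__":
    for n in range(4):
        print(n, segment_start(n), segment_blocks(n))
```

Under these assumptions every segment holds 2048 blocks except segment
#0, which holds 2047 because its first 4KB are reserved for the super
block area.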