.. SPDX-License-Identifier: GPL-2.0

======
NILFS2
======

NILFS2 is a log-structured file system (LFS) supporting continuous
snapshotting.  In addition to versioning the entire file system, it
lets users restore files that were mistakenly overwritten or destroyed
just a few seconds ago.  Since NILFS2 keeps the file system consistent
like a conventional LFS, it recovers quickly after system crashes.

NILFS2 creates checkpoints every few seconds or on a
per-synchronous-write basis (unless there is no change).  Users can
select significant versions among the continuously created checkpoints
and change them into snapshots, which are preserved until they are
changed back to checkpoints.

There is no limit on the number of snapshots until the volume gets
full.  Each snapshot is mountable as a read-only file system
concurrently with its writable mount, which is convenient for online
backup.

The userland tools are included in the nilfs-utils package, which is
available from the download page below.  At least "mkfs.nilfs2",
"mount.nilfs2", "umount.nilfs2", and "nilfs_cleanerd" (the so-called
cleaner or garbage collector) are required.  Details on the tools are
described in the man pages included in the package.

:Project web page:    https://nilfs.sourceforge.io/
:Download page:       https://nilfs.sourceforge.io/en/download.html
:List info:           http://vger.kernel.org/vger-lists.html#linux-nilfs

Caveats
=======

Features which NILFS2 does not support yet:

        - atime
        - extended attributes
        - POSIX ACLs
        - quotas
        - fsck
        - defragmentation

Mount options
=============

NILFS2 supports the following mount options:
(*) == default

======================= =======================================================
barrier(*)              This enables/disables the use of write barriers.  This
nobarrier               requires an IO stack which can support barriers, and
                        if nilfs gets an error on a barrier write, it will
                        disable barriers again with a warning.
errors=continue         Keep going on a filesystem error.
errors=remount-ro(*)    Remount the filesystem read-only on an error.
errors=panic            Panic and halt the machine if an error occurs.
cp=n                    Specify the checkpoint number of the snapshot to be
                        mounted.  Checkpoints and snapshots are listed by the
                        lscp user command.  Only checkpoints marked as snapshot
                        are mountable with this option.  Snapshots are
                        read-only, so a read-only mount option must also be
                        specified.
order=relaxed(*)        Apply relaxed order semantics that allow modified data
                        blocks to be written to disk without making a
                        checkpoint if no metadata update is in progress.  This
                        mode is equivalent to the ordered data mode of the
                        ext3 filesystem, except that updates to data blocks
                        still preserve atomicity.  This improves synchronous
                        write performance for overwriting.
order=strict            Apply strict in-order semantics that preserve the
                        sequence of all file operations, including overwriting
                        of data blocks.  This guarantees that no overtaking of
                        events occurs in the recovered file system after a
                        crash.
norecovery              Disable recovery of the filesystem on mount.
                        This disables every write access on the device for
                        read-only mounts or snapshots.  This option will fail
                        for r/w mounts on an unclean volume.
discard                 This enables/disables the use of discard/TRIM commands.
nodiscard(*)            The discard/TRIM commands are sent to the underlying
                        block device when blocks are freed.  This is useful
                        for SSD devices and sparse/thinly-provisioned LUNs.
======================= =======================================================
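
The defaults and the cp=n constraint in the table above can be modelled
in a few lines.  This is only an illustrative sketch, not the kernel's
option parser: the function name ``parse_nilfs2_options`` and the
dictionary layout are assumptions made for this example.

```python
# Illustrative model of the option table above; NOT the kernel's parser.
DEFAULTS = {
    "barrier": True,          # barrier(*)
    "errors": "remount-ro",   # errors=remount-ro(*)
    "order": "relaxed",       # order=relaxed(*)
    "norecovery": False,
    "discard": False,         # nodiscard(*)
    "cp": None,               # no snapshot selected
}


def parse_nilfs2_options(option_string, read_only=False):
    """Return the effective options for a comma-separated option string."""
    opts = dict(DEFAULTS)
    for token in filter(None, option_string.split(",")):
        key, _, value = token.partition("=")
        if token in ("barrier", "nobarrier"):
            opts["barrier"] = token == "barrier"
        elif token in ("discard", "nodiscard"):
            opts["discard"] = token == "discard"
        elif token == "norecovery":
            opts["norecovery"] = True
        elif key == "errors" and value in ("continue", "remount-ro", "panic"):
            opts["errors"] = value
        elif key == "order" and value in ("relaxed", "strict"):
            opts["order"] = value
        elif key == "cp" and value.isdigit():
            opts["cp"] = int(value)
        else:
            raise ValueError(f"unknown or malformed option: {token!r}")
    # Snapshots are read-only, so cp=n is only valid on a read-only mount.
    if opts["cp"] is not None and not read_only:
        raise ValueError("cp=n requires a read-only mount")
    return opts
```

For example, ``parse_nilfs2_options("cp=42,nobarrier", read_only=True)``
keeps every default except ``barrier`` and ``cp``, while the same string
without ``read_only=True`` is rejected.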

Ioctls
======

Some NILFS2-specific functionality is available to applications
through the ioctl system call interface.  All NILFS2-specific ioctls
are listed in the table below.

Table of NILFS2-specific ioctls:

 ============================== ===============================================
 Ioctl                          Description
 ============================== ===============================================
 NILFS_IOCTL_CHANGE_CPMODE      Change mode of given checkpoint between
                                checkpoint and snapshot state. This ioctl is
                                used in chcp and mkcp utilities.

 NILFS_IOCTL_DELETE_CHECKPOINT  Remove checkpoint from NILFS2 file system.
                                This ioctl is used in rmcp utility.

 NILFS_IOCTL_GET_CPINFO         Return info about requested checkpoints. This
                                ioctl is used in lscp utility and by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_CPSTAT         Return checkpoint statistics. This ioctl is
                                used by lscp, rmcp utilities and by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_SUINFO         Return segment usage info about requested
                                segments. This ioctl is used in lssu,
                                nilfs_resize utilities and by nilfs_cleanerd
                                daemon.

 NILFS_IOCTL_SET_SUINFO         Modify segment usage info of requested
                                segments. This ioctl is used by
                                nilfs_cleanerd daemon to skip unnecessary
                                cleaning operation of segments and reduce
                                performance penalty or wear of flash device
                                due to redundant move of in-use blocks.

 NILFS_IOCTL_GET_SUSTAT         Return segment usage statistics. This ioctl
                                is used in lssu, nilfs_resize utilities and
                                by nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_VINFO          Return information on virtual block addresses.
                                This ioctl is used by nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_BDESCS         Return information about descriptors of disk
                                block numbers. This ioctl is used by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_CLEAN_SEGMENTS     Do garbage collection operation in the
                                environment of requested parameters from
                                userspace. This ioctl is used by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_SYNC               Make a checkpoint. This ioctl is used in
                                mkcp utility.

 NILFS_IOCTL_RESIZE             Resize NILFS2 volume. This ioctl is used
                                by nilfs_resize utility.

 NILFS_IOCTL_SET_ALLOC_RANGE    Define lower limit of segments in bytes and
                                upper limit of segments in bytes. This ioctl
                                is used by nilfs_resize utility.
 ============================== ===============================================

NILFS2 usage
============

To use nilfs2 as a local file system, simply::

 # mkfs -t nilfs2 /dev/block_device
 # mount -t nilfs2 /dev/block_device /dir

This will also invoke the cleaner through the mount helper program
(mount.nilfs2).

Checkpoints and snapshots are managed by the following commands.
Their man pages are included in the nilfs-utils package mentioned above.

  ====     ===========================================================
  lscp     list checkpoints or snapshots.
  mkcp     make a checkpoint or a snapshot.
  chcp     change an existing checkpoint to a snapshot or vice versa.
  rmcp     invalidate specified checkpoint(s).
  ====     ===========================================================

To mount a snapshot::

 # mount -t nilfs2 -r -o cp=<cno> /dev/block_device /snap_dir

where <cno> is the checkpoint number of the snapshot.
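
For scripted online backups, the snapshot mount command above can be
assembled programmatically.  The following is a minimal sketch; the
helper names ``snapshot_mount_argv`` and ``mount_snapshot`` are
hypothetical, and the device and directory in the usage note are
placeholders.

```python
import subprocess


def snapshot_mount_argv(device, cno, target):
    """Build the argument vector for a read-only snapshot mount."""
    # -r is mandatory because snapshots are read-only.
    return ["mount", "-t", "nilfs2", "-r", "-o", f"cp={int(cno)}",
            device, target]


def mount_snapshot(device, cno, target):
    """Run the mount command; requires root and an existing snapshot."""
    subprocess.run(snapshot_mount_argv(device, cno, target), check=True)
```

``mount_snapshot("/dev/block_device", 5, "/snap_dir")`` would then run
the same command as shown above with <cno> replaced by 5.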

To unmount the NILFS2 mount point or snapshot, simply::

 # umount /dir

Then, the cleaner daemon is automatically shut down by the umount
helper program (umount.nilfs2).

Disk format
===========

A nilfs2 volume is equally divided into a number of segments except
for the super block (SB) and segment #0.  A segment is the container
of logs.  Each log is composed of summary information blocks, payload
blocks, and an optional super root block (SR)::

   ______________________________________________________
  | |SB| | Segment | Segment | Segment | ... | Segment | |
  |_|__|_|____0____|____1____|____2____|_____|____N____|_|
  0 +1K +4K       +8M       +16M      +24M  +(8MB x N)
       .             .            (Typical offsets for 4KB-block)
    .                  .
  .______________________.
  | log | log |... | log |
  |__1__|__2__|____|__m__|
        .       .
      .               .
    .                       .
  .______________________________.
  | Summary | Payload blocks  |SR|
  |_blocks__|_________________|__|
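
The typical offsets in the figure can be reproduced with a little
arithmetic.  The sketch below assumes the 4KB-block case shown above,
2048 blocks per segment (which yields 8MB segments), and a segment #0
that begins at the first 4KB block boundary after the super block; the
constant and function names are made up for this example.

```python
BLOCK_SIZE = 4096           # bytes; the 4KB-block case in the figure
BLOCKS_PER_SEGMENT = 2048   # ASSUMPTION: gives the 8MB segments shown
FIRST_DATA_BLOCK = 1        # ASSUMPTION: segment #0 starts at +4K


def segment_start(segnum):
    """Byte offset at which segment `segnum` begins."""
    if segnum < 0:
        raise ValueError("segment number must be non-negative")
    if segnum == 0:
        # Segment #0 is shortened by the area reserved for the super block.
        return FIRST_DATA_BLOCK * BLOCK_SIZE
    return segnum * BLOCKS_PER_SEGMENT * BLOCK_SIZE


def segment_size(segnum):
    """Size in bytes of segment `segnum`."""
    end = (segnum + 1) * BLOCKS_PER_SEGMENT * BLOCK_SIZE
    return end - segment_start(segnum)
```

With these assumptions, ``segment_start(1)`` is 8MB and
``segment_start(3)`` is 24MB, matching the figure, and segment #0 is
4KB smaller than every other segment.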

The payload blocks are organized per file, and each file consists of
data blocks and B-tree node blocks::

    |<---       File-A        --->|<---       File-B        --->|
   _______________________________________________________________
    | Data blocks | B-tree blocks | Data blocks | B-tree blocks | ...
   _|_____________|_______________|_____________|_______________|_


Since only the modified blocks are written in the log, a log may
contain files without data blocks or B-tree node blocks.

The organization of the blocks is recorded in the summary information
blocks, which contain a header structure (nilfs_segment_summary),
per-file structures (nilfs_finfo), and per-block structures
(nilfs_binfo)::

  _________________________________________________________________________
 | Summary | finfo | binfo | ... | binfo | finfo | binfo | ... | binfo |...
 |_blocks__|___A___|_(A,1)_|_____|(A,Na)_|___B___|_(B,1)_|_____|(B,Nb)_|___


The logs include regular files, directory files, symbolic link files,
and several meta data files.  The meta data files are the files used
to maintain file system meta data.  The current version of NILFS2 uses
the following meta data files::

 1) Inode file (ifile)             -- Stores on-disk inodes
 2) Checkpoint file (cpfile)       -- Stores checkpoints
 3) Segment usage file (sufile)    -- Stores allocation state of segments
 4) Data address translation file  -- Maps virtual block numbers to usual
    (DAT)                             block numbers.  This file serves to
                                      make on-disk blocks relocatable.
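
The role of the DAT can be pictured as one level of indirection between
the block numbers stored in file metadata and the blocks on disk.  The
sketch below is a toy in-memory model, not the on-disk DAT format; the
class and method names are invented for illustration.

```python
class Dat:
    """Toy model: maps virtual block numbers to on-disk block numbers."""

    def __init__(self):
        self._map = {}
        self._next_vblocknr = 1

    def allocate(self, blocknr):
        """Register an on-disk block and return its virtual block number."""
        vblocknr = self._next_vblocknr
        self._next_vblocknr += 1
        self._map[vblocknr] = blocknr
        return vblocknr

    def translate(self, vblocknr):
        """Resolve a virtual block number to the current on-disk block."""
        return self._map[vblocknr]

    def relocate(self, vblocknr, new_blocknr):
        """Move a block on disk; references to vblocknr stay valid."""
        if vblocknr not in self._map:
            raise KeyError(vblocknr)
        self._map[vblocknr] = new_blocknr
```

A pointer that stores the value returned by ``allocate()`` keeps working
after ``relocate()``, because only the DAT entry changes.  This is the
sense in which the DAT makes on-disk blocks relocatable.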

The following figure shows a typical organization of the logs::

  _________________________________________________________________________
 | Summary | regular file | file  | ... | ifile | cpfile | sufile | DAT |SR|
 |_blocks__|_or_directory_|_______|_____|_______|________|________|_____|__|


To stride over segment boundaries, this sequence of files may be split
into multiple logs.  A sequence of logs that should be treated as one
logical log is delimited with flags marked in the segment summary.
The recovery code of nilfs2 looks at this boundary information to
ensure atomicity of updates.
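
The effect of the boundary flags can be sketched as follows.  This is a
simplified model of the idea only, not the kernel's recovery code: the
flag names ``LOG_BEGIN`` and ``LOG_END`` and their values are
hypothetical, and every other check that real recovery needs is
omitted.

```python
LOG_BEGIN = 0x1   # hypothetical flag: first log of a logical log
LOG_END = 0x2     # hypothetical flag: last log of a logical log


def complete_logical_logs(logs):
    """Group (name, flags) pairs, in on-disk order, into logical logs.

    Only groups that have both a begin and an end marker are returned,
    so a partially written update is dropped as a whole.
    """
    groups = []
    current = None
    for name, flags in logs:
        if flags & LOG_BEGIN:
            current = []          # start over; discard any unfinished group
        if current is None:
            continue              # log lies outside any logical log
        current.append(name)
        if flags & LOG_END:
            groups.append(current)
            current = None
    return groups
```

Given the sequence ``[("a", 3), ("b", 1), ("c", 0), ("d", 2), ("e", 1),
("f", 0)]``, the function returns ``[["a"], ["b", "c", "d"]]``: the
trailing logs "e" and "f" never reach an end marker, so they are
ignored together.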

The super root block is inserted for every checkpoint.  It includes
three special inodes: the inodes for the DAT, cpfile, and sufile.
Inodes of regular files, directories, symlinks, and other special
files are included in the ifile.  The inode of ifile itself is included
in the corresponding checkpoint entry in the cpfile.  Thus, the
hierarchy among NILFS2 files can be depicted as follows::

  Super block (SB)
       |
       v
  Super root block (the latest cno=xx)
       |-- DAT
       |-- sufile
       `-- cpfile
              |-- ifile (cno=c1)
              |-- ifile (cno=c2) ---- file (ino=i1)
              :        :          |-- file (ino=i2)
              `-- ifile (cno=xx)  |-- file (ino=i3)
                                  :        :
                                  `-- file (ino=yy)
                                    ( regular file, directory, or symlink )
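
The lookup path implied by this hierarchy can be traced with a toy
in-memory model.  Nothing below reflects the on-disk layout; the
dictionary keys, inode numbers, and the ``lookup`` helper are
illustrative assumptions.

```python
# Toy model of the hierarchy above; all names and values are illustrative.
volume = {
    "super_block": {"super_root": "SR"},   # SB points at the latest super root
    "SR": {
        "DAT": {},
        "sufile": {},
        "cpfile": {                        # checkpoint number -> checkpoint entry
            1: {"ifile": {11: "file i1 as of cno=1"}},
            2: {"ifile": {11: "file i1 as of cno=2",
                          12: "file i2 as of cno=2"}},
        },
    },
}


def lookup(volume, cno, ino):
    """Walk SB -> super root -> cpfile -> ifile -> file."""
    super_root = volume[volume["super_block"]["super_root"]]
    ifile = super_root["cpfile"][cno]["ifile"]   # one ifile per checkpoint
    return ifile[ino]
```

Because each checkpoint entry carries its own ifile, ``lookup(volume, 1,
11)`` and ``lookup(volume, 2, 11)`` return different versions of the
same inode, and an inode created later (12 here) is simply absent from
the older checkpoint.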

For details on the format of each file, please see nilfs2_ondisk.h
located in the include/uapi/linux directory.

There are no patents or other intellectual property that we protect
with regard to the design of NILFS2.  It is allowed to replicate the
design in hopes that other operating systems could share (mount, read,
write, etc.) data stored in this format.