.. SPDX-License-Identifier: GPL-2.0

========================
ext4 General Information
========================

Ext4 is an advanced level of the ext3 filesystem which incorporates
scalability and reliability enhancements for supporting large filesystems
(64 bit) in keeping with increasing disk capacities and state-of-the-art
feature requirements.

Mailing list:   linux-ext4@vger.kernel.org
Web site:       http://ext4.wiki.kernel.org


Quick usage instructions
========================

Note: More extensive information for getting started with ext4 can be
found at the ext4 wiki site at the URL:
http://ext4.wiki.kernel.org/index.php/Ext4_Howto

  - The latest version of e2fsprogs can be found at:

    https://www.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs/

    or

    http://sourceforge.net/project/showfiles.php?group_id=2406

    or grab the latest git repository from:

    https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git

  - Create a new filesystem using the ext4 filesystem type:

        # mke2fs -t ext4 /dev/hda1

    Or to configure an
    existing ext3 filesystem to support extents:

        # tune2fs -O extents /dev/hda1

    If the filesystem was created with 128 byte inodes, it can be
    converted to use 256 byte inodes for greater efficiency via:

        # tune2fs -I 256 /dev/hda1

  - Mounting:

        # mount -t ext4 /dev/hda1 /wherever

  - When comparing performance with other filesystems, it's always
    important to try multiple workloads; very often a subtle change in a
    workload parameter can completely change the ranking of which
    filesystems do well compared to others.  When comparing versus ext3,
    note that ext4 enables write barriers by default, while ext3 does
    not enable write barriers by default.  So it is useful to
    explicitly specify whether barriers are enabled or not via the
    '-o barrier=[0|1]' mount option for both ext3 and ext4 filesystems
    for a fair comparison.  When tuning ext3 for best benchmark numbers,
    it is often worthwhile to try changing the data journaling mode; '-o
    data=writeback' can be faster for some workloads.  (Note however that
    running mounted with data=writeback can potentially leave stale data
    exposed in recently written files in case of an unclean shutdown,
    which could be a security exposure in some situations.)
    Configuring
    the filesystem with a large journal can also be helpful for
    metadata-intensive workloads.

Features
========

Currently Available
-------------------

* ability to use filesystems > 16TB (e2fsprogs support not available yet)
* extent format reduces metadata overhead (RAM, IO for access, transactions)
* extent format more robust in face of on-disk corruption due to magics,
  internal redundancy in tree
* improved file allocation (multi-block alloc)
* lift 32000 subdirectory limit imposed by i_links_count[1]
* nsec timestamps for mtime, atime, ctime, create time
* inode version field on disk (NFSv4, Lustre)
* reduced e2fsck time via uninit_bg feature
* journal checksumming for robustness, performance
* persistent file preallocation (e.g. for streaming media, databases)
* ability to pack bitmaps and inode tables into larger virtual groups via the
  flex_bg feature
* large file support
* inode allocation using large virtual block groups via flex_bg
* delayed allocation
* large block (up to pagesize) support
* efficient new ordered mode in JBD2 and ext4 (avoid using buffer head to force
  the ordering)
* Case-insensitive file name lookups
* file-based encryption support (fscrypt)
* file-based verity
  support (fsverity)

[1] Filesystems with a block size of 1k may see a limit imposed by the
directory hash tree having a maximum depth of two.

case-insensitive file name lookups
======================================================

The case-insensitive file name lookup feature is supported on a
per-directory basis, allowing the user to mix case-insensitive and
case-sensitive directories in the same filesystem.  It is enabled by
flipping the +F inode attribute of an empty directory.  The
case-insensitive string match operation is only defined when we know how
text is encoded in a byte sequence.  For that reason, in order to enable
case-insensitive directories, the filesystem must have the
casefold feature, which stores the filesystem-wide encoding
model used.  By default, the charset adopted is the latest version of
Unicode (12.1.0, by the time of this writing), encoded in the UTF-8
form.  The comparison algorithm is implemented by normalizing the
strings to the Canonical decomposition form, as defined by Unicode,
followed by a byte per byte comparison.

The case-awareness is name-preserving on the disk, meaning that the file
name provided by userspace is a byte-per-byte match to what is actually
written on the disk.
The Unicode normalization format used by the
kernel is thus an internal representation, exposed neither to
userspace nor to the disk, with the important exception of disk hashes,
used on large case-insensitive directories with the DX feature.  On DX
directories, the hash must be calculated using the casefolded version of
the filename, meaning that the normalization format used actually has an
impact on where the directory entry is stored.

When we change from viewing filenames as opaque byte sequences to seeing
them as encoded strings we need to address what happens when a program
tries to create a file with an invalid name.  The Unicode subsystem
within the kernel leaves the decision of what to do in this case to the
filesystem, which selects its preferred behavior by enabling/disabling
the strict mode.  When Ext4 encounters one of those strings and the
filesystem did not require strict mode, it falls back to considering the
entire string as an opaque byte sequence, which still allows the user to
operate on that file, but the case-insensitive lookups won't work.

Options
=======

When mounting an ext4 filesystem, the following options are accepted:
(*) == default

  ro
        Mount filesystem read only. Note that ext4 will replay the journal (and
        thus write to the partition) even when mounted "read only".
        The mount
        options "ro,noload" can be used to prevent writes to the filesystem.

  journal_checksum
        Enable checksumming of the journal transactions.  This will allow the
        recovery code in e2fsck and the kernel to detect corruption in the
        journal.  It is a compatible change and will be ignored by older
        kernels.

  journal_async_commit
        Commit block can be written to disk without waiting for descriptor
        blocks. If enabled, older kernels cannot mount the device. This will
        enable 'journal_checksum' internally.

  journal_path=path, journal_dev=devnum
        When the external journal device's major/minor numbers have changed,
        these options allow the user to specify the new journal location.  The
        journal device is identified through either its new major/minor numbers
        encoded in devnum, or via a path to the device.

  norecovery, noload
        Don't load the journal on mounting.  Note that if the filesystem was
        not unmounted cleanly, skipping the journal replay will lead to the
        filesystem containing inconsistencies that can lead to any number of
        problems.

  data=journal
        All data are committed into the journal prior to being written into the
        main file system.  Enabling this mode will disable delayed allocation
        and O_DIRECT support.

  data=ordered (*)
        All data are forced directly out to the main file system prior to its
        metadata being committed to the journal.

  data=writeback
        Data ordering is not preserved; data may be written into the main file
        system after its metadata has been committed to the journal.

  commit=nrsec (*)
        This setting limits the maximum age of the running transaction to
        'nrsec' seconds.  The default value is 5 seconds.  This means that if
        you lose your power, you will lose as much as the latest 5 seconds of
        metadata changes (your filesystem will not be damaged though, thanks
        to the journaling). This default value (or any low value) will hurt
        performance, but it's good for data-safety.  Setting it to 0 will have
        the same effect as leaving it at the default (5 seconds).  Setting it
        to very large values will improve performance.  Note that due to
        delayed allocation even older data can be lost on power failure since
        writeback of those data begins only after time set in
        /proc/sys/vm/dirty_expire_centisecs.

  barrier=<0|1(*)>, barrier(*), nobarrier
        This enables/disables the use of write barriers in the jbd code.
        barrier=0 disables, barrier=1 enables.
        This also requires an IO stack
        which can support barriers, and if jbd gets an error on a barrier
        write, it will disable barriers again with a warning.  Write barriers
        enforce proper on-disk ordering of journal commits, making volatile
        disk write caches safe to use, at some performance penalty.  If your
        disks are battery-backed in one way or another, disabling barriers may
        safely improve performance.  The mount options "barrier" and
        "nobarrier" can also be used to enable or disable barriers, for
        consistency with other ext4 mount options.

  inode_readahead_blks=n
        This tuning parameter controls the maximum number of inode table blocks
        that ext4's inode table readahead algorithm will pre-read into the
        buffer cache.  The default value is 32 blocks.

  nouser_xattr
        Disables Extended User Attributes.  See the attr(5) manual page for
        more information about extended attributes.

  noacl
        This option disables POSIX Access Control List support. If ACL support
        is enabled in the kernel configuration (CONFIG_EXT4_FS_POSIX_ACL), ACL
        is enabled by default on mount. See the acl(5) manual page for more
        information about ACLs.

  bsddf (*)
        Make 'df' act like BSD.

  minixdf
        Make 'df' act like Minix.

  debug
        Extra debugging information is sent to syslog.

  abort
        Simulate the effects of calling ext4_abort() for debugging purposes.
        This is normally used while remounting a filesystem which is already
        mounted.

  errors=remount-ro
        Remount the filesystem read-only on an error.

  errors=continue
        Keep going on a filesystem error.

  errors=panic
        Panic and halt the machine if an error occurs.  (These mount options
        override the errors behavior specified in the superblock, which can be
        configured using tune2fs)

  data_err=ignore(*)
        Just print an error message if an error occurs in a file data buffer in
        ordered mode.

  data_err=abort
        Abort the journal if an error occurs in a file data buffer in ordered
        mode.

  grpid | bsdgroups
        New objects have the group ID of their parent.

  nogrpid (*) | sysvgroups
        New objects have the group ID of their creator.

  resgid=n
        The group ID which may use the reserved blocks.

  resuid=n
        The user ID which may use the reserved blocks.

  sb=
        Use alternate superblock at this location.

  quota, noquota, grpquota, usrquota
        These options are ignored by the filesystem. They are used only by
        quota tools to recognize volumes where quota should be turned on. See
        documentation in the quota-tools package for more details
        (http://sourceforge.net/projects/linuxquota).

  jqfmt=<quota type>, usrjquota=<file>, grpjquota=<file>
        These options tell the filesystem details about quota so that quota
        information can be properly updated during journal replay. They replace
        the above quota options. See documentation in the quota-tools package
        for more details (http://sourceforge.net/projects/linuxquota).

  stripe=n
        Number of filesystem blocks that mballoc will try to use for allocation
        size and alignment. For RAID5/6 systems this should be the number of
        data disks * RAID chunk size in file system blocks.

  delalloc (*)
        Defer block allocation until just before ext4 writes out the block(s)
        in question.  This allows ext4 to make better allocation decisions
        more efficiently.

  nodelalloc
        Disable delayed allocation.
        Blocks are allocated when the data is
        copied from userspace to the page cache, either via the write(2) system
        call or when an mmap'ed page which was previously unallocated is
        written for the first time.

  max_batch_time=usec
        Maximum amount of time ext4 should wait for additional filesystem
        operations to be batched together with a synchronous write operation.
        Since a synchronous write operation is going to force a commit and then
        wait for the I/O to complete, waiting for a small amount of time to see
        if any other transactions can piggyback on the synchronous write
        doesn't cost much and can be a huge throughput win.  The algorithm
        used is designed to automatically tune for the speed of the disk, by
        measuring the amount of time (on average) that it takes to finish
        committing a transaction.  Call this time the "commit time".  If the
        time that the transaction has been running is less than the commit
        time, ext4 will try sleeping for the commit time to see if other
        operations will join the transaction.   The commit time is capped by
        the max_batch_time, which defaults to 15000us (15ms).   This
        optimization can be turned off entirely by setting max_batch_time to 0.

  min_batch_time=usec
        This parameter sets the commit time (as described above) to be at least
        min_batch_time.  It defaults to zero microseconds.
        Increasing this
        parameter may improve the throughput of multi-threaded, synchronous
        workloads on very fast disks, at the cost of increasing latency.

  journal_ioprio=prio
        The I/O priority (from 0 to 7, where 0 is the highest priority) which
        should be used for I/O operations submitted by kjournald2 during a
        commit operation.  This defaults to 3, which is a slightly higher
        priority than the default I/O priority.

  auto_da_alloc(*), noauto_da_alloc
        Many broken applications don't use fsync() when replacing existing
        files via patterns such as fd = open("foo.new")/write(fd,..)/close(fd)/
        rename("foo.new", "foo"), or worse yet, fd = open("foo",
        O_TRUNC)/write(fd,..)/close(fd).  If auto_da_alloc is enabled, ext4
        will detect the replace-via-rename and replace-via-truncate patterns
        and force that any delayed allocation blocks are allocated such that at
        the next journal commit, in the default data=ordered mode, the data
        blocks of the new file are forced to disk before the rename() operation
        is committed.  This provides roughly the same level of guarantees as
        ext3, and avoids the "zero-length" problem that can happen when a
        system crashes before the delayed allocation blocks are forced to disk.

  noinit_itable
        Do not initialize any uninitialized inode table blocks in the
        background.
        This feature may be used by installation CDs so that the
        install process can complete as quickly as possible; the inode table
        initialization process would then be deferred until the next time the
        file system is unmounted.

  init_itable=n
        The lazy itable init code will wait n times the number of milliseconds
        it took to zero out the previous block group's inode table.  This
        minimizes the impact on the system performance while the file system's
        inode table is being initialized.

  discard, nodiscard(*)
        Controls whether ext4 should issue discard/TRIM commands to the
        underlying block device when blocks are freed.  This is useful for SSD
        devices and sparse/thinly-provisioned LUNs, but it is off by default
        until sufficient testing has been done.

  nouid32
        Disables 32-bit UIDs and GIDs.  This is for interoperability with
        older kernels which only store and expect 16-bit values.

  block_validity(*), noblock_validity
        These options enable or disable the in-kernel facility for tracking
        filesystem metadata blocks within internal data structures.  This
        allows the multi-block allocator and other routines to notice bugs or
        corrupted allocation bitmaps which cause blocks to be allocated which
        overlap with filesystem metadata blocks.

  dioread_lock, dioread_nolock
        Controls whether or not ext4 should use the DIO read locking. If the
        dioread_nolock option is specified, ext4 will allocate an uninitialized
        extent before the buffer write and convert the extent to initialized
        after IO completes. This approach allows ext4 code to avoid using the
        inode mutex, which improves scalability on high speed storage. However,
        this does not work with data journaling, and the dioread_nolock option
        will be ignored with a kernel warning. Note that the dioread_nolock
        code path is only used for extent-based files.  Because of the
        restrictions this option carries, it is off by default (i.e.
        dioread_lock).

  max_dir_size_kb=n
        This limits the size of directories so that any attempt to expand them
        beyond the specified limit in kilobytes will cause an ENOSPC error.
        This is useful in memory constrained environments, where a very large
        directory can cause severe performance problems or even provoke the Out
        Of Memory killer.  (For example, if there is only 512 MB of memory
        available, a 176 MB directory may seriously cramp the system's style.)

  i_version
        Enable 64-bit inode version support. This option is off by default.

  dax
        Use direct access (no page cache).  See
        Documentation/filesystems/dax.rst.
        Note that this option is
        incompatible with data=journal.

  inlinecrypt
        When possible, encrypt/decrypt the contents of encrypted files using the
        blk-crypto framework rather than filesystem-layer encryption. This
        allows the use of inline encryption hardware. The on-disk format is
        unaffected. For more details, see
        Documentation/block/inline-encryption.rst.

Data Mode
=========
There are 3 different data modes:

* writeback mode

  In data=writeback mode, ext4 does not journal data at all.  This mode provides
  a similar level of journaling as that of XFS, JFS, and ReiserFS in its default
  mode - metadata journaling.  A crash+recovery can cause incorrect data to
  appear in files which were written shortly before the crash.  This mode will
  typically provide the best ext4 performance.

* ordered mode

  In data=ordered mode, ext4 only officially journals metadata, but it logically
  groups metadata information related to data changes with the data blocks into
  a single unit called a transaction.  When it's time to write the new metadata
  out to disk, the associated data blocks are written first.  In general, this
  mode performs slightly slower than writeback but significantly faster than
  journal mode.

* journal mode

  data=journal mode provides full data and metadata journaling.  All new data is
  written to the journal first, and then to its final location.  In the event of
  a crash, the journal can be replayed, bringing both data and metadata into a
  consistent state.  This mode is the slowest except when data needs to be read
  from and written to disk at the same time, where it outperforms all other
  modes.  Enabling this mode will disable delayed allocation and O_DIRECT
  support.

/proc entries
=============

Information about mounted ext4 file systems can be found in
/proc/fs/ext4.  Each mounted filesystem will have a directory in
/proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or
/proc/fs/ext4/dm-0).   The files in each per-device directory are shown
in the table below.

Files in /proc/fs/ext4/<devname>

  mb_groups
        details of multiblock allocator buddy cache of free blocks

/sys entries
============

Information about mounted ext4 file systems can be found in
/sys/fs/ext4.  Each mounted filesystem will have a directory in
/sys/fs/ext4 based on its device name (e.g., /sys/fs/ext4/hdc or
/sys/fs/ext4/dm-0).
The files in each per-device directory are shown
in the table below.

Files in /sys/fs/ext4/<devname>:

(see also Documentation/ABI/testing/sysfs-fs-ext4)

  delayed_allocation_blocks
        This file is read-only and shows the number of blocks that are dirty in
        the page cache, but which do not have their location in the filesystem
        allocated yet.

  inode_goal
        Tuning parameter which (if non-zero) controls the goal inode used by
        the inode allocator in preference to all other allocation heuristics.
        This is intended for debugging use only, and should be 0 on production
        systems.

  inode_readahead_blks
        Tuning parameter which controls the maximum number of inode table
        blocks that ext4's inode table readahead algorithm will pre-read into
        the buffer cache.

  lifetime_write_kbytes
        This file is read-only and shows the number of kilobytes of data that
        have been written to this filesystem since it was created.

  max_writeback_mb_bump
        The maximum number of megabytes the writeback code will try to write
        out before moving on to another inode.

  mb_group_prealloc
        The multiblock allocator will round up allocation requests to a
        multiple of this tuning parameter if the stripe size is not set in the
        ext4 superblock.

  mb_max_to_scan
        The maximum number of extents the multiblock allocator will search to
        find the best extent.

  mb_min_to_scan
        The minimum number of extents the multiblock allocator will search to
        find the best extent.

  mb_order2_req
        Tuning parameter which controls the minimum size for requests (as a
        power of 2) where the buddy cache is used.

  mb_stats
        Controls whether the multiblock allocator should collect statistics,
        which are shown during the unmount. 1 means to collect statistics, 0
        means not to collect statistics.

  mb_stream_req
        Files which have fewer blocks than this tunable parameter will have
        their blocks allocated out of a block group specific preallocation
        pool, so that small files are packed closely together. Each large file
        will have its blocks allocated out of its own unique preallocation
        pool.

  session_write_kbytes
        This file is read-only and shows the number of kilobytes of data that
        have been written to this filesystem since it was mounted.

  reserved_clusters
        This file is read-write and contains the number of reserved clusters
        in the file system, which are used in specific situations to avoid
        costly zeroout, unexpected ENOSPC, or possible data loss. The default
        is 2% or 4096 clusters, whichever is smaller. The value can be changed,
        but it can never exceed the number of clusters in the file system. If
        there is not enough space for the reserved clusters when mounting the
        filesystem, the mount will _not_ fail.

Ioctls
======

Ext4 implements various ioctls which can be used by applications to access
ext4-specific functionality. An incomplete list of these ioctls is shown in the
table below. This list includes truly ext4-specific ioctls (``EXT4_IOC_*``) as
well as ioctls that may have been ext4-specific originally but are now supported
by some other filesystem(s) too (``FS_IOC_*``).

Table of Ext4 ioctls

  FS_IOC_GETFLAGS
        Get additional attributes associated with inode. The ioctl argument is
        an integer bitfield, with bit values described in ext4.h.

  FS_IOC_SETFLAGS
        Set additional attributes associated with inode. The ioctl argument is
        an integer bitfield, with bit values described in ext4.h.

  EXT4_IOC_GETVERSION, EXT4_IOC_GETVERSION_OLD
        Get the inode i_generation number stored for each inode. The
        i_generation number is normally changed only when a new inode is
        created and it is particularly useful for network filesystems. The
        '_OLD' version of this ioctl is an alias for FS_IOC_GETVERSION.

  EXT4_IOC_SETVERSION, EXT4_IOC_SETVERSION_OLD
        Set the inode i_generation number stored for each inode. The '_OLD'
        version of this ioctl is an alias for FS_IOC_SETVERSION.

  EXT4_IOC_GROUP_EXTEND
        This ioctl has the same purpose as the resize mount option. It allows
        resizing the filesystem to the end of the last existing block group;
        further resizing has to be done with resize2fs, either online or
        offline. The argument points to the unsigned long number representing
        the filesystem's new block count.

  EXT4_IOC_MOVE_EXT
        Move the block extents from orig_fd (the one this ioctl is pointing to)
        to the donor_fd (the one specified in move_extent structure passed as
        an argument to this ioctl). Then, exchange inode metadata between
        orig_fd and donor_fd.
        This is especially useful for online
        defragmentation, because the allocator has the opportunity to allocate
        moved blocks better, ideally into one contiguous extent.

  EXT4_IOC_GROUP_ADD
        Add a new group descriptor to an existing or new group descriptor
        block. The new group descriptor is described by the
        ext4_new_group_input structure, which is passed as an argument to this
        ioctl. This is especially useful in conjunction with
        EXT4_IOC_GROUP_EXTEND, which allows online resize of the filesystem to
        the end of the last existing block group. Those two ioctls combined are
        used in userspace online resize tools (e.g. resize2fs).

  EXT4_IOC_MIGRATE
        This ioctl operates on the filesystem itself. It converts (migrates)
        an ext3 indirect block mapped inode to an ext4 extent mapped inode by
        walking through the indirect block mapping of the original inode and
        converting contiguous block ranges into ext4 extents of a temporary
        inode. Then, the inodes are swapped. This ioctl might help when
        migrating from an ext3 to an ext4 filesystem; however, the suggestion
        is to create a fresh ext4 filesystem and copy data from the backup.
        Note that the filesystem has to support extents for this ioctl to work.

  EXT4_IOC_ALLOC_DA_BLKS
        Force all of the delay allocated blocks to be allocated to preserve
        application-expected ext3 behaviour.
        Note that this will also start
        triggering a write of the data blocks, but this behaviour may change in
        the future as it is not necessary and has been done this way only for
        the sake of simplicity.

  EXT4_IOC_RESIZE_FS
        Resize the filesystem to a new size. The number of blocks of the
        resized filesystem is passed in via a 64 bit integer argument. The
        kernel allocates bitmaps and inode table, so the userspace tool just
        passes the new number of blocks.

  EXT4_IOC_SWAP_BOOT
        Swap i_data and associated attributes (like i_blocks, i_size,
        i_flags, ...) from the specified inode with inode EXT4_BOOT_LOADER_INO
        (#5). This is typically used to store a boot loader in a secure part of
        the filesystem, where it can't be changed by a normal user by accident.
        The data blocks of the previous boot loader will be associated with the
        given inode.

References
==========

kernel source:  <file:fs/ext4/>
                <file:fs/jbd2/>

programs:       http://e2fsprogs.sourceforge.net/

useful links:   https://fedoraproject.org/wiki/ext3-devel
                http://www.bullopensource.org/ext4/
                http://ext4.wiki.kernel.org/index.php/Main_Page
                https://fedoraproject.org/wiki/Features/Ext4
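
As an illustration of the "/sys entries" section above, the following is a
minimal sketch of reading a few of the per-device files from
/sys/fs/ext4/<devname>. The file names are taken from that section; the
function name ``read_ext4_tunables``, its ``base`` parameter, and the ``demo``
helper are hypothetical and exist only for this example. It assumes each file
holds a single integer, and silently skips any file that is missing or does
not parse.

```python
import os
import tempfile

# File names as listed in the "/sys entries" section of this document.
EXT4_SYSFS_FILES = (
    "delayed_allocation_blocks",
    "inode_readahead_blks",
    "lifetime_write_kbytes",
    "session_write_kbytes",
    "mb_stats",
    "reserved_clusters",
)


def read_ext4_tunables(devname, base="/sys/fs/ext4", names=EXT4_SYSFS_FILES):
    """Return {name: int} for each listed file that exists and holds an integer.

    Hypothetical helper: `base` is a parameter so the function can be pointed
    at a test directory instead of the real sysfs mount.
    """
    values = {}
    dev_dir = os.path.join(base, devname)
    for name in names:
        path = os.path.join(dev_dir, name)
        try:
            with open(path) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Missing on this kernel, unreadable, or not a plain integer.
            continue
    return values


def demo():
    """Run the reader against a throwaway directory that mimics sysfs."""
    with tempfile.TemporaryDirectory() as base:
        dev_dir = os.path.join(base, "dm-0")
        os.makedirs(dev_dir)
        samples = (("mb_stats", "0\n"), ("session_write_kbytes", "2048\n"))
        for name, value in samples:
            with open(os.path.join(dev_dir, name), "w") as f:
                f.write(value)
        return read_ext4_tunables("dm-0", base=base)


if __name__ == "__main__":
    print(demo())
```

On a real system the call would look like ``read_ext4_tunables("dm-0")``,
where the device name matches a directory under /sys/fs/ext4. Writing to the
read-write files (for example mb_stats) typically requires root and is left
out of this sketch.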