.. SPDX-License-Identifier: GPL-2.0


The Second Extended Filesystem
==============================

ext2 was originally released in January 1993.  Written by Rémy Card,
Theodore Ts'o and Stephen Tweedie, it was a major rewrite of the
Extended Filesystem.  It is currently still (April 2001) the predominant
filesystem in use by Linux.  There are also implementations available
for NetBSD, FreeBSD, the GNU HURD, Windows 95/98/NT, OS/2 and RISC OS.

Options
=======

Most defaults are determined by the filesystem superblock, and can be
set using tune2fs(8). Kernel-determined defaults are indicated by (*).

====================    ===     ================================================
bsddf                   (*)     Makes ``df`` act like BSD.
minixdf                         Makes ``df`` act like Minix.

check=none, nocheck     (*)     Don't do extra checking of bitmaps on mount
                                (check=normal and check=strict options removed)

dax                             Use direct access (no page cache).  See
                                Documentation/filesystems/dax.txt.

debug                           Extra debugging information is sent to the
                                kernel syslog.  Useful for developers.

errors=continue                 Keep going on a filesystem error.
errors=remount-ro               Remount the filesystem read-only on an error.
errors=panic                    Panic and halt the machine if an error occurs.

grpid, bsdgroups                Give objects the same group ID as their parent.
nogrpid, sysvgroups             New objects have the group ID of their creator.

nouid32                         Use 16-bit UIDs and GIDs.

oldalloc                        Enable the old block allocator.  Orlov should
                                have better performance; we'd like to get some
                                feedback if it's the contrary for you.
orlov                   (*)     Use the Orlov block allocator.
                                (See http://lwn.net/Articles/14633/ and
                                http://lwn.net/Articles/14446/.)

resuid=n                        The user ID which may use the reserved blocks.
resgid=n                        The group ID which may use the reserved blocks.

sb=n                            Use alternate superblock at this location.

user_xattr                      Enable "user." POSIX Extended Attributes
                                (requires CONFIG_EXT2_FS_XATTR).
nouser_xattr                    Don't support "user." extended attributes.

acl                             Enable POSIX Access Control Lists support
                                (requires CONFIG_EXT2_FS_POSIX_ACL).
noacl                           Don't support POSIX ACLs.

nobh                            Do not attach buffer_heads to file pagecache.

quota, usrquota                 Enable user disk quota support
                                (requires CONFIG_QUOTA).

grpquota                        Enable group disk quota support
                                (requires CONFIG_QUOTA).
====================    ===     ================================================

The noquota option is silently ignored by ext2.


Specification
=============

ext2 shares many properties with traditional Unix filesystems.  It has
the concepts of blocks, inodes and directories.  It has space in the
specification for Access Control Lists (ACLs), fragments, undeletion and
compression, though these are not yet implemented (some are available as
separate patches).  There is also a versioning mechanism to allow new
features (such as journalling) to be added in a maximally compatible
manner.

Blocks
------

The space in the device or file is split up into blocks.  These are
a fixed size, of 1024, 2048 or 4096 bytes (8192 bytes on Alpha systems),
which is decided when the filesystem is created.  Smaller blocks mean
less wasted space per file, but require slightly more accounting overhead,
and also impose other limits on the size of files and the filesystem.

Block Groups
------------

Blocks are clustered into block groups in order to reduce fragmentation
and minimise the amount of head seeking when reading a large amount
of consecutive data.  Information about each block group is kept in a
descriptor table stored in the block(s) immediately after the superblock.
Two blocks near the start of each group are reserved for the block usage
bitmap and the inode usage bitmap which show which blocks and inodes
are in use.  Since each bitmap is limited to a single block and uses one
bit per block or inode, the maximum number of blocks in a block group is
8 times the block size in bytes.
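
As a rough illustration of the bitmap arithmetic above, the following
Python sketch computes how many blocks a single-block bitmap can describe
and the resulting group size.  The function name is hypothetical and is
not taken from the kernel source.

.. code-block:: python

    def max_blocks_per_group(block_size):
        """Return how many blocks one single-block bitmap can describe."""
        # One bit per block, eight bits per byte of bitmap.
        return 8 * block_size


    for block_size in (1024, 2048, 4096):
        blocks = max_blocks_per_group(block_size)
        group_mib = blocks * block_size // (1024 * 1024)
        print(block_size, "byte blocks:", blocks, "blocks,", group_mib, "MiB")

With 1024-byte blocks this gives 8192 blocks (8 MiB) per group; with
4096-byte blocks it gives 32768 blocks (128 MiB).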

The block(s) following the bitmaps in each block group are designated
as the inode table for that block group and the remainder are the data
blocks.  The block allocation algorithm attempts to allocate data blocks
in the same block group as the inode which contains them.

The Superblock
--------------

The superblock contains all the information about the configuration of
the filing system.  The primary copy of the superblock is stored at an
offset of 1024 bytes from the start of the device, and it is essential
to mounting the filesystem.  Since it is so important, backup copies of
the superblock are stored in block groups throughout the filesystem.
The first version of ext2 (revision 0) stores a copy at the start of
every block group, along with backups of the group descriptor block(s).
Because this can consume a considerable amount of space for large
filesystems, later revisions can optionally reduce the number of backup
copies by only putting backups in specific groups (this is the sparse
superblock feature).  The groups chosen are 0, 1 and powers of 3, 5 and 7.
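
The group selection rule can be sketched in a few lines of Python.  This
is only an illustration of the rule as stated above; the function name
is hypothetical.

.. code-block:: python

    def has_superblock_copy(group):
        """Return True if the group holds a superblock copy (sparse layout)."""
        if group <= 1:
            return True
        for base in (3, 5, 7):
            power = base
            # Multiply up until we reach or pass the group number.
            while power < group:
                power *= base
            if power == group:
                return True
        return False


    copy_groups = [g for g in range(50) if has_superblock_copy(g)]
    print(copy_groups)

For the first 50 groups this selects groups 0, 1, 3, 5, 7, 9, 25, 27
and 49.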

The information in the superblock contains fields such as the total
number of inodes and blocks in the filesystem and how many are free,
how many inodes and blocks are in each block group, when the filesystem
was mounted (and if it was cleanly unmounted), when it was modified,
what version of the filesystem it is (see the Revisions section below)
and which OS created it.

If the filesystem is revision 1 or higher, then there are extra fields,
such as a volume name, a unique identification number, the inode size,
and space for optional filesystem features to store configuration info.

All fields in the superblock (as in all other ext2 structures) are stored
on the disc in little endian format, so a filesystem is portable between
machines without having to know what machine it was created on.
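
A minimal Python sketch of reading a few of these fields is shown below.
The byte offsets and the magic value 0xEF53 are assumptions based on the
usual layout of ``struct ext2_super_block`` and should be checked against
the kernel headers; the function and dictionary key names are made up for
this example.  A synthetic image is used instead of a real device.

.. code-block:: python

    import struct

    EXT2_SUPER_MAGIC = 0xEF53   # assumed magic value
    SUPERBLOCK_OFFSET = 1024    # primary superblock offset from the text


    def parse_superblock(image):
        """Decode a few superblock fields from a bytes-like image."""
        raw = bytes(image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024])
        if len(raw) < 1024:
            raise ValueError("image too small to hold a superblock")
        # "<" forces little-endian decoding regardless of the host CPU.
        inodes_count, blocks_count = struct.unpack_from("<II", raw, 0)
        (log_block_size,) = struct.unpack_from("<I", raw, 24)
        (blocks_per_group,) = struct.unpack_from("<I", raw, 32)
        (inodes_per_group,) = struct.unpack_from("<I", raw, 40)
        (magic,) = struct.unpack_from("<H", raw, 56)
        (rev_level,) = struct.unpack_from("<I", raw, 76)
        if magic != EXT2_SUPER_MAGIC:
            raise ValueError("ext2 magic number not found")
        return {
            "inodes_count": inodes_count,
            "blocks_count": blocks_count,
            "block_size": 1024 << log_block_size,
            "blocks_per_group": blocks_per_group,
            "inodes_per_group": inodes_per_group,
            "rev_level": rev_level,
        }


    # Build a tiny synthetic image rather than reading a real device.
    image = bytearray(2048)
    struct.pack_into("<II", image, SUPERBLOCK_OFFSET + 0, 128, 1024)
    struct.pack_into("<I", image, SUPERBLOCK_OFFSET + 24, 0)
    struct.pack_into("<I", image, SUPERBLOCK_OFFSET + 32, 8192)
    struct.pack_into("<I", image, SUPERBLOCK_OFFSET + 40, 128)
    struct.pack_into("<H", image, SUPERBLOCK_OFFSET + 56, EXT2_SUPER_MAGIC)
    struct.pack_into("<I", image, SUPERBLOCK_OFFSET + 76, 1)
    info = parse_superblock(image)
    print(info)

On a real system the same function could be given the first 2048 bytes
read from a block device or image file opened in binary mode.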

Inodes
------

The inode (index node) is a fundamental concept in the ext2 filesystem.
Each object in the filesystem is represented by an inode.  The inode
structure contains pointers to the filesystem blocks which contain the
data held in the object and all of the metadata about an object except
its name.  The metadata about an object includes the permissions, owner,
group, flags, size, number of blocks used, access time, change time,
modification time, deletion time, number of links, fragments, version
(for NFS) and extended attributes (EAs) and/or Access Control Lists (ACLs).

There are some reserved fields which are currently unused in the inode
structure and several which are overloaded.  One field is reserved for the
directory ACL if the inode is a directory and alternately for the top 32
bits of the file size if the inode is a regular file (allowing file sizes
larger than 2GB).  The translator field is unused under Linux, but is used
by the HURD to reference the inode of a program which will be used to
interpret this object.  Most of the remaining reserved fields have been
used up for both Linux and the HURD for larger owner and group fields.
The HURD also has a larger mode field so it uses another of the remaining
fields to store the extra mode bits.

There are pointers to the first 12 blocks which contain the file's data
in the inode.  There is a pointer to an indirect block (which contains
pointers to the next set of blocks), a pointer to a doubly-indirect
block (which contains pointers to indirect blocks) and a pointer to a
trebly-indirect block (which contains pointers to doubly-indirect blocks).
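
The number of blocks reachable through this pointer tree can be worked
out as in the Python sketch below.  It assumes that each block pointer
is a 32-bit (4-byte) block number; the function name is hypothetical.

.. code-block:: python

    def max_mapped_blocks(block_size):
        """Blocks addressable via direct and indirect pointers."""
        pointers = block_size // 4      # assumed 4 bytes per block pointer
        direct = 12
        single = pointers
        double = pointers ** 2
        triple = pointers ** 3
        return direct + single + double + triple


    for block_size in (1024, 2048, 4096):
        blocks = max_mapped_blocks(block_size)
        print(block_size, blocks, blocks * block_size // 2 ** 30, "GiB")

For 1024-byte blocks this gives 16843020 blocks, or roughly 16GB, which
matches the file size limit listed in the Limitations section.  For
larger block sizes other limits apply before the pointer tree is
exhausted.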

The flags field contains some ext2-specific flags which aren't catered
for by the standard chmod flags.  These flags can be listed with lsattr
and changed with the chattr command, and allow specific filesystem
behaviour on a per-file basis.  There are flags for secure deletion,
undeletable, compression, synchronous updates, immutability, append-only,
dumpable, no-atime, indexed directories, and data-journaling.  Not all
of these are supported yet.

Directories
-----------

A directory is a filesystem object and has an inode just like a file.
It is a specially formatted file containing records which associate
each name with an inode number.  Later revisions of the filesystem also
encode the type of the object (file, directory, symlink, device, fifo,
socket) to avoid the need to check the inode itself for this information
(support for taking advantage of this feature does not yet exist in
Glibc 2.2).

The inode allocation code tries to assign inodes which are in the same
block group as the directory in which they are first created.

The current implementation of ext2 uses a singly-linked list to store
the filenames in the directory; a pending enhancement uses hashing of the
filenames to allow lookup without the need to scan the entire directory.
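
The linked-list layout can be illustrated with a short Python sketch that
walks the records in one directory block.  The record layout used here
(32-bit inode number, 16-bit record length, 8-bit name length, 8-bit file
type, then the name) is an assumption based on the usual on-disk
directory entry with the file type feature; the helper names are
hypothetical.

.. code-block:: python

    import struct

    HEADER = struct.Struct("<IHBB")   # inode, rec_len, name_len, file_type


    def make_entry(inode, name, file_type, rec_len):
        """Build one directory record padded out to rec_len bytes."""
        header = HEADER.pack(inode, rec_len, len(name), file_type)
        return (header + name).ljust(rec_len, b"\0")


    def iter_dir_entries(block):
        """Yield (inode, file_type, name) for each used record in a block."""
        offset = 0
        while offset + HEADER.size <= len(block):
            inode, rec_len, name_len, file_type = HEADER.unpack_from(
                block, offset)
            if rec_len < HEADER.size:
                break                 # corrupt record, stop walking
            if inode != 0:            # inode 0 marks an unused record
                start = offset + HEADER.size
                yield inode, file_type, bytes(block[start:start + name_len])
            offset += rec_len         # rec_len links to the next record


    def lookup(block, wanted):
        """Linear scan, as in the list-based directory format."""
        for inode, _file_type, name in iter_dir_entries(block):
            if name == wanted:
                return inode
        return None


    # A synthetic 1024-byte directory block with three entries.
    block = (make_entry(2, b".", 2, 12) +
             make_entry(2, b"..", 2, 12) +
             make_entry(12, b"hello.txt", 1, 1000))
    entries = list(iter_dir_entries(block))
    print(entries, lookup(block, b"hello.txt"))

Every lookup has to walk the records in order, which is why very large
directories become slow without a hashed index.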

The current implementation never removes empty directory blocks once they
have been allocated to hold more files.

Special files
-------------

Symbolic links are also filesystem objects with inodes.  They deserve
special mention because the data for them is stored within the inode
itself if the symlink is less than 60 bytes long.  It uses the fields
which would normally be used to store the pointers to data blocks.
This is a worthwhile optimisation as it avoids allocating a full
block for the symlink, and most symlinks are less than 60 characters long.
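
The 60-byte figure follows from the size of the block pointer area, as
the Python sketch below shows.  It assumes 15 block pointers (12 direct
plus the three indirect ones) of 4 bytes each; the names are
hypothetical.

.. code-block:: python

    POINTER_SIZE = 4                 # assumed 32-bit block pointers
    POINTER_COUNT = 12 + 3           # direct + single/double/triple indirect
    INLINE_AREA = POINTER_COUNT * POINTER_SIZE


    def symlink_fits_in_inode(target):
        """True if a symlink target can be stored in the pointer area."""
        return len(target) < INLINE_AREA


    print(INLINE_AREA)
    print(symlink_fits_in_inode(b"../lib/libfoo.so.1"))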

Character and block special devices never have data blocks assigned to
them.  Instead, their device number is stored in the inode, again reusing
the fields which would be used to point to the data blocks.

Reserved Space
--------------

In ext2, there is a mechanism for reserving a certain number of blocks
for a particular user (normally the super-user).  This is intended to
allow for the system to continue functioning even if non-privileged users
fill up all the space available to them (this is independent of filesystem
quotas).  It also keeps the filesystem from filling up entirely which
helps combat fragmentation.
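
The effect of the reservation, together with the resuid and resgid mount
options, can be modelled roughly as follows.  This Python sketch is a
simplified model written for this document, not the kernel's actual
check (which also takes capabilities into account); the function and
parameter names are hypothetical.

.. code-block:: python

    def may_allocate(free_blocks, reserved_blocks, uid, groups,
                     resuid=0, resgid=0):
        """Simplified model of who may allocate a block."""
        if free_blocks <= 0:
            return False
        if free_blocks > reserved_blocks:
            return True               # unreserved space is still available
        # Only the reserved pool is left.
        if uid == resuid:
            return True
        # Assumption: a resgid of 0 means no reserved group is configured.
        return resgid != 0 and resgid in groups


    print(may_allocate(100, 50, uid=1000, groups={1000}))
    print(may_allocate(50, 50, uid=1000, groups={1000}))
    print(may_allocate(50, 50, uid=0, groups={0}))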

Filesystem check
----------------

At boot time, most systems run a consistency check (e2fsck) on their
filesystems.  The superblock of the ext2 filesystem contains several
fields which indicate whether fsck should actually run (since checking
the filesystem at boot can take a long time if it is large).  fsck will
run if the filesystem was not cleanly unmounted, if the maximum mount
count has been exceeded or if the maximum time between checks has been
exceeded.
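
The three conditions can be written down as a small Python sketch.  The
parameter names mirror superblock fields but are assumptions, as are the
value of the clean-state flag and the exact comparison operators; the
real e2fsck logic has more cases.

.. code-block:: python

    EXT2_VALID_FS = 0x0001   # assumed "cleanly unmounted" state bit


    def needs_check(state, mnt_count, max_mnt_count,
                    lastcheck, checkinterval, now):
        """Return True if a full check should run, per the rules above."""
        if not state & EXT2_VALID_FS:
            return True                       # not cleanly unmounted
        # Assumption: a limit of zero or less disables the test, and
        # reaching the limit counts as exceeding it.
        if max_mnt_count > 0 and mnt_count >= max_mnt_count:
            return True
        if checkinterval > 0 and now - lastcheck >= checkinterval:
            return True
        return False


    print(needs_check(EXT2_VALID_FS, 5, 20, 0, 0, now=100))
    print(needs_check(0, 1, 20, 0, 0, now=100))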

Feature Compatibility
---------------------

The compatibility feature mechanism used in ext2 is sophisticated.
It safely allows features to be added to the filesystem, without
unnecessarily sacrificing compatibility with older versions of the
filesystem code.  The feature compatibility mechanism is not supported by
the original revision 0 (EXT2_GOOD_OLD_REV) of ext2, but was introduced in
revision 1.  There are three 32-bit fields, one for compatible features
(COMPAT), one for read-only compatible (RO_COMPAT) features and one for
incompatible (INCOMPAT) features.

These feature flags have specific meanings for the kernel as follows:

A COMPAT flag indicates that a feature is present in the filesystem,
but the on-disk format is 100% compatible with older on-disk formats, so
a kernel which didn't know anything about this feature could read/write
the filesystem without any chance of corrupting the filesystem (or even
making it inconsistent).  This is essentially just a flag which says
"this filesystem has a (hidden) feature" that the kernel or e2fsck may
want to be aware of (more on e2fsck and feature flags later).  The ext3
HAS_JOURNAL feature is a COMPAT flag because the ext3 journal is simply
a regular file with data blocks in it so the kernel does not need to
take any special notice of it if it doesn't understand ext3 journaling.

An RO_COMPAT flag indicates that the on-disk format is 100% compatible
with older on-disk formats for reading (i.e. the feature does not change
the visible on-disk format).  However, an old kernel writing to such a
filesystem would/could corrupt the filesystem, so this is prevented.  The
most common such feature, SPARSE_SUPER, is an RO_COMPAT feature because
sparse groups allow file data blocks where superblock/group descriptor
backups used to live, and ext2_free_blocks() refuses to free these blocks,
which would lead to inconsistent bitmaps.  An old kernel would also
get an error if it tried to free a series of blocks which crossed a group
boundary, but this is a legitimate layout in a SPARSE_SUPER filesystem.

An INCOMPAT flag indicates the on-disk format has changed in some
way that makes it unreadable by older kernels, or would otherwise
cause a problem if an old kernel tried to mount it.  FILETYPE is an
INCOMPAT flag because older kernels would think a filename was longer
than 256 characters, which would lead to corrupt directory listings.
The COMPRESSION flag is an obvious INCOMPAT flag - if the kernel
doesn't understand compression, you would just get garbage back from
read() instead of it automatically decompressing your data.  The ext3
RECOVER flag is needed to prevent a kernel which does not understand the
ext3 journal from mounting the filesystem without replaying the journal.

e2fsck needs to be stricter than the kernel in its handling of these
flags.  If it doesn't understand ANY of the COMPAT,
RO_COMPAT, or INCOMPAT flags it will refuse to check the filesystem,
because it has no way of verifying whether a given feature is valid
or not.  Allowing e2fsck to succeed on a filesystem with an unknown
feature would give the user a false sense of security.  Refusing to check
a filesystem with unknown features is a good incentive for the user to
update to the latest e2fsck.  This also means that anyone adding feature
flags to ext2 also needs to update e2fsck to verify these features.
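
The different treatment of the three fields by the kernel and by e2fsck
can be summarised in a short Python sketch.  The flag values below are
assumptions recalled from the ext2 headers and should be verified there;
the function names and return strings are invented for this example.

.. code-block:: python

    # Assumed flag values; check against the kernel headers.
    COMPAT_HAS_JOURNAL = 0x0004
    RO_COMPAT_SPARSE_SUPER = 0x0001
    INCOMPAT_FILETYPE = 0x0002


    def kernel_mount_mode(incompat, ro_compat,
                          known_incompat, known_ro_compat):
        """Decide how a kernel may mount, given the flags it understands."""
        if incompat & ~known_incompat:
            return "refuse"           # unknown INCOMPAT: cannot mount at all
        if ro_compat & ~known_ro_compat:
            return "read-only"        # unknown RO_COMPAT: reading is safe
        return "read-write"           # unknown COMPAT flags do not matter


    def fsck_may_check(compat, ro_compat, incompat,
                       known_compat, known_ro_compat, known_incompat):
        """e2fsck-style rule: any unknown flag in any field is fatal."""
        unknown = ((compat & ~known_compat) |
                   (ro_compat & ~known_ro_compat) |
                   (incompat & ~known_incompat))
        return unknown == 0


    # A kernel that knows FILETYPE but not SPARSE_SUPER.
    print(kernel_mount_mode(INCOMPAT_FILETYPE, RO_COMPAT_SPARSE_SUPER,
                            known_incompat=INCOMPAT_FILETYPE,
                            known_ro_compat=0))

In this example the kernel may still mount the filesystem, but only
read-only, while a checker with the same knowledge would refuse to run.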

Metadata
--------

It is frequently claimed that the ext2 implementation of writing
asynchronous metadata is faster than the ffs synchronous metadata
scheme but less reliable.  Both methods are equally resolvable by their
respective fsck programs.

If you're exceptionally paranoid, there are 3 ways of making metadata
writes synchronous on ext2:

- per-file if you have the program source: use the O_SYNC flag to open()
- per-file if you don't have the source: use "chattr +S" on the file
- per-filesystem: add the "sync" option to mount (or in /etc/fstab)

The first and last are not ext2 specific but do force the metadata to
be written synchronously.  See also Journaling below.
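
For the first of these methods, a minimal Python sketch of opening a file
with O_SYNC is shown below.  It assumes a Unix-like system where
``os.O_SYNC`` is available; the function name is hypothetical.

.. code-block:: python

    import os


    def write_synchronously(path, data):
        """Write data to path using a descriptor opened with O_SYNC."""
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
        fd = os.open(path, flags, 0o644)
        try:
            view = memoryview(data)
            while view:
                # os.write may write fewer bytes than requested.
                written = os.write(fd, view)
                view = view[written:]
        finally:
            os.close(fd)

Each write on such a descriptor returns only after the data has been
handed to the storage device, at the cost of noticeably slower writes.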

Limitations
-----------

There are various limits imposed by the on-disk layout of ext2.  Other
limits are imposed by the current implementation of the kernel code.
Many of the limits are determined at the time the filesystem is first
created, and depend upon the block size chosen.  The ratio of inodes to
data blocks is fixed at filesystem creation time, so the only way to
increase the number of inodes is to increase the size of the filesystem.
No tools currently exist which can change the ratio of inodes to blocks.

Most of these limits could be overcome with slight changes in the on-disk
format and using a compatibility flag to signal the format change (at
the expense of some compatibility).

=====================  =======    =======    =======   ========
Filesystem block size      1kB        2kB        4kB        8kB
=====================  =======    =======    =======   ========
File size limit           16GB      256GB     2048GB     2048GB
Filesystem size limit   2047GB     8192GB    16384GB    32768GB
=====================  =======    =======    =======   ========

There is a 2.4 kernel limit of 2048GB for a single block device, so no
filesystem larger than that can be created at this time.  There is also
an upper limit on the block size imposed by the page size of the kernel,
so 8kB blocks are only allowed on Alpha systems (and other architectures
which support larger pages).

There is an upper limit of 32000 subdirectories in a single directory.

There is a "soft" upper limit of about 10-15k files in a single directory
with the current linear linked-list directory implementation.  This limit
stems from performance problems when creating and deleting (and also
finding) files in such large directories.  Using a hashed directory index
(under development) allows 100k-1M+ files in a single directory without
performance problems (although RAM size becomes an issue at this point).

The (meaningless) absolute upper limit of files in a single directory
(imposed by the file size; the realistic limit is obviously much less)
is over 130 trillion files.  It would be higher except there are not
enough 4-character names to make up unique directory entries, so they
have to be 8 character filenames; even then we are fairly close to
running out of unique filenames.

Journaling
----------

A journaling extension to the ext2 code has been developed by Stephen
Tweedie.  It avoids the risks of metadata corruption and the need to
wait for e2fsck to complete after a crash, without requiring a change
to the on-disk ext2 layout.  In a nutshell, the journal is a regular
file which stores whole metadata (and optionally data) blocks that have
been modified, prior to writing them into the filesystem.  This means
it is possible to add a journal to an existing ext2 filesystem without
the need for data conversion.

When changes are made to the filesystem (e.g. a file is renamed), they
are stored in a transaction in the journal and can either be complete or
incomplete at the time of a crash.  If a transaction is complete at the
time of a crash (or in the normal case where the system does not crash),
then any blocks in that transaction are guaranteed to represent a valid
filesystem state, and are copied into the filesystem.  If a transaction is
incomplete at the time of the crash, then there is no guarantee of
consistency for the blocks in that transaction so they are discarded
(which means any filesystem changes they represent are also lost).
Check Documentation/filesystems/ext4/ if you want to read more about
ext4 and journaling.
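
The copy-or-discard rule for transactions can be shown with a toy model
in Python.  This is not the journal's real on-disk format: the
dictionary-based filesystem, the transaction structure and the function
name are all invented for illustration.

.. code-block:: python

    def replay_journal(filesystem, journal):
        """Apply complete transactions in order; discard incomplete ones.

        filesystem maps block numbers to block contents.  Each journal
        entry is a dict with "blocks" (block number -> new contents) and
        "committed" (True once the transaction is complete).
        """
        for transaction in journal:
            if not transaction["committed"]:
                # Assumption: nothing logged after an incomplete
                # transaction is trusted, so replay stops here.
                break
            filesystem.update(transaction["blocks"])
        return filesystem


    filesystem = {1: b"old inode table", 2: b"old directory"}
    journal = [
        {"blocks": {1: b"new inode table"}, "committed": True},
        {"blocks": {2: b"half-written"}, "committed": False},
    ]
    result = replay_journal(filesystem, journal)
    print(result)

After replay, block 1 holds the committed contents while block 2 keeps
its old contents, because the second transaction never completed.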

References
==========

======================= ===============================================
The kernel source       file:/usr/src/linux/fs/ext2/
e2fsprogs (e2fsck)      http://e2fsprogs.sourceforge.net/
Design & Implementation http://e2fsprogs.sourceforge.net/ext2intro.html
Journaling (ext3)       ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/
Filesystem Resizing     http://ext2resize.sourceforge.net/
Compression [1]_        http://e2compr.sourceforge.net/
======================= ===============================================

Implementations for:

======================= ===========================================================
Windows 95/98/NT/2000   http://www.chrysocome.net/explore2fs
Windows 95 [1]_         http://www.yipton.net/content.html#FSDEXT2
DOS client [1]_         ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
OS/2 [2]_               ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
RISC OS client          http://www.esw-heim.tu-clausthal.de/~marco/smorbrod/IscaFS/
======================= ===========================================================

.. [1] no longer actively developed/supported (as of Apr 2001)
.. [2] no longer actively developed/supported (as of Mar 2009)