.. SPDX-License-Identifier: GPL-2.0

Bigalloc
--------

At the moment, the default size of a block is 4KiB, which is a commonly
supported page size on most MMU-capable hardware. This is fortunate, as
ext4 code is not prepared to handle the case where the block size
exceeds the page size. However, for a filesystem of mostly huge files,
it is desirable to be able to allocate disk blocks in units of multiple
blocks to reduce both fragmentation and metadata overhead. The
bigalloc feature provides exactly this ability.

The bigalloc feature (EXT4_FEATURE_RO_COMPAT_BIGALLOC) changes ext4 to
use clustered allocation, so that each bit in the ext4 block allocation
bitmap addresses a power-of-two number of blocks. For example, if the
file system is mainly going to be storing large files in the 4-32
megabyte range, it might make sense to set a cluster size of 1 megabyte.
This means that each bit in the block allocation bitmap now addresses
256 4KiB blocks. This shrinks the total size of the block allocation
bitmaps for a 2TiB file system from 64 megabytes to 256 kilobytes. It
also means that a block group addresses 32 gigabytes instead of 128
megabytes, which further reduces the file system's metadata overhead.

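The figures above can be checked with a few lines of arithmetic. The
following is a minimal sketch, not ext4 code: the helper names
``bitmap_bytes`` and ``group_bytes`` are hypothetical, and it assumes
one bitmap bit per allocation unit and a single bitmap block per block
group.

.. code-block:: python

    # Illustrative arithmetic only; these helpers are hypothetical and
    # do not correspond to any ext4 or e2fsprogs interface.
    KIB = 1024
    MIB = 1024 * KIB
    GIB = 1024 * MIB
    TIB = 1024 * GIB

    def bitmap_bytes(fs_bytes, unit_bytes):
        """Total bitmap size when each bit covers unit_bytes of disk."""
        bits = fs_bytes // unit_bytes
        return bits // 8

    def group_bytes(block_bytes, unit_bytes):
        """Space addressed by one block group.

        Assumes the group's bitmap occupies exactly one block, so it
        holds block_bytes * 8 bits.
        """
        return block_bytes * 8 * unit_bytes

    block = 4 * KIB      # block size
    cluster = 1 * MIB    # cluster size chosen in the example above
    fs = 2 * TIB         # file system size

    blocks_per_cluster = cluster // block
    bitmap_per_block = bitmap_bytes(fs, block)      # without bigalloc
    bitmap_per_cluster = bitmap_bytes(fs, cluster)  # with bigalloc
    group_per_block = group_bytes(block, block)     # without bigalloc
    group_per_cluster = group_bytes(block, cluster) # with bigalloc

    print(blocks_per_cluster)
    print(bitmap_per_block // MIB, "MiB ->", bitmap_per_cluster // KIB, "KiB")
    print(group_per_block // MIB, "MiB ->", group_per_cluster // GIB, "GiB")
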
The administrator can set a block cluster size at mkfs time; it is
stored in the s\_log\_cluster\_size field in the superblock. From then
on, the block bitmaps track clusters, not individual blocks. This means
that block groups can be several gigabytes in size (instead of just
128MiB); however, the minimum allocation unit becomes a cluster, not a
block, even for directories. TaoBao had a patchset to extend the “use
units of clusters instead of blocks” approach to the extent tree. It is
not clear what became of those patches; they eventually morphed into
“extent tree v2”, but that code has not landed as of May 2015.

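The effect of the cluster being the minimum allocation unit can be
sketched as follows. This is an illustration under stated assumptions,
not the kernel's implementation: it assumes that the block and cluster
sizes in bytes are ``1024 << s_log_block_size`` and
``1024 << s_log_cluster_size`` respectively, the function names are
hypothetical, and it ignores features such as inline data.

.. code-block:: python

    # Hypothetical helpers for illustration; not an ext4 interface.

    def block_size_bytes(s_log_block_size):
        # Assumption: block size in bytes is 1024 << s_log_block_size.
        return 1024 << s_log_block_size

    def cluster_size_bytes(s_log_cluster_size):
        # Assumption: cluster size in bytes is 1024 << s_log_cluster_size.
        return 1024 << s_log_cluster_size

    def allocated_bytes(data_bytes, unit_bytes):
        """Round data_bytes up to a whole number of allocation units."""
        units = -(-data_bytes // unit_bytes)  # ceiling division
        return units * unit_bytes

    block = block_size_bytes(2)       # 4KiB blocks
    cluster = cluster_size_bytes(10)  # 1MiB clusters
    ratio = cluster // block          # blocks covered by one bitmap bit

    # A directory with a single 4KiB block of entries.
    dir_without_bigalloc = allocated_bytes(4096, block)
    dir_with_bigalloc = allocated_bytes(4096, cluster)

    print(ratio, dir_without_bigalloc, dir_with_bigalloc)

Under these assumptions, a directory that needs only one 4KiB block
still consumes a full 1MiB cluster, which is why large clusters suit
file systems that hold mostly large files.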