Notes on Filesystem Layout
--------------------------

These notes describe what mkcramfs generates.  Kernel requirements are
a bit looser, e.g. it doesn't care if the <file_data> items are
swapped around (though it does care that directory entries (inodes) in
a given directory are contiguous, as this is used by readdir).

All data is currently in host-endian format; neither mkcramfs nor the
kernel ever does swabbing.  (See section `Block Size' below.)

<filesystem>:
	<superblock>
	<directory_structure>
	<data>

<superblock>: struct cramfs_super (see cramfs_fs.h).

<directory_structure>:
	For each file:
		struct cramfs_inode (see cramfs_fs.h).
		Filename.  Not generally null-terminated, but it is
		 null-padded to a multiple of 4 bytes.
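
The filename padding rule can be sketched as follows.  The helper name
is hypothetical and is not taken from mkcramfs or the kernel.

```c
#include <stddef.h>

/* Hypothetical helper: round a filename length up to the next multiple
 * of 4, i.e. the space the name occupies once it is null-padded. */
static size_t cramfs_padded_namelen(size_t len)
{
	return (len + 3) & ~(size_t)3;
}
```

A 5-byte name therefore occupies 8 bytes, while a 4-byte name occupies
exactly 4 bytes and has no terminating null at all.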

The order of inode traversal is described as "width-first" (not to be
confused with breadth-first); i.e. like depth-first but listing all of
a directory's entries before recursing down its subdirectories: the
same order as `ls -AUR' (but without the /^\..*:$/ directory header
lines); put another way, the same order as `find -type d -exec
ls -AU1 {} \;'.
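
A minimal sketch of this ordering over an in-memory tree may help.  The
struct node type and the width_first function are assumptions made for
illustration; they are not the on-disk format or mkcramfs code.

```c
#include <string.h>

/* Hypothetical in-memory tree node; not the on-disk cramfs_inode. */
struct node {
	const char *name;
	const struct node *children;	/* NULL for non-directories */
	size_t nchildren;
};

/* Append names to out in "width-first" order: first every entry of
 * dir, then recurse into each subdirectory in turn. */
static void width_first(const struct node *dir, char *out, size_t outsz)
{
	size_t i;

	for (i = 0; i < dir->nchildren; i++) {
		strncat(out, dir->children[i].name, outsz - strlen(out) - 1);
		strncat(out, " ", outsz - strlen(out) - 1);
	}
	for (i = 0; i < dir->nchildren; i++)
		if (dir->children[i].children != NULL)
			width_first(&dir->children[i], out, outsz);
}
```

For a root holding directories a and b, where a holds directory x and
file y, x holds w, and b holds z, this lists "a b x y w z".  A
breadth-first walk would list z before w instead.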

Beginning in 2.4.7, directory entries are sorted.  This optimization
allows cramfs_lookup to return more quickly when a filename does not
exist, speeds up user-space directory sorts, etc.
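
The early exit that sorting makes possible can be sketched like this.
The function below is an illustration, not cramfs_lookup itself, and it
assumes an array of ordinary C strings sorted in ascending strcmp()
order rather than the null-padded on-disk names.

```c
#include <string.h>

/* Linear lookup over sorted names.  Returns the index of target, or -1
 * if it is absent.  Assumes names[] is sorted in strcmp() order. */
static int sorted_lookup(const char *const *names, int n, const char *target)
{
	int i;

	for (i = 0; i < n; i++) {
		int cmp = strcmp(names[i], target);

		if (cmp == 0)
			return i;
		if (cmp > 0)
			break;	/* every later name is greater still */
	}
	return -1;
}
```

Looking up a missing name such as "dev" in { "bin", "etc", "lib", "usr" }
stops at "etc" instead of scanning the whole directory.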

<data>:
	One <file_data> for each file that's either a symlink or a
	 regular file of non-zero st_size.

<file_data>:
	nblocks * <block_pointer>
	 (where nblocks = (st_size - 1) / blksize + 1)
	nblocks * <block>
	padding to multiple of 4 bytes
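
The nblocks formula is a ceiling division.  As a small sketch (the
function name is hypothetical):

```c
#include <stdint.h>

/* Number of blocks needed for a file of st_size bytes.  Only files with
 * st_size > 0 get a <file_data>, so st_size is assumed to be non-zero. */
static uint32_t cramfs_nblocks(uint32_t st_size, uint32_t blksize)
{
	return (st_size - 1) / blksize + 1;
}
```

With blksize 4096, a 4096-byte file needs one block and a 4097-byte
file needs two.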

The i'th <block_pointer> for a file stores the byte offset of the
*end* of the i'th <block> (i.e. one past the last byte, which is the
same as the start of the (i+1)'th <block> if there is one).  The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.
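
Because each pointer stores an end offset, the start and length of a
block have to be derived from its neighbour.  The sketch below assumes
plain pointers with no flag bits set; the function and parameter names
are hypothetical.

```c
#include <stdint.h>

/* Given the image offset of a file's first <block_pointer> (ptr_base),
 * the pointer array and a block index i, compute where block i starts
 * and how long it is. */
static void cramfs_block_extent(uint32_t ptr_base, const uint32_t *ptrs,
				uint32_t nblocks, uint32_t i,
				uint32_t *start, uint32_t *len)
{
	if (i == 0)
		*start = ptr_base + nblocks * 4;	/* just past the pointers */
	else
		*start = ptrs[i - 1];			/* end of previous block */
	*len = ptrs[i] - *start;
}
```

For three pointers stored at offset 1000 with values 1112, 1300 and
1350, block 0 starts at 1012 and is 100 bytes long.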

When the CRAMFS_FLAG_EXT_BLOCK_POINTERS capability bit is set, each
<block_pointer>'s top bits may contain special flags as follows:

CRAMFS_BLK_FLAG_UNCOMPRESSED (bit 31):
	The block data is not compressed and should be copied verbatim.

CRAMFS_BLK_FLAG_DIRECT_PTR (bit 30):
	The <block_pointer> stores the actual block start offset and not
	its end, shifted right by 2 bits. The block must therefore be
	aligned to a 4-byte boundary. The block size is blksize if
	CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified; otherwise the
	compressed data length is stored in the first 2 bytes of the
	block data. This is used to allow discontiguous data layout
	and specific data block alignments e.g. for XIP applications.
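
Decoding such a pointer might look like the sketch below.  The flag
macros use the bit positions given above; struct blk_ptr, BLK_FLAGS and
decode_blk_ptr are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Bit positions as given in the text above. */
#define CRAMFS_BLK_FLAG_UNCOMPRESSED	(UINT32_C(1) << 31)
#define CRAMFS_BLK_FLAG_DIRECT_PTR	(UINT32_C(1) << 30)
#define BLK_FLAGS	(CRAMFS_BLK_FLAG_UNCOMPRESSED | CRAMFS_BLK_FLAG_DIRECT_PTR)

/* Hypothetical decoded form of one <block_pointer>. */
struct blk_ptr {
	int uncompressed;	/* copy the data verbatim */
	int direct;		/* offset is the block start, not its end */
	uint32_t offset;	/* byte offset within the image */
};

static struct blk_ptr decode_blk_ptr(uint32_t raw)
{
	struct blk_ptr p;

	p.uncompressed = (raw & CRAMFS_BLK_FLAG_UNCOMPRESSED) != 0;
	p.direct = (raw & CRAMFS_BLK_FLAG_DIRECT_PTR) != 0;
	p.offset = raw & ~BLK_FLAGS;
	if (p.direct)
		p.offset <<= 2;	/* stored shifted right by 2 bits */
	return p;
}
```

A raw value of 0xC0000400 decodes to an uncompressed, directly
addressed block starting at byte offset 0x1000.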


The order of <file_data>s is a depth-first descent of the directory
tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
-print'.


<block>: The i'th <block> is the output of zlib's compress function
applied to the i'th blksize-sized chunk of the input data if the
corresponding CRAMFS_BLK_FLAG_UNCOMPRESSED <block_pointer> bit is not
set, otherwise it is the input data directly.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size.  (See <block_pointer> above.)

<block>s are merely byte-aligned, not generally u32-aligned.

When CRAMFS_BLK_FLAG_DIRECT_PTR is specified then the corresponding
<block> may be located anywhere and not necessarily contiguous with
the previous/next blocks. In that case it is minimally u32-aligned.
If CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified then the size is always
blksize except for the last block which is limited by the file length.
If CRAMFS_BLK_FLAG_DIRECT_PTR is set and CRAMFS_BLK_FLAG_UNCOMPRESSED
is not set then the first 2 bytes of the block contain the size of the
remaining block data as this cannot be determined from the placement of
logically adjacent blocks.
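
Reading that 2-byte length prefix could be sketched as follows.  The
function name is hypothetical, and the prefix is assumed to be
host-endian like the rest of the image.

```c
#include <stdint.h>
#include <string.h>

/* For a compressed DIRECT_PTR block, read the 2-byte length prefix and
 * return a pointer to the compressed payload that follows it. */
static const unsigned char *direct_block_payload(const unsigned char *block,
						 uint16_t *len)
{
	memcpy(len, block, sizeof(*len));	/* avoids type-punning */
	return block + sizeof(*len);
}
```

The returned pointer and *len together describe the bytes that would be
handed to the decompressor.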


Holes
-----

This kernel supports cramfs holes (i.e. [efficient representation of]
blocks in uncompressed data consisting entirely of NUL bytes), but by
default mkcramfs doesn't test for & create holes, since cramfs in
kernels up to at least 2.3.39 didn't support holes.  Run mkcramfs
with -z if you want it to create files that can have holes in them.
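
The test that decides whether a block qualifies as a hole amounts to
checking for all-NUL input.  A minimal sketch, with a hypothetical
function name and not taken from mkcramfs:

```c
#include <stddef.h>

/* Return 1 if the block consists entirely of NUL bytes and could
 * therefore be represented as a hole, 0 otherwise. */
static int block_is_hole(const unsigned char *data, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (data[i] != 0)
			return 0;
	return 1;
}
```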


Tools
-----

The cramfs user-space tools, including mkcramfs and cramfsck, are
located at <http://sourceforge.net/projects/cramfs/>.


Future Development
==================

Block Size
----------

(Block size in cramfs refers to the size of input data that is
compressed at a time.  It's intended to be somewhere around
PAGE_SIZE for cramfs_readpage's convenience.)

The superblock ought to indicate the block size that the fs was
written for, since comments in <linux/pagemap.h> indicate that
PAGE_SIZE may grow in future (if I interpret the comment
correctly).

Currently, mkcramfs #define's PAGE_SIZE as 4096 and uses that
for blksize, whereas Linux-2.3.39 uses its own PAGE_SIZE (which can
be as large as 32KB on arm).  This discrepancy is a bug, though it's
not clear which should be changed.

One option is to change mkcramfs to take its PAGE_SIZE from
<asm/page.h>.  Personally I don't like this option, but it does
require the least amount of change: just change `#define
PAGE_SIZE (4096)' to `#include <asm/page.h>'.  The disadvantage
is that the generated cramfs cannot always be shared between different
kernels, not even necessarily kernels of the same architecture if
PAGE_SIZE is subject to change between kernel versions
(currently possible with arm and ia64).

The remaining options try to make cramfs more sharable.

One part of that is addressing endianness.  The two options here are
`always use little-endian' (like ext2fs) or `writer chooses
endianness; kernel adapts at runtime'.  Little-endian wins because of
code simplicity and little CPU overhead even on big-endian machines.

The cost of swabbing is changing the code to use the le32_to_cpu
etc. macros as used by ext2fs.  We don't need to swab the compressed
data, only the superblock, inodes and block pointers.
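
To illustrate what an always-little-endian format would mean for a
reader, here is a portable user-space sketch.  read_le32 is a
hypothetical helper, not a kernel function; it has the same effect on
an on-disk field that le32_to_cpu has in the kernel.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer.  The result is
 * the same on little- and big-endian hosts. */
static uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

On a little-endian host a compiler can typically reduce this to a plain
load, which fits the "little CPU overhead" argument above.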


The other part of making cramfs more sharable is choosing a block
size.  The options are:

  1. Always 4096 bytes.

  2. Writer chooses blocksize; kernel adapts but rejects blocksize >
     PAGE_SIZE.

  3. Writer chooses blocksize; kernel adapts even to blocksize >
     PAGE_SIZE.

It's easy enough to change the kernel to use a smaller value than
PAGE_SIZE: just make cramfs_readpage read multiple blocks.

The cost of option 1 is that kernels with a larger PAGE_SIZE
value don't get as good compression as they could.

The cost of option 2 relative to option 1 is that the code uses
variables instead of #define'd constants.  The gain is that people
with kernels having larger PAGE_SIZE can make use of that if
they don't mind their cramfs being inaccessible to kernels with
smaller PAGE_SIZE values.

Option 3 is easy to implement if we don't mind being CPU-inefficient:
e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
must be no larger than 32KB) and discard what it doesn't need.
Getting readpage to read into all the covered pages is harder.
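
The bookkeeping for the simple form of option 3 is small.  The sketch
below assumes blksize is a multiple of the page size; the function and
parameter names are hypothetical and this is not existing kernel code.

```c
#include <stdint.h>

/* Find which block holds a given page and where that page starts
 * inside the decompressed block.  Assumes blksize >= page_size and
 * that blksize is a multiple of page_size. */
static void page_to_block(uint32_t page_index, uint32_t page_size,
			  uint32_t blksize,
			  uint32_t *block_index, uint32_t *offset_in_block)
{
	uint64_t byte_off = (uint64_t)page_index * page_size;

	*block_index = (uint32_t)(byte_off / blksize);
	*offset_in_block = (uint32_t)(byte_off % blksize);
}
```

readpage would decompress block_index into the MAX_BLKSIZE buffer, copy
page_size bytes starting at offset_in_block, and discard the rest.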

The main advantage of option 3 over options 1 and 2 is better
compression.  The cost is greater complexity.  Probably not worth it,
but I hope someone will disagree.  (If it is implemented, then I'll
re-use that code in e2compr.)


Another cost of 2 and 3 over 1 is making mkcramfs use a different
block size, but that just means adding and parsing a -b option.


Inode Size
----------

Given that cramfs will probably be used for CDs etc. as well as just
silicon ROMs, it might make sense to expand the inode a little from
its current 12 bytes.  Inodes other than the root inode are followed
by filename, so the expansion doesn't even have to be a multiple of 4
bytes.
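
For reference, the current 12 bytes can be pictured as three 32-bit
words of bitfields.  The field names and widths below are an assumption
and should be checked against cramfs_fs.h; bitfield layout is also
compiler-dependent, so this is a sketch rather than a portable
definition.

```c
#include <stdint.h>

/* Sketch of a 12-byte inode.  Field widths are assumed, not copied
 * from cramfs_fs.h; verify them against that header before use. */
struct cramfs_inode_sketch {
	uint32_t mode:16, uid:16;
	uint32_t size:24, gid:8;
	uint32_t namelen:6, offset:26;
};
```

With GCC and Clang on common targets, sizeof(struct cramfs_inode_sketch)
is 12.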