LZ4 Frame Format Description
============================

### Notices

Copyright (c) 2013-2020 Yann Collet

Permission is granted to copy and distribute this document
for any purpose and without charge,
including translations into other languages
and incorporation into compilations,
provided that the copyright notice and this notice are preserved,
and that any substantive changes or deletions from the original
are clearly marked.
Distribution of this document is unlimited.

### Version

1.6.2 (12/08/2020)


Introduction
------------

The purpose of this document is to define a lossless compressed data format,
that is independent of CPU type, operating system,
file system and character set, suitable for
File compression, Pipe and streaming compression
using the [LZ4 algorithm](http://www.lz4.org).

The data can be produced or consumed,
even for an arbitrarily long sequentially presented input data stream,
using only an a priori bounded amount of intermediate storage,
and hence can be used in data communications.
The format uses the LZ4 compression method,
and the optional [xxHash-32 checksum method](https://github.com/Cyan4973/xxHash),
for detection of data corruption.

The data format defined by this specification
does not attempt to allow random access to compressed data.

This specification is intended for use by implementers of software
to compress data into LZ4 format and/or decompress data from LZ4 format.
The text of the specification assumes a basic background in programming
at the level of bits and other primitive data representations.

Unless otherwise indicated below,
a compliant compressor must produce data sets
that conform to the specifications presented here.
It doesn't need to support all options though.

A compliant decompressor must be able to decompress
at least one working set of parameters
that conforms to the specifications presented here.
It may also ignore checksums.
Whenever it does not support a specific parameter within the compressed stream,
it must produce a non-ambiguous error code
and associated error message explaining which parameter is unsupported.


General Structure of LZ4 Frame format
-------------------------------------

| MagicNb | F. Descriptor | Block | (...) | EndMark | C. Checksum |
|:-------:|:-------------:| ----- | ----- | ------- | ----------- |
| 4 bytes |  3-15 bytes   |       |       | 4 bytes | 0-4 bytes   |

__Magic Number__

4 Bytes, Little endian format.
Value : 0x184D2204

__Frame Descriptor__

3 to 15 Bytes, to be detailed in its own paragraph,
as it is the most important part of the spec.

The combined _Magic_Number_ and _Frame_Descriptor_ fields are sometimes
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.

__Data Blocks__

To be detailed in its own paragraph.
That’s where compressed data is stored.

__EndMark__

The flow of blocks ends when the last data block is followed by
the 32-bit value `0x00000000`.

__Content Checksum__

_Content_Checksum_ verifies that the full content has been decoded correctly.
The content checksum is the result of [xxHash-32 algorithm]
digesting the original (decoded) data as input, and a seed of zero.
Content checksum is only present when its associated flag
is set in the frame descriptor.
Content Checksum validates the result,
that all blocks were fully transmitted in the correct order and without error,
and also that the encoding/decoding process itself generated no distortion.
Its usage is recommended.

The combined _EndMark_ and _Content_Checksum_ fields might sometimes be
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.

__Frame Concatenation__

In some circumstances, it may be preferable to append multiple frames,
for example in order to add new data to an existing compressed file
without re-framing it.

In such a case, each frame has its own set of descriptor flags.
Each frame is considered independent.
The only relation between frames is their sequential order.

The ability to decode multiple concatenated frames
within a single stream or file
is left outside of this specification.
As an example, the reference lz4 command line utility behavior is
to decode all concatenated frames in their sequential order.
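
To illustrate the little-endian convention of the Magic Number described above, the following minimal sketch reads the first 4 bytes of a buffer and compares them with `0x184D2204`. The function name `is_lz4_frame` and the constant name are hypothetical, chosen for this example only; they are not part of any LZ4 API.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204  # value given in this specification


def is_lz4_frame(data: bytes) -> bool:
    """Return True if `data` starts with the LZ4 frame Magic Number.

    Hypothetical helper: it only inspects the first 4 bytes and does not
    validate the Frame Descriptor that follows.
    """
    if len(data) < 4:
        return False
    # "<I" means: unsigned 32-bit integer, little-endian byte order.
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == LZ4_FRAME_MAGIC
```

Because the value is stored in little-endian order, a frame begins with the byte sequence `04 22 4D 18`.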


Frame Descriptor
----------------

| FLG     | BD      | (Content Size) | (Dictionary ID) | HC      |
| ------- | ------- |:--------------:|:---------------:| ------- |
| 1 byte  | 1 byte  |  0 - 8 bytes   |   0 - 4 bytes   | 1 byte  |

The descriptor uses a minimum of 3 bytes,
and up to 15 bytes depending on optional parameters.

__FLG byte__

|  BitNb  |  7-6  |   5   |    4     |  3   |    2     |    1     |   0  |
| ------- |-------|-------|----------|------|----------|----------|------|
|FieldName|Version|B.Indep|B.Checksum|C.Size|C.Checksum|*Reserved*|DictID|


__BD byte__

|  BitNb  |     7    |     6-5-4     |  3-2-1-0 |
| ------- | -------- | ------------- | -------- |
|FieldName|*Reserved*| Block MaxSize |*Reserved*|

In the tables, bit 7 is highest bit, while bit 0 is lowest.

__Version Number__

2-bit field, must be set to `01`.
Any other value cannot be decoded by this version of the specification.
Other version numbers will use different flag layouts.

__Block Independence flag__

If this flag is set to “1”, blocks are independent.
If this flag is set to “0”, each block depends on previous ones
(up to LZ4 window size, which is 64 KB).
In such a case, it’s necessary to decode all blocks in sequence.

Block dependency improves compression ratio, especially for small blocks.
On the other hand, it makes random access or multi-threaded decoding impossible.

__Block checksum flag__

If this flag is set, each data block will be followed by a 4-byte checksum,
calculated by using the xxHash-32 algorithm on the raw (compressed) data block.
The intention is to detect data corruption (storage or transmission errors)
immediately, before decoding.
Block checksum usage is optional.

__Content Size flag__

If this flag is set, the uncompressed size of data included within the frame
will be present as an 8-byte unsigned little endian value, after the flags.
Content Size usage is optional.

__Content checksum flag__

If this flag is set, a 32-bit content checksum will be appended
after the EndMark.

__Dictionary ID flag__

If this flag is set, a 4-byte Dict-ID field will be present,
after the descriptor flags and the Content Size.
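
The FLG bit layout and the flags described above can be decoded with a few shifts and masks. The sketch below is only an illustration: `FlgFields` and `parse_flg` are hypothetical names, and rejecting a set reserved bit follows the rule that reserved bits must be zero.

```python
from typing import NamedTuple


class FlgFields(NamedTuple):
    """Decoded FLG byte (hypothetical container for this example)."""
    version: int
    block_independence: bool
    block_checksum: bool
    content_size: bool
    content_checksum: bool
    dict_id: bool


def parse_flg(flg: int) -> FlgFields:
    """Split the FLG byte into its fields, following the FLG table."""
    if not 0 <= flg <= 0xFF:
        raise ValueError("FLG must be a single byte")
    version = (flg >> 6) & 0b11          # bits 7-6
    if version != 0b01:
        raise ValueError(f"unsupported version field: {version}")
    if flg & 0x02:                        # bit 1 is reserved, must be 0
        raise ValueError("reserved bit 1 of FLG is set")
    return FlgFields(
        version=version,
        block_independence=bool(flg & 0x20),  # bit 5
        block_checksum=bool(flg & 0x10),      # bit 4
        content_size=bool(flg & 0x08),        # bit 3
        content_checksum=bool(flg & 0x04),    # bit 2
        dict_id=bool(flg & 0x01),             # bit 0
    )
```

For example, `parse_flg(0x64)` describes a version `01` frame with independent blocks and a content checksum, and no other optional field.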

__Block Maximum Size__

This information is useful to help the decoder allocate memory.
Size here refers to the original (uncompressed) data size.
Block Maximum Size is one of the values in the following table :

|  0  |  1  |  2  |  3  |   4   |   5    |  6   |  7   |
| --- | --- | --- | --- | ----- | ------ | ---- | ---- |
| N/A | N/A | N/A | N/A | 64 KB | 256 KB | 1 MB | 4 MB |

The decoder may refuse to allocate block sizes above any system-specific size.
Unused values may be used in a future revision of the spec.
A decoder conformant with the current version of the spec
is only able to decode block sizes defined in this spec.

__Reserved bits__

Value of reserved bits **must** be 0 (zero).
Reserved bits might be used in a future version of the specification,
typically enabling new optional features.
When this happens, a decoder respecting the current specification version
shall not be able to decode such a frame.

__Content Size__

This is the original (uncompressed) size.
This information is optional, and only present if the associated flag is set.
Content size is provided using unsigned 8 Bytes, for a maximum of 16 Exabytes.
Format is Little endian.
This value is informational, typically for display or memory allocation.
It can be skipped by a decoder, or used to validate content correctness.

__Dictionary ID__

Dict-ID is only present if the associated flag is set.
It's an unsigned 32-bit value, stored using little-endian convention.
A dictionary is useful to compress short input sequences.
The compressor can take advantage of the dictionary context
to encode the input in a more compact manner.
It works as a kind of “known prefix” which is used by
both the compressor and the decompressor to “warm-up” reference tables.

The decompressor can use Dict-ID identifier to determine
which dictionary must be used to correctly decode data.
The compressor and the decompressor must use exactly the same dictionary.
It's presumed that the 32-bit dictID uniquely identifies a dictionary.

Within a single frame, a single dictionary can be defined.
When the frame descriptor defines independent blocks,
each block will be initialized with the same dictionary.
If the frame descriptor defines linked blocks,
the dictionary will only be used once, at the beginning of the frame.

__Header Checksum__

One-byte checksum of combined descriptor fields, including optional ones.
The value is the second byte of `xxh32()` : ` (xxh32()>>8) & 0xFF `
using zero as a seed, and the full Frame Descriptor as an input
(including optional fields when they are present).
A wrong checksum indicates that the descriptor is erroneous.


Data Blocks
-----------

| Block Size |  data  | (Block Checksum) |
|:----------:| ------ |:----------------:|
|  4 bytes   |        |   0 - 4 bytes    |


__Block Size__

This field uses 4 bytes, format is little-endian.

If the highest bit is set (`1`), the block is uncompressed.

If the highest bit is not set (`0`), the block is LZ4-compressed,
using the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).

All other bits give the size, in bytes, of the data section.
The size does not include the block checksum if present.

_Block_Size_ shall never be larger than _Block_Maximum_Size_.
A compressed size exceeding this limit could arise for non-compressible sources.
In such a case, the data block must be stored using the uncompressed format.

A value of `0x00000000` is not a valid block size; it signifies an _EndMark_ instead.
Note that this is different from a value of `0x80000000` (highest bit set),
which is an uncompressed block of size 0 (empty),
which is valid, and therefore doesn't end a frame.
Note that, if _Block_checksum_ is enabled,
even an empty block must be followed by a 32-bit block checksum.

__Data__

Where the actual data to decode stands.
It might be compressed or not, depending on previous field indications.

When compressed, the data must respect the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).

Note that a block is not necessarily full.
Uncompressed size of data can be any size __up to__ _Block_Maximum_Size_,
so it may contain less data than the maximum block size.

__Block checksum__

Only present if the associated flag is set.
This is a 4-byte checksum value, in little endian format,
calculated by using the [xxHash-32 algorithm] on the __raw__ (undecoded) data block,
and a seed of zero.
The intention is to detect data corruption (storage or transmission errors)
before decoding.

_Block_checksum_ can be cumulative with _Content_checksum_.
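
The interpretation of the 4-byte _Block_Size_ field (highest bit as the uncompressed flag, remaining bits as the data size, `0x00000000` as _EndMark_) can be sketched as follows. The function name `parse_block_size` and its return convention are assumptions made for this example.

```python
import struct
from typing import Optional, Tuple


def parse_block_size(field: bytes) -> Optional[Tuple[bool, int]]:
    """Decode a 4-byte Block Size field.

    Returns None when the field is an EndMark (0x00000000),
    otherwise a tuple (is_uncompressed, data_size_in_bytes).
    Hypothetical helper; it does not check Block Maximum Size.
    """
    if len(field) != 4:
        raise ValueError("Block Size field must be exactly 4 bytes")
    (value,) = struct.unpack("<I", field)  # little-endian unsigned 32-bit
    if value == 0:
        return None  # EndMark: no more blocks in this frame
    is_uncompressed = bool(value & 0x80000000)  # highest bit
    data_size = value & 0x7FFFFFFF              # all other bits
    return is_uncompressed, data_size
```

With this convention, `0x80000000` decodes to `(True, 0)`, an empty uncompressed block, which is distinct from the `None` returned for an _EndMark_.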

[xxHash-32 algorithm]: https://github.com/Cyan4973/xxHash/blob/release/doc/xxhash_spec.md


Skippable Frames
----------------

| Magic Number | Frame Size | User Data |
|:------------:|:----------:| --------- |
|   4 bytes    |  4 bytes   |           |

Skippable frames allow the integration of user-defined data
into a flow of concatenated frames.
Their design is straightforward,
with the sole objective to allow the decoder to quickly skip
over user-defined data and continue decoding.

For the purpose of facilitating identification,
it is discouraged to start a flow of concatenated frames with a skippable frame.
If there is a need to start such a flow with some user data
encapsulated into a skippable frame,
it’s recommended to start with a zero-byte LZ4 frame
followed by a skippable frame.
This will make it easier for file type identifiers.


__Magic Number__

4 Bytes, Little endian format.
Value : 0x184D2A5X, which means any value from 0x184D2A50 to 0x184D2A5F.
All 16 values are valid to identify a skippable frame.
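
Since the 16 skippable Magic Numbers differ only in their lowest 4 bits, the range check can be written as a single mask comparison. This is a minimal sketch; the names `is_skippable_magic`, `SKIPPABLE_MAGIC_BASE` and `SKIPPABLE_MAGIC_MASK` are hypothetical.

```python
import struct

SKIPPABLE_MAGIC_BASE = 0x184D2A50  # lowest of the 16 valid values
SKIPPABLE_MAGIC_MASK = 0xFFFFFFF0  # ignore the lowest 4 bits (the "X")


def is_skippable_magic(first4: bytes) -> bool:
    """Return True if the buffer starts with a skippable-frame Magic Number."""
    if len(first4) < 4:
        return False
    (magic,) = struct.unpack_from("<I", first4, 0)  # little-endian
    return (magic & SKIPPABLE_MAGIC_MASK) == SKIPPABLE_MAGIC_BASE
```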

__Frame Size__

This is the size, in bytes, of the following User Data
(not including the magic number or the size field itself).
4 Bytes, Little endian format, unsigned 32-bit.
This means User Data can’t be bigger than (2^32-1) Bytes.

__User Data__

User Data can be anything. Data will just be skipped by the decoder.


Legacy frame
------------

The Legacy frame format was defined in the initial versions of “LZ4Demo”.
Newer compressors should not use this format anymore, as it is too restrictive.

Main characteristics of the legacy format :

- Fixed block size : 8 MB.
- All blocks must be completely filled, except the last one.
- All blocks are always compressed, even when compression is detrimental.
- The last block is detected either because
  it is followed by the “EOF” (End of File) mark,
  or because it is followed by a known Frame Magic Number.
- No checksum
- Convention is Little endian

| MagicNb | B.CSize | CData | B.CSize | CData |  (...)  | EndMark |
| ------- | ------- | ----- | ------- | ----- | ------- | ------- |
| 4 bytes | 4 bytes | CSize | 4 bytes | CSize | x times |   EOF   |


__Magic Number__

4 Bytes, Little endian format.
Value : 0x184C2102

__Block Compressed Size__

This is the size, in bytes, of the following compressed data block.
4 Bytes, Little endian format.

__Data__

Where the actual compressed data stands.
Data is always compressed, even when compression is detrimental.

__EndMark__

End of legacy frame is implicit only.
It must be followed by a standard EOF (End Of File) signal,
whether it is a file or a stream.

Alternatively, if the frame is followed by a valid Frame Magic Number,
it is considered completed.
This policy makes it possible to concatenate legacy frames.

Any other value will be interpreted as a block size,
and trigger an error if it does not fit within acceptable range.
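
The implicit end-of-frame rule above can be sketched as a small classifier for the 4 bytes read where a legacy block size is expected. Everything named here is hypothetical (`classify_legacy_word`, `is_known_magic`, the string results), and `max_block_csize` is a caller-supplied assumption: the specification only speaks of an "acceptable range" without giving a number.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204  # current frame format
LEGACY_MAGIC = 0x184C2102     # legacy frame format


def is_known_magic(value: int) -> bool:
    """True for any Magic Number defined in this document."""
    return (
        value in (LZ4_FRAME_MAGIC, LEGACY_MAGIC)
        or (value & 0xFFFFFFF0) == 0x184D2A50  # skippable frames
    )


def classify_legacy_word(word: bytes, max_block_csize: int) -> str:
    """Classify the bytes found after a legacy block.

    Returns "eof" (nothing left), "next_frame" (a known Magic Number
    follows, so the legacy frame is complete) or "block" (the value is
    a compressed block size). Raises ValueError for anything else.
    """
    if len(word) == 0:
        return "eof"
    if len(word) != 4:
        raise ValueError("truncated legacy frame")
    (value,) = struct.unpack("<I", word)  # little-endian
    if is_known_magic(value):
        return "next_frame"
    if value > max_block_csize:  # assumed limit, chosen by the caller
        raise ValueError(f"block size {value} out of acceptable range")
    return "block"
```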


Version changes
---------------

1.6.2 : clarifies specification of _EndMark_

1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"

1.6.0 : restored Dictionary ID field in Frame header

1.5.1 : changed document format to MarkDown

1.5 : removed Dictionary ID from specification

1.4.1 : changed wording from “stream” to “frame”

1.4 : added skippable streams, re-added stream checksum

1.3 : modified header checksum

1.2 : reduced choice of “block size”, to postpone decision on “dynamic size of BlockSize Field”.

1.1 : optional fields are now part of the descriptor

1.0 : changed “block size” specification, adding a compressed/uncompressed flag

0.9 : reduced scale of “block maximum size” table

0.8 : removed : high compression flag

0.7 : removed : stream checksum

0.6 : settled : stream size uses 8 bytes, endian convention is little endian

0.5 : added copyright notice

0.4 : changed format to Google Doc compatible OpenDocument