# LZ4 Streaming API Example : Line by Line Text Compression
by *Takayuki Matsuoka*

`blockStreaming_lineByLine.c` is an LZ4 Streaming API example that implements line-by-line incremental (de)compression.

Please note the following restrictions :

 - First, read "LZ4 Streaming API Basics".
 - This is a relatively advanced application example.
 - The output file is not compatible with lz4frame and is platform dependent.


## What's the point of this example ?

 - Line-by-line incremental (de)compression.
 - Handling a huge file with a small amount of memory.
 - Generally better compression ratio than the Block API.
 - Non-uniform block size.


## How the compression works

First, allocate a "Ring Buffer" for input and an LZ4 compressed data buffer for output.

```
(1)
    Ring Buffer

    +--------+
    | Line#1 |
    +---+----+
        |
        v
     {Out#1}


(2)
    Prefix Mode Dependency
          +----+
          |    |
          v    |
    +--------+-+------+
    | Line#1 | Line#2 |
    +--------+---+----+
                 |
                 v
              {Out#2}


(3)
          Prefix   Prefix
          +----+   +----+
          |    |   |    |
          v    |   v    |
    +--------+-+------+-+------+
    | Line#1 | Line#2 | Line#3 |
    +--------+--------+---+----+
                          |
                          v
                       {Out#3}


(4)
                External Dictionary Mode
                +----+   +----+
                |    |   |    |
                v    |   v    |
    ------+--------+-+------+-+--------+
          |  ....  | Line#X | Line#X+1 |
    ------+--------+--------+-----+----+
                            ^     |
                            |     v
                            | {Out#X+1}
                            |
                          Reset


(5)
                                    Prefix
                                    +-----+
                                    |     |
                                    v     |
    ------+--------+--------+----------+--+-------+
          |  ....  | Line#X | Line#X+1 | Line#X+2 |
    ------+--------+--------+----------+-----+----+
                            ^                |
                            |                v
                            |            {Out#X+2}
                            |
                          Reset
```

Next (see (1)), read the first line into the ring buffer and compress it with `LZ4_compress_continue()`.
On this first call, LZ4 has no previous data to depend on,
so it compresses the line without dependencies and writes the compressed line {Out#1} to the LZ4 compressed data buffer.
After that, write {Out#1} to the file and advance the ring buffer offset.

Do the same for the second line (see (2)).
This time, LZ4 can use a dependency on Line#1 to improve the compression ratio.
This dependency is called "Prefix mode".

Eventually, we reach the end of the ring buffer at Line#X (see (4)).
At this point, we reset the ring buffer offset.
After the reset, the pointer for Line#X+1 is no longer adjacent to the previous data, but LZ4 still keeps its memory of that data.
This is called "External Dictionary Mode".

At Line#X+2 (see (5)), LZ4 finally forgets almost all of its memory, but Line#X+1 still remains.
This is the same situation as Line#2.

Repeat these steps until the end of the text file.


## How the decompression works

Decompression mirrors the compression procedure.

 - Read a compressed line from the file into the buffer.
 - Decompress it into the ring buffer.
 - Write the decompressed plain text line to the output file.
 - Advance the ring buffer offset. If the offset exceeds the end of the ring buffer, reset it.

Repeat these steps until the end of the compressed file.
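
The offset handling shared by compression and decompression can be sketched in plain C, with no LZ4 calls. This is a minimal illustration, not the code from `blockStreaming_lineByLine.c`: the names `ring_advance` and `ring_simulate` are hypothetical, and so is the reset rule used here, which wraps to 0 as soon as fewer than `max_line` bytes remain so that the next line always fits contiguously.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance the ring buffer offset after a line of
 * `line_len` bytes has been stored at `offset`.
 * Assumption: we reset to 0 once fewer than `max_line` bytes would remain,
 * so the next line never straddles the end of the ring. */
static size_t ring_advance(size_t offset, size_t line_len,
                           size_t ring_size, size_t max_line)
{
    assert(ring_size > max_line);   /* the ring must hold more than one line */
    assert(line_len <= max_line);   /* a line must not exceed the reserve */

    offset += line_len;
    if (offset >= ring_size - max_line) {
        offset = 0;                 /* the "Reset" arrow in diagrams (4), (5) */
    }
    return offset;
}

/* Walk `count` line lengths through the ring and record where each line
 * starts. Returns how many resets happened along the way. */
static int ring_simulate(const size_t *line_lens, size_t count,
                         size_t ring_size, size_t max_line,
                         size_t *starts_out)
{
    size_t offset = 0;
    int resets = 0;

    for (size_t i = 0; i < count; i++) {
        starts_out[i] = offset;
        /* In a real program, the line stored at ring + offset would be
         * compressed here (or a block decompressed into ring + offset). */
        size_t next = ring_advance(offset, line_lens[i], ring_size, max_line);
        if (next != offset + line_lens[i]) {
            resets++;               /* offset was wrapped back to 0 */
        }
        offset = next;
    }
    return resets;
}
```

Because the compressor and the decompressor advance and reset the offset by the same rule, every line lands at the same position in both ring buffers. The line that follows a reset is the one handled in "External Dictionary Mode".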