Add document for "Line by Line Text Compression" example
This commit is contained in:
parent
438fee9169
commit
19665c93ea
121
examples/blockStreaming_lineByLine.md
Normal file
121
examples/blockStreaming_lineByLine.md
Normal file
@ -0,0 +1,121 @@
|
||||
# LZ4 Streaming API Example : Line by Line Text Compression
|
||||
|
||||
`blockStreaming_lineByLine.c` is LZ4 Straming API example which implements line by line incremental (de)compression.
|
||||
|
||||
Please note the following restrictions :
|
||||
|
||||
- Firstly, read "LZ4 Streaming API Basics".
|
||||
- This is relatively advanced application example.
|
||||
- Output file is not compatible with lz4frame and platform dependent.
|
||||
|
||||
|
||||
## What's the point of this example ?
|
||||
|
||||
- Line by line incremental (de)compression.
|
||||
- Handle huge file in small amount of memory
|
||||
- Generally better compression ratio than Block API
|
||||
- Non-uniform block size
|
||||
|
||||
|
||||
## How the compression works
|
||||
|
||||
First of all, allocate "Ring Buffer" for input and LZ4 compressed data buffer for output.
|
||||
|
||||
```
|
||||
(1)
|
||||
Ring Buffer
|
||||
|
||||
+--------+
|
||||
| Line#1 |
|
||||
+---+----+
|
||||
|
|
||||
v
|
||||
{Out#1}
|
||||
|
||||
|
||||
(2)
|
||||
Prefix Mode Dependency
|
||||
+----+
|
||||
| |
|
||||
v |
|
||||
+--------+-+------+
|
||||
| Line#1 | Line#2 |
|
||||
+--------+---+----+
|
||||
|
|
||||
v
|
||||
{Out#2}
|
||||
|
||||
|
||||
(3)
|
||||
Prefix Prefix
|
||||
+----+ +----+
|
||||
| | | |
|
||||
v | v |
|
||||
+--------+-+------+-+------+
|
||||
| Line#1 | Line#2 | Line#3 |
|
||||
+--------+--------+---+----+
|
||||
|
|
||||
v
|
||||
{Out#3}
|
||||
|
||||
|
||||
(4)
|
||||
External Dictionary Mode
|
||||
+----+ +----+
|
||||
| | | |
|
||||
v | v |
|
||||
------+--------+-+------+-+--------+
|
||||
| .... | Line#X | Line#X+1 |
|
||||
------+--------+--------+-----+----+
|
||||
^ |
|
||||
| v
|
||||
| {Out#X+1}
|
||||
|
|
||||
Reset
|
||||
|
||||
|
||||
(5)
|
||||
Prefix
|
||||
+-----+
|
||||
| |
|
||||
v |
|
||||
------+--------+--------+----------+--+-------+
|
||||
| .... | Line#X | Line#X+1 | Line#X+2 |
|
||||
------+--------+--------+----------+-----+----+
|
||||
^ |
|
||||
| v
|
||||
| {Out#X+2}
|
||||
|
|
||||
Reset
|
||||
```
|
||||
|
||||
Next (see (1)), read first line to ringbuffer and compress it by `LZ4_compress_continue()`.
|
||||
For the first time, LZ4 doesn't know any previous dependencies,
|
||||
so it just compress the line without dependencies and generates compressed line {Out#1} to LZ4 compressed data buffer.
|
||||
After that, write {Out#1} to the file and forward ringbuffer offset.
|
||||
|
||||
Do the same things to second line (see (2)).
|
||||
But in this time, LZ4 can use dependency to Line#1 to improve compression ratio.
|
||||
This dependency is called "Prefix mode".
|
||||
|
||||
Eventually, we'll reach end of ringbuffer at Line#X (see (4)).
|
||||
This time, we should reset ringbuffer offset.
|
||||
After resetting, at Line#X+1 pointer is not adjacent, but LZ4 still maintain its memory.
|
||||
This is called "External Dictionary Mode".
|
||||
|
||||
In Line#X+2 (see (5)), finally LZ4 forget almost all memories but still remains Line#X+1.
|
||||
This is the same situation as Line#2.
|
||||
|
||||
Continue these procedure to the end of text file.
|
||||
|
||||
|
||||
## How the decompression works
|
||||
|
||||
Decompression will do reverse order.
|
||||
|
||||
- Read compressed line from the file to buffer.
|
||||
- Decompress it to the ringbuffer.
|
||||
- Output decompressed plain text line to the file.
|
||||
- Forward ringbuffer offset. If offset exceedes end of the ringbuffer, reset it.
|
||||
|
||||
Continue these procedure to the end of the compressed file.
|
Loading…
Reference in New Issue
Block a user