mirror of https://github.com/google/brotli.git, synced 2024-12-29 03:01:16 +00:00

Merge pull request #230 from szabadka/master

Generate new .txt version of the spec.

commit 83b8de7cb5

@@ -93,7 +93,7 @@ Table of Contents
 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34
 13. Informative References . . . . . . . . . . . . . . . . . . . 35
 14. Source code . . . . . . . . . . . . . . . . . . . . . . . . . 35
-15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35
+15. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 35
 Appendix A. Static dictionary data . . . . . . . . . . . . . . . 35
 Appendix B. List of word transformations . . . . . . . . . . . . 116
 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 119
@@ -549,11 +549,11 @@ Internet-Draft Brotli October 2015
 significant bit. The code lengths are initially in tree[I].Len; the
 codes are produced in tree[I].Code.

-1) Count the number of codes for each code length. Let
-bl_count[N] be the number of codes of length N, N >= 1.
+1) Count the number of codes for each code length. Let
+bl_count[N] be the number of codes of length N, N >= 1.

-2) Find the numerical value of the smallest code for each
-code length:
+2) Find the numerical value of the smallest code for each
+code length:

@@ -564,26 +564,26 @@ Alakuijala & Szabadka Expires April 6, 2016 [Page 10]
 Internet-Draft Brotli October 2015


-code = 0;
-bl_count[0] = 0;
-for (bits = 1; bits <= MAX_BITS; bits++) {
-code = (code + bl_count[bits-1]) << 1;
-next_code[bits] = code;
-}
+code = 0;
+bl_count[0] = 0;
+for (bits = 1; bits <= MAX_BITS; bits++) {
+code = (code + bl_count[bits-1]) << 1;
+next_code[bits] = code;
+}

-3) Assign numerical values to all codes, using consecutive
-values for all codes of the same length with the base
-values determined at step 2. Codes that are never used
-(which have a bit length of zero) must not be assigned a
-value.
+3) Assign numerical values to all codes, using consecutive
+values for all codes of the same length with the base
+values determined at step 2. Codes that are never used
+(which have a bit length of zero) must not be assigned a
+value.

-for (n = 0; n <= max_code; n++) {
-len = tree[n].Len;
-if (len != 0) {
-tree[n].Code = next_code[len];
-next_code[len]++;
-}
-}
+for (n = 0; n <= max_code; n++) {
+len = tree[n].Len;
+if (len != 0) {
+tree[n].Code = next_code[len];
+next_code[len]++;
+}
+}

 Example:

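As a rough illustration of steps 1) to 3) in the hunks above, the following self-contained C sketch assigns code values from code lengths. The Symbol struct, the AssignCodes name, and MAX_BITS = 15 are assumptions made for this example; only the two loops are taken from the spec text, and step 1 is filled in from its prose description.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BITS 15  /* assumption: longest code length this sketch accepts */

/* Hypothetical record standing in for tree[I].Len and tree[I].Code. */
typedef struct {
  unsigned len;   /* code length in bits; 0 means the symbol is never used */
  unsigned code;  /* assigned code value; meaningful only when len != 0 */
} Symbol;

/* Assigns code values to tree[0..num_symbols-1] from their code lengths. */
static void AssignCodes(Symbol* tree, size_t num_symbols) {
  unsigned bl_count[MAX_BITS + 1] = {0};
  unsigned next_code[MAX_BITS + 1] = {0};
  unsigned code = 0;
  unsigned bits;
  size_t n;

  /* Step 1: count the number of codes of each length. */
  for (n = 0; n < num_symbols; n++) {
    assert(tree[n].len <= MAX_BITS);
    bl_count[tree[n].len]++;
  }
  bl_count[0] = 0;  /* unused symbols do not take part in the numbering */

  /* Step 2: find the smallest code value for each code length. */
  for (bits = 1; bits <= MAX_BITS; bits++) {
    code = (code + bl_count[bits - 1]) << 1;
    next_code[bits] = code;
  }

  /* Step 3: hand out consecutive values within each length. */
  for (n = 0; n < num_symbols; n++) {
    unsigned len = tree[n].len;
    if (len != 0) {
      tree[n].code = next_code[len];
      next_code[len]++;
    }
  }
}
```

For the code lengths (3, 3, 3, 3, 3, 2, 4, 4), this sketch assigns the code values 2, 3, 4, 5, 6, 0, 14, 15.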
@@ -1161,10 +1161,10 @@ Internet-Draft Brotli October 2015

 There are four methods, called context modes, to compute the Context
 ID:
-* MSB6, where the Context ID is the value of six most
-significant bits of p1,
 * LSB6, where the Context ID is the value of six least
 significant bits of p1,
+* MSB6, where the Context ID is the value of six most
+significant bits of p1,
 * UTF8, where the Context ID is a complex function of p1, p2,
 optimized for text compression, and
 * Signed, where Context ID is a complex function of p1, p2,
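For the two simple modes in the list above, the Context ID reduces to a mask or a shift of p1. The sketch below assumes p1 is a single byte (elsewhere in the spec it denotes the last uncompressed byte). The enum and function names are hypothetical, and the UTF8 and Signed modes are left out because their definitions are not part of this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two simple context modes. */
typedef enum { CONTEXT_LSB6, CONTEXT_MSB6 } SimpleContextMode;

/* Returns the Context ID (0..63) for the LSB6 or MSB6 mode. */
static uint8_t SimpleContextId(SimpleContextMode mode, uint8_t p1) {
  assert(mode == CONTEXT_LSB6 || mode == CONTEXT_MSB6);
  if (mode == CONTEXT_LSB6) {
    return (uint8_t)(p1 & 0x3f);  /* six least significant bits of p1 */
  }
  return (uint8_t)(p1 >> 2);      /* six most significant bits of p1 */
}
```

Both modes yield a value in the range 0 to 63, so either one selects among 64 contexts.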
@@ -1353,20 +1353,20 @@ Internet-Draft Brotli October 2015
 language function:

 void InverseMoveToFrontTransform(uint8_t* v, int v_len) {
-uint8_t mtf[256];
-int i;
-for (i = 0; i < 256; ++i) {
-mtf[i] = (uint8_t)i;
-}
-for (i = 0; i < v_len; ++i) {
-uint8_t index = v[i];
-uint8_t value = mtf[index];
-v[i] = value;
-for (; index; --index) {
-mtf[index] = mtf[index - 1];
-}
-mtf[0] = value;
-}
+uint8_t mtf[256];
+int i;
+for (i = 0; i < 256; ++i) {
+mtf[i] = (uint8_t)i;
+}
+for (i = 0; i < v_len; ++i) {
+uint8_t index = v[i];
+uint8_t value = mtf[index];
+v[i] = value;
+for (; index; --index) {
+mtf[index] = mtf[index - 1];
+}
+mtf[0] = value;
+}
 }

 8. Static dictionary
@@ -1434,16 +1434,13 @@ Internet-Draft Brotli October 2015
 where the _i subscript denotes the transform_id above. Each T_i is
 one of the following 21 elementary transforms:

-Identity, OmitLast1, ..., OmitLast9, UppercaseFirst, UppercaseAll,
-OmitFirst1, ..., OmitFirst9
+Identity, UppercaseFirst, UppercaseAll,
+OmitFirst1, ..., OmitFirst9, OmitLast1, ..., OmitLast9

 The form of these elementary transforms are as follows:

 Identity(word) = word

-OmitLastk(word) = the first (length(word) - k) bytes of word, or
-empty string if length(word) < k
-
 UppercaseFirst(word) = first UTF-8 character of word upper-cased

 UppercaseAll(word) = all UTF-8 characters of word upper-cased
@@ -1451,6 +1448,9 @@ Internet-Draft Brotli October 2015
 OmitFirstk(word) = the last (length(word) - k) bytes of word, or
 empty string if length(word) < k

+OmitLastk(word) = the first (length(word) - k) bytes of word, or
+empty string if length(word) < k
+
 For the purposes of UppercaseAll, word is parsed into UTF-8

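The OmitFirstk and OmitLastk definitions above can be sketched in C as follows. This is a minimal illustration, not the reference implementation: the function names, the explicit length parameter, and the caller-provided output buffer are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes kept when k bytes are omitted from a word of length len;
   0 when the word is shorter than k, matching the "empty string" rule. */
static size_t KeptLength(size_t len, size_t k) {
  assert(k >= 1 && k <= 9);  /* the spec lists k = 1..9 */
  return (len < k) ? 0 : len - k;
}

/* OmitFirstk: copies the last (len - k) bytes of word to out.
   out must have room for len bytes. Returns the result length. */
static size_t OmitFirst(const unsigned char* word, size_t len, size_t k,
                        unsigned char* out) {
  size_t n = KeptLength(len, k);
  memcpy(out, word + (len - n), n);
  return n;
}

/* OmitLastk: copies the first (len - k) bytes of word to out.
   out must have room for len bytes. Returns the result length. */
static size_t OmitLast(const unsigned char* word, size_t len, size_t k,
                       unsigned char* out) {
  size_t n = KeptLength(len, k);
  memcpy(out, word, n);
  return n;
}
```

With the word "brotli" and k = 2, OmitFirst yields "otli" and OmitLast yields "brot"; a two-byte word with k = 3 yields the empty string in both cases.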
@@ -1544,25 +1544,25 @@ Internet-Draft Brotli October 2015
 length of the meta-block, and a bit signaling if the meta-block is
 the last one. The format of the meta-block header is the following:

-1 bit: ISLAST, set to 1 if this is the last meta-block
-1 bit: ISLASTEMPTY, if set to 1, the meta-block is empty;
-this field is only present if ISLAST bit is set -- if
-it is 1, then the meta-block and the brotli stream ends
-at that bit, with any remaining bits in the last byte
-of the compressed stream filled with zeros (if the
-fill bits are not zero, then the stream should be
-rejected as invalid)
-2 bits: MNIBBLES, # of nibbles to represent the uncompressed
-length, encoded as follows: if set to 3, MNIBBLES is 0,
-otherwise MNIBBLES is the value of this field plus 4.
-If MNIBBLES is 0, the meta-block is empty, i.e. it does
-not generate any uncompressed data. In this case, the
-rest of the meta-block has the following format:
+1 bit: ISLAST, set to 1 if this is the last meta-block
+1 bit: ISLASTEMPTY, if set to 1, the meta-block is empty;
+this field is only present if ISLAST bit is set -- if
+it is 1, then the meta-block and the brotli stream ends
+at that bit, with any remaining bits in the last byte
+of the compressed stream filled with zeros (if the
+fill bits are not zero, then the stream should be
+rejected as invalid)
+2 bits: MNIBBLES, # of nibbles to represent the uncompressed
+length, encoded as follows: if set to 3, MNIBBLES is 0,
+otherwise MNIBBLES is the value of this field plus 4.
+If MNIBBLES is 0, the meta-block is empty, i.e. it does
+not generate any uncompressed data. In this case, the
+rest of the meta-block has the following format:

-1 bit: reserved, must be zero
+1 bit: reserved, must be zero

-2 bits: MSKIPBYTES, # of bytes to represent metadata
-length
+2 bits: MSKIPBYTES, # of bytes to represent metadata
+length

@@ -1572,37 +1572,37 @@ Alakuijala & Szabadka Expires April 6, 2016 [Page 28]
 Internet-Draft Brotli October 2015


-MSKIPBYTES x 8 bits: MSKIPLEN - 1, where MSKIPLEN is
-the number of metadata bytes; this field is
-only present if MSKIPBYTES is positive,
-otherwise MSKIPLEN is 0 (if MSKIPBYTES is
-greater than 1, and the last byte is all
-zeros, then the stream should be rejected
-as invalid)
+MSKIPBYTES x 8 bits: MSKIPLEN - 1, where MSKIPLEN is
+the number of metadata bytes; this field is
+only present if MSKIPBYTES is positive,
+otherwise MSKIPLEN is 0 (if MSKIPBYTES is
+greater than 1, and the last byte is all
+zeros, then the stream should be rejected
+as invalid)

-0 - 7 bits: fill bits until the next byte boundary,
-must be all zeros
+0 - 7 bits: fill bits until the next byte boundary,
+must be all zeros

-MSKIPLEN bytes of metadata, not part of the
-uncompressed data or the sliding window
+MSKIPLEN bytes of metadata, not part of the
+uncompressed data or the sliding window

-MNIBBLES x 4 bits: MLEN - 1, where MLEN is the length
-of the meta-block uncompressed data in bytes (if the
-number of nibbles is greater than 4, and the last
-nibble is all zeros, then the stream should be
-rejected as invalid)
+MNIBBLES x 4 bits: MLEN - 1, where MLEN is the length
+of the meta-block uncompressed data in bytes (if the
+number of nibbles is greater than 4, and the last
+nibble is all zeros, then the stream should be
+rejected as invalid)

-1 bit: ISUNCOMPRESSED, if set to 1, any bits of compressed
-data up to the next byte boundary are ignored, and
-the rest of the meta-block contains MLEN bytes of
-literal data; this field is only present if the
-ISLAST bit is not set (if the ignored bits are not
-all zeros, the stream should be rejected as invalid)
+1 bit: ISUNCOMPRESSED, if set to 1, any bits of compressed
+data up to the next byte boundary are ignored, and
+the rest of the meta-block contains MLEN bytes of
+literal data; this field is only present if the
+ISLAST bit is not set (if the ignored bits are not
+all zeros, the stream should be rejected as invalid)

 1-11 bits: NBLTYPESL, # of literal block types, encoded with
-the following variable length code (as it appears in
-the compressed data, where the bits are parsed from
-right to left, so 0110111 has the value 12):
+the following variable length code (as it appears in
+the compressed data, where the bits are parsed from
+right to left, so 0110111 has the value 12):

 Value Bit Pattern
 ----- -----------
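The first meta-block header fields described in the two hunks above (ISLAST, ISLASTEMPTY, MNIBBLES, MLEN) can be read roughly as follows. This is a sketch under stated assumptions: bits are taken from the least significant bit of each byte first, multi-bit fields are stored least significant bit first (so the "last nibble" is the most significant one), and the BitReader and MetaBlockStart types and all function names are invented for this example. The metadata branch (MNIBBLES equal to 0) and everything after MLEN are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit reader over a byte buffer. */
typedef struct {
  const uint8_t* data;
  size_t size;     /* buffer length in bytes */
  size_t bit_pos;  /* index of the next unread bit */
} BitReader;

/* Reads n bits (n <= 32), least significant bit first. */
static uint32_t ReadBits(BitReader* br, unsigned n) {
  uint32_t value = 0;
  unsigned i;
  assert(n <= 32 && br->bit_pos + n <= br->size * 8);
  for (i = 0; i < n; i++) {
    uint32_t bit = (br->data[br->bit_pos >> 3] >> (br->bit_pos & 7)) & 1u;
    value |= bit << i;
    br->bit_pos++;
  }
  return value;
}

/* Hypothetical container for the parsed fields. */
typedef struct {
  int is_last;
  int is_last_empty;
  unsigned mnibbles;
  uint32_t mlen;  /* 0 when the meta-block carries no uncompressed data */
} MetaBlockStart;

/* Returns 1 on success, 0 if the stream should be rejected as invalid. */
static int ReadMetaBlockStart(BitReader* br, MetaBlockStart* h) {
  uint32_t field, v;
  h->mnibbles = 0;
  h->mlen = 0;
  h->is_last = (int)ReadBits(br, 1);
  /* ISLASTEMPTY is only present when ISLAST is set. */
  h->is_last_empty = h->is_last ? (int)ReadBits(br, 1) : 0;
  if (h->is_last_empty) return 1;  /* the stream ends here */

  field = ReadBits(br, 2);
  h->mnibbles = (field == 3) ? 0 : (unsigned)field + 4;
  if (h->mnibbles == 0) return 1;  /* metadata fields follow; not parsed */

  v = ReadBits(br, 4 * h->mnibbles);  /* MLEN - 1 */
  /* More than 4 nibbles with an all-zero last nibble is invalid. */
  if (h->mnibbles > 4 && (v >> (4 * (h->mnibbles - 1))) == 0) return 0;
  h->mlen = v + 1;
  return 1;
}
```

Under these assumptions, the bytes 0x48 0x00 0x00 decode to ISLAST = 0, MNIBBLES = 4, and MLEN = 10.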
@@ -1657,13 +1657,13 @@ Internet-Draft Brotli October 2015
 Block count code + Extra bits for first distance block
 count, only if NBLTYPESD >= 2

-2 bits: NPOSTFIX, parameter used in the distance coding
+2 bits: NPOSTFIX, parameter used in the distance coding

-4 bits: four most significant bits of NDIRECT, to get the
-actual value of the parameter NDIRECT, left-shift
-this four bit number by NPOSTFIX bits
+4 bits: four most significant bits of NDIRECT, to get the
+actual value of the parameter NDIRECT, left-shift
+this four bit number by NPOSTFIX bits

-NBLTYPESL x 2 bits: context mode for each literal block type
+NBLTYPESL x 2 bits: context mode for each literal block type

 1-11 bits: NTREESL, # of literal prefix trees, encoded with
 the same variable length code as NBLTYPESL
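The NDIRECT rule in the hunk above is a single shift. The helper below is a hypothetical illustration (the function and parameter names are not from the spec) that takes the 2-bit NPOSTFIX value and the 4-bit field as already-decoded integers.

```c
#include <assert.h>

/* Recovers NDIRECT from its four most significant bits and NPOSTFIX. */
static unsigned NDirectFromHeader(unsigned npostfix, unsigned ndirect_msb) {
  assert(npostfix <= 3);      /* NPOSTFIX is a 2-bit field */
  assert(ndirect_msb <= 15);  /* the stored field is 4 bits wide */
  return ndirect_msb << npostfix;
}
```

For example, a stored value of 5 with NPOSTFIX = 2 gives NDIRECT = 20, and the largest value reachable this way is 15 << 3 = 120.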
@@ -1687,11 +1687,11 @@ Internet-Draft Brotli October 2015
 appears only if NTREESD >= 2, otherwise the context map
 has only zero values

-NTREESL prefix codes for literals
+NTREESL prefix codes for literals

-NBLTYPESI prefix codes for insert-and-copy lengths
+NBLTYPESI prefix codes for insert-and-copy lengths

-NTREESD prefix codes for distances
+NTREESD prefix codes for distances

 9.3. Format of the meta-block data

@@ -1727,8 +1727,8 @@ Internet-Draft Brotli October 2015
 described in Paragraph 7.3.

 Block type code for next distance block type, appears only
-if NBLTYPESD >= 2 and the previous distance block count
-is zero
+if NBLTYPESD >= 2 and the previous distance block count
+is zero

 Block count code + Extra bits for next distance block
 length, appears only if NBLTYPESD >= 2 and the previous
@@ -1831,7 +1831,7 @@ Internet-Draft Brotli October 2015
 save previous block type
 read block count using HTREE_BLEN_I and set BLEN_I
 decrement BLEN_I
-read insert and copy length, ILEN, CLEN with HTREEI[BTYPE_I]
+read insert and copy length, ILEN, CLEN using HTREEI[BTYPE_I]
 loop for ILEN
 if BLEN_L is zero
 read block type using HTREE_BTYPE_L and set BTYPE_L
@@ -1862,9 +1862,9 @@ Internet-Draft Brotli October 2015
 read block count using HTREE_BLEN_D and set BLEN_D
 decrement BLEN_D
 compute context ID, CIDD from CLEN
-read distance code with HTREED[CMAPD[4 * BTYPE_D + CIDD]]
+read distance code using HTREED[CMAPD[4 * BTYPE_D + CIDD]]
 compute distance by distance short code substitution
-move backwards distance bytes in the uncompressed data and
+move backwards distance bytes in the uncompressed data and
 copy CLEN bytes from this position to the uncompressed
 stream, or look up the static dictionary word, transform
 the word as directed, and copy the result to the
@@ -1942,7 +1942,7 @@ Internet-Draft Brotli October 2015
 available in the brotli open-source project:
 https://github.com/google/brotli

-15. Acknowledgements
+15. Acknowledgments

 The authors would like to thank Mark Adler for providing helpful
 review comments, validating the specification by writing an