From 676bc91cc409515aa6e4418b336fcedb29d34ff2 Mon Sep 17 00:00:00 2001
From: Zoltan Szabadka
Date: Tue, 20 Oct 2015 12:27:09 +0200
Subject: [PATCH] Generate new .txt version of the spec.

Based on the changes in the .nroff source in PR #229
---
 docs/draft-alakuijala-brotli-07.txt | 200 ++++++++++++++--------------
 1 file changed, 100 insertions(+), 100 deletions(-)

diff --git a/docs/draft-alakuijala-brotli-07.txt b/docs/draft-alakuijala-brotli-07.txt
index 7e520fd..340d7f1 100644
--- a/docs/draft-alakuijala-brotli-07.txt
+++ b/docs/draft-alakuijala-brotli-07.txt
@@ -93,7 +93,7 @@ Table of Contents
 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34
 13. Informative References . . . . . . . . . . . . . . . . . . . 35
 14. Source code . . . . . . . . . . . . . . . . . . . . . . . . . 35
- 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35
+ 15. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 35
 Appendix A. Static dictionary data . . . . . . . . . . . . . . . 35
 Appendix B. List of word transformations . . . . . . . . . . . . 116
 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 119
@@ -549,11 +549,11 @@ Internet-Draft Brotli October 2015
 significant bit. The code lengths are initially in tree[I].Len; the
 codes are produced in tree[I].Code.

- 1) Count the number of codes for each code length. Let
- bl_count[N] be the number of codes of length N, N >= 1.
+ 1) Count the number of codes for each code length. Let
+ bl_count[N] be the number of codes of length N, N >= 1.
- 2) Find the numerical value of the smallest code for each
- code length:
+ 2) Find the numerical value of the smallest code for each
+ code length:
@@ -564,26 +564,26 @@ Alakuijala & Szabadka Expires April 6, 2016 [Page 10]
 Internet-Draft Brotli October 2015

- code = 0;
- bl_count[0] = 0;
- for (bits = 1; bits <= MAX_BITS; bits++) {
-   code = (code + bl_count[bits-1]) << 1;
-   next_code[bits] = code;
- }
+ code = 0;
+ bl_count[0] = 0;
+ for (bits = 1; bits <= MAX_BITS; bits++) {
+   code = (code + bl_count[bits-1]) << 1;
+   next_code[bits] = code;
+ }

- 3) Assign numerical values to all codes, using consecutive
- values for all codes of the same length with the base
- values determined at step 2. Codes that are never used
- (which have a bit length of zero) must not be assigned a
- value.
+ 3) Assign numerical values to all codes, using consecutive
+ values for all codes of the same length with the base
+ values determined at step 2. Codes that are never used
+ (which have a bit length of zero) must not be assigned a
+ value.
- for (n = 0; n <= max_code; n++) {
-   len = tree[n].Len;
-   if (len != 0) {
-     tree[n].Code = next_code[len];
-     next_code[len]++;
-   }
- }
+ for (n = 0; n <= max_code; n++) {
+   len = tree[n].Len;
+   if (len != 0) {
+     tree[n].Code = next_code[len];
+     next_code[len]++;
+   }
+ }

 Example:

@@ -1161,10 +1161,10 @@ Internet-Draft Brotli October 2015
 There are four methods, called context modes, to compute the
 Context ID:

- * MSB6, where the Context ID is the value of six most
- significant bits of p1,
 * LSB6, where the Context ID is the value of six least
 significant bits of p1,
+ * MSB6, where the Context ID is the value of six most
+ significant bits of p1,
 * UTF8, where the Context ID is a complex function of p1, p2,
 optimized for text compression, and
 * Signed, where Context ID is a complex function of p1, p2,
@@ -1353,20 +1353,20 @@ Internet-Draft Brotli October 2015
 language function:

 void InverseMoveToFrontTransform(uint8_t* v, int v_len) {
-   uint8_t mtf[256];
-   int i;
-   for (i = 0; i < 256; ++i) {
-     mtf[i] = (uint8_t)i;
-   }
-   for (i = 0; i < v_len; ++i) {
-     uint8_t index = v[i];
-     uint8_t value = mtf[index];
-     v[i] = value;
-     for (; index; --index) {
-       mtf[index] = mtf[index - 1];
-     }
-     mtf[0] = value;
-   }
+   uint8_t mtf[256];
+   int i;
+   for (i = 0; i < 256; ++i) {
+     mtf[i] = (uint8_t)i;
+   }
+   for (i = 0; i < v_len; ++i) {
+     uint8_t index = v[i];
+     uint8_t value = mtf[index];
+     v[i] = value;
+     for (; index; --index) {
+       mtf[index] = mtf[index - 1];
+     }
+     mtf[0] = value;
+   }
 }

 8. Static dictionary
@@ -1434,16 +1434,13 @@ Internet-Draft Brotli October 2015
 where the _i subscript denotes the transform_id above.
 Each T_i is one of the following 21 elementary transforms:

- Identity, OmitLast1, ..., OmitLast9, UppercaseFirst, UppercaseAll,
- OmitFirst1, ..., OmitFirst9
+ Identity, UppercaseFirst, UppercaseAll,
+ OmitFirst1, ..., OmitFirst9, OmitLast1, ..., OmitLast9

 The form of these elementary transforms are as follows:

 Identity(word) = word

- OmitLastk(word) = the first (length(word) - k) bytes of word, or
- empty string if length(word) < k
-
 UppercaseFirst(word) = first UTF-8 character of word upper-cased

 UppercaseAll(word) = all UTF-8 characters of word upper-cased
@@ -1451,6 +1448,9 @@ Internet-Draft Brotli October 2015
 OmitFirstk(word) = the last (length(word) - k) bytes of word, or
 empty string if length(word) < k

+ OmitLastk(word) = the first (length(word) - k) bytes of word, or
+ empty string if length(word) < k
+
 For the purposes of UppercaseAll, word is parsed into UTF-8
@@ -1544,25 +1544,25 @@ Internet-Draft Brotli October 2015
 length of the meta-block, and a bit signaling if the meta-block is the
 last one. The format of the meta-block header is the following:

- 1 bit: ISLAST, set to 1 if this is the last meta-block
- 1 bit: ISLASTEMPTY, if set to 1, the meta-block is empty;
- this field is only present if ISLAST bit is set -- if
- it is 1, then the meta-block and the brotli stream ends
- at that bit, with any remaining bits in the last byte
- of the compressed stream filled with zeros (if the
- fill bits are not zero, then the stream should be
- rejected as invalid)
- 2 bits: MNIBBLES, # of nibbles to represent the uncompressed
- length, encoded as follows: if set to 3, MNIBBLES is 0,
- otherwise MNIBBLES is the value of this field plus 4.
- If MNIBBLES is 0, the meta-block is empty, i.e. it does
- not generate any uncompressed data.
In this case, the
- rest of the meta-block has the following format:
+ 1 bit: ISLAST, set to 1 if this is the last meta-block
+ 1 bit: ISLASTEMPTY, if set to 1, the meta-block is empty;
+ this field is only present if ISLAST bit is set -- if
+ it is 1, then the meta-block and the brotli stream ends
+ at that bit, with any remaining bits in the last byte
+ of the compressed stream filled with zeros (if the
+ fill bits are not zero, then the stream should be
+ rejected as invalid)
+ 2 bits: MNIBBLES, # of nibbles to represent the uncompressed
+ length, encoded as follows: if set to 3, MNIBBLES is 0,
+ otherwise MNIBBLES is the value of this field plus 4.
+ If MNIBBLES is 0, the meta-block is empty, i.e. it does
+ not generate any uncompressed data. In this case, the
+ rest of the meta-block has the following format:

- 1 bit: reserved, must be zero
+ 1 bit: reserved, must be zero

- 2 bits: MSKIPBYTES, # of bytes to represent metadata
- length
+ 2 bits: MSKIPBYTES, # of bytes to represent metadata
+ length
@@ -1572,37 +1572,37 @@ Alakuijala & Szabadka Expires April 6, 2016 [Page 28]
 Internet-Draft Brotli October 2015

- MSKIPBYTES x 8 bits: MSKIPLEN - 1, where MSKIPLEN is
- the number of metadata bytes; this field is
- only present if MSKIPBYTES is positive,
- otherwise MSKIPLEN is 0 (if MSKIPBYTES is
- greater than 1, and the last byte is all
- zeros, then the stream should be rejected
- as invalid)
+ MSKIPBYTES x 8 bits: MSKIPLEN - 1, where MSKIPLEN is
+ the number of metadata bytes; this field is
+ only present if MSKIPBYTES is positive,
+ otherwise MSKIPLEN is 0 (if MSKIPBYTES is
+ greater than 1, and the last byte is all
+ zeros, then the stream should be rejected
+ as invalid)

- 0 - 7 bits: fill bits until the next byte boundary,
- must be all zeros
+ 0 - 7 bits: fill bits until the next byte boundary,
+ must be all zeros

- MSKIPLEN bytes of metadata, not part of the
- uncompressed data or the sliding window
+ MSKIPLEN bytes of metadata, not part of the
+
uncompressed data or the sliding window

- MNIBBLES x 4 bits: MLEN - 1, where MLEN is the length
- of the meta-block uncompressed data in bytes (if the
- number of nibbles is greater than 4, and the last
- nibble is all zeros, then the stream should be
- rejected as invalid)
+ MNIBBLES x 4 bits: MLEN - 1, where MLEN is the length
+ of the meta-block uncompressed data in bytes (if the
+ number of nibbles is greater than 4, and the last
+ nibble is all zeros, then the stream should be
+ rejected as invalid)

- 1 bit: ISUNCOMPRESSED, if set to 1, any bits of compressed
- data up to the next byte boundary are ignored, and
- the rest of the meta-block contains MLEN bytes of
- literal data; this field is only present if the
- ISLAST bit is not set (if the ignored bits are not
- all zeros, the stream should be rejected as invalid)
+ 1 bit: ISUNCOMPRESSED, if set to 1, any bits of compressed
+ data up to the next byte boundary are ignored, and
+ the rest of the meta-block contains MLEN bytes of
+ literal data; this field is only present if the
+ ISLAST bit is not set (if the ignored bits are not
+ all zeros, the stream should be rejected as invalid)

 1-11 bits: NBLTYPESL, # of literal block types, encoded with
- the following variable length code (as it appears in
- the compressed data, where the bits are parsed from
- right to left, so 0110111 has the value 12):
+ the following variable length code (as it appears in
+ the compressed data, where the bits are parsed from
+ right to left, so 0110111 has the value 12):

 Value Bit Pattern
 ----- -----------
@@ -1657,13 +1657,13 @@ Internet-Draft Brotli October 2015
 Block count code + Extra bits for first distance block
 count, only if NBLTYPESD >= 2

- 2 bits: NPOSTFIX, parameter used in the distance coding
+ 2 bits: NPOSTFIX, parameter used in the distance coding

- 4 bits: four most significant bits of NDIRECT, to get the
- actual value of the parameter NDIRECT, left-shift
- this four bit number by NPOSTFIX bits
+ 4 bits: four most
significant bits of NDIRECT, to get the
+ actual value of the parameter NDIRECT, left-shift
+ this four bit number by NPOSTFIX bits

- NBLTYPESL x 2 bits: context mode for each literal block type
+ NBLTYPESL x 2 bits: context mode for each literal block type

 1-11 bits: NTREESL, # of literal prefix trees, encoded with
 the same variable length code as NBLTYPESL
@@ -1687,11 +1687,11 @@ Internet-Draft Brotli October 2015
 appears only if NTREESD >= 2, otherwise the context
 map has only zero values

- NTREESL prefix codes for literals
+ NTREESL prefix codes for literals

- NBLTYPESI prefix codes for insert-and-copy lengths
+ NBLTYPESI prefix codes for insert-and-copy lengths

- NTREESD prefix codes for distances
+ NTREESD prefix codes for distances

 9.3. Format of the meta-block data

@@ -1727,8 +1727,8 @@ Internet-Draft Brotli October 2015
 described in Paragraph 7.3.

 Block type code for next distance block type, appears only
- if NBLTYPESD >= 2 and the previous distance block count
- is zero
+ if NBLTYPESD >= 2 and the previous distance block count
+ is zero

 Block count code + Extra bits for next distance block
 length, appears only if NBLTYPESD >= 2 and the previous
@@ -1831,7 +1831,7 @@ Internet-Draft Brotli October 2015
 save previous block type
 read block count using HTREE_BLEN_I and set BLEN_I
 decrement BLEN_I
- read insert and copy length, ILEN, CLEN with HTREEI[BTYPE_I]
+ read insert and copy length, ILEN, CLEN using HTREEI[BTYPE_I]
 loop for ILEN
 if BLEN_L is zero
 read block type using HTREE_BTYPE_L and set BTYPE_L
@@ -1862,9 +1862,9 @@ Internet-Draft Brotli October 2015
 read block count using HTREE_BLEN_D and set BLEN_D
 decrement BLEN_D
 compute context ID, CIDD from CLEN
- read distance code with HTREED[CMAPD[4 * BTYPE_D + CIDD]]
+ read distance code using HTREED[CMAPD[4 * BTYPE_D + CIDD]]
 compute distance by distance short code substitution
- move backwards distance bytes in the uncompressed data and
+ move backwards distance bytes in the uncompressed data and
 copy
CLEN bytes from this position to the uncompressed
 stream, or look up the static dictionary word, transform
 the word as directed, and copy the result to the
@@ -1942,7 +1942,7 @@ Internet-Draft Brotli October 2015
 available in the brotli open-source project:
 https://github.com/google/brotli

-15. Acknowledgements
+15. Acknowledgments

 The authors would like to thank Mark Adler for providing helpful
 review comments, validating the specification by writing an