54 Commits

Author SHA1 Message Date
Joe Tsai
3ab9853648 Fix grammar in Section 2.
s/copy length determine /copy length determines /g
2015-11-01 23:00:07 -08:00
Joe Tsai
e57dbc0f5d Minor capitalization fix 2015-11-01 18:23:20 -08:00
Joe Tsai
5c869c9de2 Clarify simple and complex prefix codes
* At the beginning of the simple prefix code section, telling us that "a value
of 1 indicates the number of leading zeros" is not very helpful. Instead, it
should indicate that it means a complex prefix code and point the reader to
the relevant section (which repeats this information in more detail)

* Clearly indicate that reusing a value is an error! This seems to be the
behavior of the reference implementation.

* Clarify what the termination conditions are while reading the prefix codes.
Also, indicate that it is an error if the prefix tree is over-subscribed or
under-subscribed.

* Clearly state the maximum number of individual symbols that may be
read. This forbids a stream that continually says that the symbols have
zero length.
2015-11-01 17:01:38 -08:00
Joe Tsai
c5b6b5c7c1 Minor formatting changes
* In the description about "three categories", explicitly number them instead
of using a giant paragraph that is harder to follow.

* Switch lists of items to consistently use American style commas. American
style lists are clearer. Consider the following:
	-Each category of value (insert and copy lengths, literals and distances)
	+Each category of value (insert and copy lengths, literals, and distances)

* Make sure not to break a hyphenated phrase with a newline. When the nroff
file is processed, "insert-\nand-copy" becomes "insert- and-copy", making it
inconsistent with other uses of the hyphenated phrase.

* Consistently use the same hyphenated phrase if referred to as a single unit.
	"insert and copy"   -> "insert-and-copy"
	"least significant" -> "least-significant"
	"most significant"  -> "most-significant"
	"fixed length"      -> "fixed-length"
	"block switch"      -> "block-switch".

* Consistently use "indexes" instead of "indices"
2015-11-01 16:50:13 -08:00
Joe Tsai
166edb0287 Minor formatting of Section 9.2. and Section 9.3.
Many of the fields are copy-pastes of each other, but differ slightly
in placement of words, capitalization, or other random
oddities. This commit makes it so that if you simply do a search and
replace on the following passages, you get the same thing:

s/NBLTYPESX/(NBLTYPESI|NBLTYPESL|NBLTYPESD)/g
s/CATEGORY/(insert-and-copy|literal|distance)/g

>>>
   1-11 bits: NBLTYPESX, # of CATEGORY block types, encoded
              with the same variable length code as above

      Prefix code over the block type code alphabet for
         CATEGORY block types, appears only if NBLTYPESX >= 2

      Prefix code over the block count code alphabet for
         CATEGORY block counts, appears only if NBLTYPESX >= 2

      Block count code + Extra bits for first CATEGORY
         block count, appears only if NBLTYPESX >= 2
<<<

>>>
      Block type code for next CATEGORY block type, appears
         only if NBLTYPESX >= 2 and the previous CATEGORY
         block count is zero

      Block count code + extra bits for next CATEGORY
         block count, appears only if NBLTYPESX >= 2 and the
         previous CATEGORY block count is zero
<<<
2015-11-01 16:28:25 -08:00
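The search-and-replace idea in commit 166edb0287 can be sketched as follows. The pairing of each NBLTYPES variable with a category name is inferred from the order of the two substitution patterns and is an assumption, as are the names `TEMPLATE`, `CATEGORIES`, and `expand`.

```python
# One line of the shared template, copied from the first passage above.
TEMPLATE = (
    "1-11 bits: NBLTYPESX, # of CATEGORY block types, encoded\n"
    "           with the same variable length code as above\n"
)

# Assumed pairing, following the order used in the two substitutions.
CATEGORIES = [
    ("NBLTYPESI", "insert-and-copy"),
    ("NBLTYPESL", "literal"),
    ("NBLTYPESD", "distance"),
]


def expand(template):
    """Return one concrete passage per category from the shared template."""
    return [
        template.replace("NBLTYPESX", name).replace("CATEGORY", category)
        for name, category in CATEGORIES
    ]


if __name__ == "__main__":
    for passage in expand(TEMPLATE):
        print(passage)
```

If the three passages in the specification are consistent, each one should equal the corresponding expansion of the template.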
Joe Tsai
542a8b776e Clarify Section 7.3
* Acknowledge the fact that the context map is conceptually really a
two-dimensional matrix with 2 different keys, but in reality stored
as a one-dimensional array.

* Mention that InverseMoveToFrontTransform will not cause the
context map to have invalid indexes. This assures someone implementing
a decoder that they do not have to go through the context
map again and check that all values are less than NTREES.
2015-10-29 09:50:19 -07:00
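The claim in commit 542a8b776e about InverseMoveToFrontTransform can be checked with a short sketch. This is a generic inverse move-to-front over a 256-entry list written for illustration, not the specification's own code. The reasoning: the list starts as the identity, and reading index i only rearranges positions 0 through i, so if every input value is below NTREES, positions NTREES and above are never touched and every output value stays below NTREES.

```python
def inverse_move_to_front(values):
    """Apply an inverse move-to-front transform to a list of byte values."""
    mtf = list(range(256))        # identity list: mtf[i] == i
    out = []
    for index in values:
        value = mtf.pop(index)    # take the entry at the given position
        mtf.insert(0, value)      # and move it to the front
        out.append(value)
    return out


if __name__ == "__main__":
    ntrees = 3                    # hypothetical NTREES for the example
    encoded = [1, 1, 0, 2]        # every value is below ntrees
    decoded = inverse_move_to_front(encoded)
    assert all(v < ntrees for v in decoded)
    print(decoded)
```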
Joe Tsai
ff3897df2d Clarify Section 8.
* The phrase "difference between these distances" can either refer to
the conceptual difference (i.e. they have different semantic meaning)
or to the mathematical difference (i.e. use subtraction for the two).
Instead, just remove the sentence since the equations below make it
clear what we're supposed to do here.
2015-10-29 09:44:23 -07:00
Joe Tsai
2ffe45bd67 Clarify Section 4.
* If NDIRECT is zero, then the paragraph reads "from 16 to 15", which
doesn't make much sense. Thus, add a conditional to avoid this minor
oddity.
2015-10-29 09:42:00 -07:00
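The oddity addressed in commit 2ffe45bd67 is easy to see in code. The sketch below assumes that the direct distance symbols run from 16 to 15 + NDIRECT, as the quoted "from 16 to 15" wording suggests; the function name is hypothetical.

```python
def direct_distance_symbols(ndirect):
    """Return the range of direct distance symbols, 16 .. 15 + ndirect."""
    if ndirect < 0:
        raise ValueError("NDIRECT must be non-negative")
    # A half-open range makes the NDIRECT == 0 case naturally empty.
    return range(16, 16 + ndirect)


if __name__ == "__main__":
    print(list(direct_distance_symbols(0)))   # empty: no direct symbols
    print(list(direct_distance_symbols(3)))
```

Written as an inclusive interval, NDIRECT = 0 gives "16 to 15", which is why the prose needs the conditional; the half-open range avoids the special case.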
Joe Tsai
185cb9eada Define the maximum number of bytes transforms may add to a word
* This value is useful in implementing the decoder since we can know
ahead-of-time what size buffer is needed to contain the output of a
transformed word.
2015-10-29 09:40:41 -07:00
Joe Tsai
6d2575eab3 Use consistent bit convention in Section 5.
* Rather than say "lower 3 bits" in one sentence and "bits 3-5" in
the sentence right after, just consistently use the same convention
and say "0-2" and "3-5".
2015-10-29 09:39:06 -07:00
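The "0-2" and "3-5" convention adopted in commit 6d2575eab3 maps directly onto a single helper. This is a generic sketch with bit 0 as the least-significant bit; the helper name and the sample value are illustrative and not taken from the specification.

```python
def bit_field(value, low, high):
    """Return bits low..high of value (inclusive, bit 0 = least significant)."""
    if not 0 <= low <= high:
        raise ValueError("expected 0 <= low <= high")
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


if __name__ == "__main__":
    code = 0b101110
    lower = bit_field(code, 0, 2)   # bits 0-2, i.e. the "lower 3 bits"
    upper = bit_field(code, 3, 5)   # bits 3-5
    print(lower, upper)
```

Stating both fields as inclusive ranges lets the same call express either one, which is the consistency the commit is after.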
Joe Tsai
0e4cb52a8b Clarify Section 7.1.
* Provide an exhaustive list of all the sources the last two bytes can
come from.

* Also make a clear connection in this section that there are only 64
context IDs for literals. This is important for the indexing math
in context maps to make sense.
2015-10-29 08:32:11 -07:00
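The indexing math referred to in commit 0e4cb52a8b, together with the two-dimensional view of the context map from commit 542a8b776e, can be sketched as below. The row-major layout (block type selects the row, context ID selects the column) and all names are assumptions made for illustration.

```python
LITERAL_CONTEXT_IDS = 64  # number of context IDs for literals, per the commit


def literal_tree_index(context_map, block_type, context_id):
    """Look up a prefix tree index in a flat literal context map.

    Conceptually the map is a matrix indexed by (block_type, context_id);
    here it is stored as a one-dimensional list in row-major order.
    """
    if not 0 <= context_id < LITERAL_CONTEXT_IDS:
        raise ValueError("context ID out of range")
    return context_map[block_type * LITERAL_CONTEXT_IDS + context_id]


if __name__ == "__main__":
    # Two hypothetical block types: type 0 uses tree 0, type 1 uses tree 1.
    cmap = [0] * LITERAL_CONTEXT_IDS + [1] * LITERAL_CONTEXT_IDS
    print(literal_tree_index(cmap, 1, 5))
```

The fixed row width of 64 is what makes the flat index unambiguous, which is why the commit ties the count of context IDs to the context map.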
szabadka
8523d36e69 Merge pull request #242 from ende76/spec_suggest_block_switch
Added note about invalid block type value in block switch commands
2015-10-27 12:19:53 +01:00
Ende
11286539a5 Removed previous change, fixed typo NBLTYPES -> NBLTYPESL #242 2015-10-27 07:07:32 -04:00
Ende
d1cd34f6d9 Moved note about invalid distances up to 0-15 section 2015-10-27 06:35:30 -04:00
Ende
e544a185f9 Added note about invalid block type value 2015-10-26 16:22:28 -04:00
Ende
a05fa62501 Added note about invalid distance values 2015-10-26 16:00:20 -04:00
Zoltan Szabadka
ae04a34ce5 Generate new .txt version of the spec.
Based on the changes in the .nroff source in PR #231.
2015-10-26 12:06:29 +01:00
szabadka
816153cc79 Merge pull request #231 from dsnet/master
Use consistent bit ordering and variable names
2015-10-26 12:02:36 +01:00
Joe Tsai
ec8756d79c Remove note at end of section 3.1 about switching prefix conventions 2015-10-23 15:38:45 -07:00
Joe Tsai
0a9f65aadc s/static prefix code/variable length code/g 2015-10-22 09:13:59 -07:00
Joe Tsai
efeb59c4a2 Placed explicit bit pattern table for MNIBBLES to avoid any doubts 2015-10-22 09:11:04 -07:00
Joe Tsai
c996c06e8d Use consistent bit ordering and variable names
If bit-orderings are to be parsed from left-to-right,
then make the bit-strings left-justified.
If bit-orderings are to be parsed from right-to-left,
then make the bit-strings right-justified.

Section 3.1, which describes how prefix codes work,
shows prefix codes that are "left-to-right", which
is better for demonstrating how they work. However,
most of the rest of the document uses a "right-to-left"
convention. We should distinctly say at the end of
section 3.1 that we are switching conventions.

Thus, change the prefix code in section 3.5 to be
"right-to-left" to be consistent with sections 9.1
and 9.2.

Also, change the variable names in section 7.3 to
be consistent with those used in section 10.

Also, change the description of MNIBBLES to be
"MNIBBLES - 4", similar to the convention of saying
"MLEN - 1". Beforehand, the phrase
"If MNIBBLES is 0, then ..." was unclear whether it
meant MNIBBLES before the "plus 4" or after.
2015-10-20 14:54:51 -07:00
Zoltan Szabadka
676bc91cc4 Generate new .txt version of the spec.
Based on the changes in the .nroff source in PR #229
2015-10-20 12:27:09 +02:00
Joe Tsai
4f1fce1681 Make code and paragraph both use 3-space indents 2015-10-20 03:02:55 -07:00
Joe Tsai
f908a4ebe4 Fix spelling of "Acknowledgments"
Made tab-space of code snippet to be 3-space instead of 2-space
2015-10-20 02:43:25 -07:00
Joe Tsai
fa1c60e35d Addressed comments about whitespace 2015-10-20 02:39:09 -07:00
Joe Tsai
1486df764e Fixed minor whitespace formatting and ordering of elements
Fixed minor whitespace issues that caused print-out to be slightly
confusing. Biggest change is in section 9.2, where an indent seemed
to indicate that some fields were part of the previous field, when
they were not related.

Also, changed the order that transforms are described in section 8
to match the enumeration values that are explicitly defined in
Appendix B.
2015-10-19 13:53:24 -07:00
Zoltan Szabadka
e92afe0737 Add a summary table of alphabet sizes to the spec.
Based on a suggestion from Thomas Pickert.
2015-10-19 14:10:58 +02:00
Zoltan Szabadka
2c3d8eaea3 Change the title and the expiration date of the -07 draft. 2015-10-19 12:16:00 +02:00
Zoltan Szabadka
9bc4008fa2 Create -07 version of the draft. 2015-10-19 12:15:05 +02:00
Zoltan Szabadka
c4f439dbe6 Change the content encoding type from "bro" to "br". 2015-10-06 16:54:04 +02:00
Zoltan Szabadka
4d7de651a2 Fix the introduction part of the specification.
- window bits can be 10 to 24
- meta block can have 0 length
2015-10-06 15:22:18 +02:00
Zoltan Szabadka
534072ad42 Add brotli comparison study to the docs. 2015-10-02 14:40:56 +02:00
Zoltan Szabadka
100a2382b1 Update the spec with IANA Considerations. 2015-10-02 13:08:43 +02:00
Zoltan Szabadka
2faed4abaa Create -06 version of the spec. 2015-10-02 13:08:10 +02:00
Zoltan Szabadka
cacd294e87 Change the expiration date and title of the -05 draft. 2015-09-21 13:29:47 +02:00
Zoltan Szabadka
d1341bdd4c Create -05 version of the draft. 2015-09-21 13:27:22 +02:00
Zoltan Szabadka
e9edf7ebc5 Fix typo in the specification. 2015-09-21 13:26:17 +02:00
Zoltan Szabadka
075b3ad5fb Clarifications to the spec regarding when the stream should be rejected as invalid.
Based on Mark Adler's review findings.
2015-09-15 15:35:48 +02:00
Zoltan Szabadka
ea35936816 Change the expiration date and title of the -04 draft. 2015-05-11 17:04:13 +02:00
Zoltan Szabadka
14ea2b5805 Create -04 version of the draft. 2015-05-11 17:03:35 +02:00
Zoltan Szabadka
78350a9135 Add an Acknowledgements section to the spec. 2015-05-07 20:10:22 +02:00
Zoltan Szabadka
54f69c9ef7 Support window bits 10 - 15 in the decoder.
The previous window bit value 17 is used to
extend the range, since it has not been used
in any previous encoders.
2015-05-07 17:44:33 +02:00
Zoltan Szabadka
94bc27d87a Fix the year on the copyright message. 2015-04-27 18:25:59 +02:00
Zoltan Szabadka
fd4a048171 Change the expiration date and title of the -03 draft. 2015-04-27 18:12:09 +02:00
Zoltan Szabadka
98bd88413a Create -03 version of the internet draft. 2015-04-27 17:52:21 +02:00
Zoltan Szabadka
2d8b2ec12b Support empty meta-blocks with optional ignored metadata.
This is a partially backward incompatible format change,
that makes previously valid brotli streams that contain
larger than 16MB meta-blocks invalid.

The impact of this should be minimal, since the 'bro'
command-line tool does not create larger than 2MB
meta-blocks, so the only streams this change could
break are those created by a custom brotli encoder.

This commit contains only the specification update,
implementation in the decoder and encoder will
follow in later commits.
2015-04-22 12:41:57 +02:00
Zoltan Szabadka
5b80ef0fd1 Change the specification to be less strict in some cases.
In the following three cases we allow more choices
for the compressor, which can potentially lead to
fewer compressed bits.

  (1) Allow brotli streams where the block counts
      do not count down to exactly zero at the end
      of the meta-block. This makes it possible
      for compressors to sometimes choose a block
      count which can be represented with less bits
      than the exact block count.

  (2) Remove the restriction that prefix code
      descriptions with exactly one non-zero
      length symbol in the code length alphabet
      must have 1 bit depth. This is because
      bit depth 1 requires the most bits to encode.

  (3) Allow any copy length value in the last
      command where the copy part is ignored.
      This makes it possible for a compressor
      to choose a copy length which can be
      represented with the least amount of bits.

In addition to the changes above, this commit also
has a wording clarification in the overview section
where the use of the 'context ID' expression is
changed to be consistent with the rest of the
specification, i.e. that it is a function of the
last two literals or the copy length.
2015-04-22 12:08:16 +02:00
Zoltan Szabadka
206d067c4a Use consistent sentence spacing in the specification.
All sentence spacing was changed to one space, except
in the boilerplate which must be preserved verbatim.
2015-04-22 11:55:29 +02:00
Zoltan Szabadka
e9fd1a4f20 Add Mark Adler's edits to the specification.
The specification source is changed in this commit
to exactly mirror the specification edited by Mark Adler:

https://github.com/madler/brotli/blob/master/brotli-02-edit.nroff
(version 70e53d7)
2015-04-22 11:33:38 +02:00