Many of the field descriptions are copy-pastes of each other, but
differ slightly in word placement, capitalization, or other small
oddities. This commit makes them uniform, so that applying the
following search-and-replace to the passages below yields each variant:
s/NBLTYPESX/(NBLTYPESI|NBLTYPESL|NBLTYPESD)/g
s/CATEGORY/(insert-and-copy|literal|distance)/g
>>>
1-11 bits: NBLTYPESX, # of CATEGORY block types, encoded
with the same variable length code as above
Prefix code over the block type code alphabet for
CATEGORY block types, appears only if NBLTYPESX >= 2
Prefix code over the block count code alphabet for
CATEGORY block counts, appears only if NBLTYPESX >= 2
Block count code + Extra bits for first CATEGORY
block count, appears only if NBLTYPESX >= 2
<<<
>>>
Block type code for next CATEGORY block type, appears
only if NBLTYPESX >= 2 and the previous CATEGORY
block count is zero
Block count code + extra bits for next CATEGORY
block count, appears only if NBLTYPESX >= 2 and the
previous CATEGORY block count is zero
<<<
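The substitution above can be sketched in Python. This is only an
illustration: the helper name expand() is hypothetical, and pairing
NBLTYPESI with insert-and-copy, NBLTYPESL with literal, and NBLTYPESD
with distance is an assumption based on the order of the alternatives.

```python
# Hypothetical helper: expand one template passage into its three variants.
# The pairing of names to categories is an assumption (same order as the
# alternatives listed in the two substitutions).
SUBSTITUTIONS = [
    ("NBLTYPESI", "insert-and-copy"),
    ("NBLTYPESL", "literal"),
    ("NBLTYPESD", "distance"),
]


def expand(template: str) -> list[str]:
    """Return the template with both placeholders replaced, once per category."""
    return [
        template.replace("NBLTYPESX", name).replace("CATEGORY", category)
        for name, category in SUBSTITUTIONS
    ]


TEMPLATE = (
    "Block type code for next CATEGORY block type, appears\n"
    "only if NBLTYPESX >= 2 and the previous CATEGORY\n"
    "block count is zero"
)

for passage in expand(TEMPLATE):
    print(passage, end="\n\n")
```

If the field descriptions are uniform, each printed passage should match
the corresponding text in the document exactly.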
* Acknowledge that the context map is conceptually a two-dimensional
matrix indexed by two keys, but is actually stored as a
one-dimensional array.
* Mention that InverseMoveToFrontTransform will not cause the
context map to have invalid indexes. This assures decoder
implementers that they do not have to go through the context map
again and check that all values are less than NTREES.
* The phrase "difference between these distances" can either refer to
the conceptual difference (i.e. they have different semantic meanings)
or to the mathematical difference (i.e. use subtraction on the two).
Instead, just remove the sentence since the equations below make it
clear what we're supposed to do here.
* This value is useful in implementing the decoder since we can know
ahead-of-time what size buffer is needed to contain the output of a
transformed word.
* Rather than say "lower 3 bits" in one sentence and "bits 3-5" in
the sentence right after, just consistently use the same convention
and say "0-2" and "3-5".
* Provide an exhaustive list of all the sources the last two bytes
can come from.
* Also make a clear connection in this section that there are only 64
context IDs for literals. This is important for the indexing math
in context maps to make sense.
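The indexing math behind the context map items above can be sketched as
follows. This is a minimal illustration, not the document's wording: the
function names are hypothetical, and the row-major layout (block type
first, then context ID) is an assumption. The 64 literal context IDs come
from the item above.

```python
LITERAL_CONTEXT_IDS = 64  # number of context IDs for literals


def context_map_index(block_type: int, context_id: int,
                      ids_per_block_type: int = LITERAL_CONTEXT_IDS) -> int:
    """Flatten the (block type, context ID) key pair into one array index.

    Assumes a row-major layout: all context IDs of block type 0 come
    first, then all context IDs of block type 1, and so on.
    """
    if not 0 <= context_id < ids_per_block_type:
        raise ValueError("context ID out of range")
    if block_type < 0:
        raise ValueError("block type must be non-negative")
    return block_type * ids_per_block_type + context_id


def lookup_tree(context_map: list[int], block_type: int, context_id: int,
                ids_per_block_type: int = LITERAL_CONTEXT_IDS) -> int:
    """Return the prefix tree index stored for the given key pair."""
    return context_map[context_map_index(block_type, context_id,
                                         ids_per_block_type)]


# Two literal block types: the first maps every context to tree 0,
# the second maps every context to tree 1.
example_map = [0] * 64 + [1] * 64
print(lookup_tree(example_map, 1, 63))
```

Because every stored value is already less than NTREES, lookup_tree()
needs no extra range check on its result.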
If bit-orderings are to be parsed from left-to-right,
then make the bit-strings left-justified.
If bit-orderings are to be parsed from right-to-left,
then make the bit-strings right-justified.
Section 3.1, which describes how prefix codes work,
shows prefix codes that are "left-to-right", which
is better for demonstrating how they work. However,
most of the rest of the document uses a "right-to-left"
convention. We should distinctly say at the end of
section 3.1 that we are switching conventions.
Thus, change the prefix code in section 3.5 to be
"right-to-left" to be consistent with sections 9.1
and 9.2.
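As an illustration of the "right-to-left" convention, here is a minimal
bit reader sketch. It assumes that "right-to-left" means the first bit
read is the least-significant one and is written rightmost in the
bit-string; the BitReader class is hypothetical and not part of the
document.

```python
class BitReader:
    """Reads bits from a byte string, least-significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # index of the next unread bit

    def read_bits(self, n: int) -> int:
        """Read n bits; the first bit read becomes the lowest bit of the result."""
        value = 0
        for i in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


reader = BitReader(bytes([0b10110100]))
first = reader.read_bits(3)
# Printed right-justified, the rightmost character is the first bit read.
print(format(first, "03b"))  # prints "100"
```

Under this assumption, a right-justified bit-string lines up with the
way the value is printed, which is the reason for justifying strings
according to their parse direction.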
Also, change the variable names in section 7.3 to
be consistent with those used in section 10.
Also, change the description of MNIBBLES to be
"MNIBBLES - 4", similar to the convention of saying
"MLEN - 1". Previously, the phrase
"If MNIBBLES is 0, then ..." left it unclear whether it
meant MNIBBLES before the "plus 4" or after.
Fixed minor whitespace issues that made the printout slightly
confusing. The biggest change is in section 9.2, where an indent seemed
to indicate that some fields were part of the previous field when
they were unrelated.
Also, changed the order in which transforms are described in section 8
to match the enumeration values that are explicitly defined in
Appendix B.