With this commit, the encoder will skip some
compression optimization steps for quality <= 4,
which results in faster compression but larger
compressed output.
Enabled for quality >= 4, provided no obvious
UTF-8 violations are detected.
For each block, we gather two separate histograms, one
for continuation bytes and one for ASCII or lead bytes.
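
The split into two histograms can be sketched as follows. This is
only an illustration, not the encoder's code: the struct and function
names (Utf8Histograms, IsContinuationByte, BuildUtf8Histograms) are
hypothetical, and the only fact relied on is that a UTF-8 continuation
byte has the bit pattern 10xxxxxx.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical container for the two per-block histograms.
struct Utf8Histograms {
  std::array<uint32_t, 256> lead_or_ascii{};  // bytes 0x00-0x7F and 0xC0-0xFF
  std::array<uint32_t, 256> continuation{};   // bytes 0x80-0xBF
};

// A UTF-8 continuation byte has its top two bits set to 10.
inline bool IsContinuationByte(uint8_t b) { return (b & 0xC0) == 0x80; }

// Counts every byte of the block in exactly one of the two histograms.
inline Utf8Histograms BuildUtf8Histograms(const uint8_t* data, size_t len) {
  Utf8Histograms h;
  for (size_t i = 0; i < len; ++i) {
    if (IsContinuationByte(data[i])) {
      ++h.continuation[data[i]];
    } else {
      ++h.lead_or_ascii[data[i]];
    }
  }
  return h;
}
```

For the three bytes 0x61 0xC3 0xA9 ("a" followed by a two-byte
sequence), the first two land in lead_or_ascii and the last one in
continuation.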
* Change order of members of bit reader state structure.
* Remove unused includes for assert. Add the BROTLI_DCHECK
macro and use it instead of assert.
* Do not calculate nbits in common case of ReadSymbol.
* Introduce and use PREDICT_TRUE / PREDICT_FALSE macros.
* Allocate less memory in the brotli decoder if it knows
the result size beforehand. Before this, the decoder
would always allocate 16MB if the encoder annotated the
window size as 22 bit (which is the default), even if the
file is only a few KB uncompressed. Now, it'll only
allocate a ringbuffer as large as needed for the result file.
This only works when the decoder can determine the file
size, which is not possible if there are multiple metablocks
or if an uncompressed metablock is too large.
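
The ring-buffer sizing idea from the last item can be sketched as
below. This is a hedged illustration under stated assumptions, not the
decoder's implementation: ChooseRingBufferSize is a hypothetical
helper, a known_size of 0 is assumed to mean "size unknown", and any
extra slack a real decoder allocates on top of the ring buffer is
ignored.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the decoder API.
// window_bits: log2 of the window size annotated by the encoder.
// known_size:  uncompressed result size, or 0 if it cannot be known
//              (for example, when there are multiple metablocks).
inline size_t ChooseRingBufferSize(int window_bits, size_t known_size) {
  const size_t max_size = static_cast<size_t>(1) << window_bits;
  if (known_size == 0) return max_size;  // fall back to the full window
  // Smallest power of two that holds the result, capped at the window.
  size_t size = 1;
  while (size < known_size && size < max_size) size <<= 1;
  return size;
}
```

With a 22-bit window and a 5000-byte result, this sketch returns 8192
instead of the full window size.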
Add a BrotliCompress() method to the public encoder API
that uses the BrotliIn and BrotliOut classes and use
that in the 'bro' command-line tool.
Use the streaming api in BrotliCompressBuffer() and
BrotliCompressor::WriteMetaBlock().
Use the appropriate hashers for quality <= 9.
Disable all slow features for quality <= 9 (literal cost modeling,
dictionary, context modeling, advanced block splitting).
Change vector<Command> arguments of internal functions
to Command* and size_t.
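
As an illustration of the signature change, here is a minimal sketch.
The Command struct below is a simplified stand-in and TotalLength is a
hypothetical function; neither is taken from the encoder sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the encoder's Command type (assumption).
struct Command {
  uint32_t insert_len;
  uint32_t copy_len;
};

// Before (vector-based):
//   size_t TotalLength(const std::vector<Command>& cmds);
// After (pointer plus count), which also accepts a sub-range of a
// larger array without copying it into a new vector:
inline size_t TotalLength(const Command* cmds, size_t num_commands) {
  size_t total = 0;
  for (size_t i = 0; i < num_commands; ++i) {
    total += cmds[i].insert_len + cmds[i].copy_len;
  }
  return total;
}
```

Callers that still hold a std::vector<Command> v can pass v.data()
and v.size().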
- Reject brotli streams where the number of
nibbles is too large for the size of the
meta-block
- Reject brotli streams where the padding
bits after a meta-block are not all zero
- Reject brotli streams where the symbol
in the simple prefix code is not in
the symbol alphabet
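
The padding-bit check can be sketched as follows. This is an
assumption-laden illustration rather than the decoder's code:
PaddingBitsAreZero is a hypothetical helper, and it assumes bits are
consumed from each byte starting at the least significant bit.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the decoder API.
// byte:    the byte in which the meta-block ends.
// bit_pos: number of bits of `byte` already consumed (0..7),
//          counted from the least significant bit.
// Returns true if all remaining bits up to the byte boundary are zero.
inline bool PaddingBitsAreZero(uint8_t byte, unsigned bit_pos) {
  if (bit_pos == 0) return true;  // already byte-aligned, no padding
  return (byte >> bit_pos) == 0;
}
```

A stream for which this returns false would be rejected under the
second rule above.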
This is a partially backward-incompatible format change
that makes previously valid brotli streams containing
meta-blocks larger than 16MB invalid.
The impact of this should be minimal, since the 'bro'
command-line tool does not create meta-blocks larger
than 2MB, so the only streams this change could
break are those created by a custom brotli encoder.
This commit contains only the specification update;
the implementation in the decoder and encoder will
follow in later commits.
In the following three cases we allow more choices
for the compressor, which can potentially lead to
fewer compressed bits.
(1) Allow brotli streams where the block counts
do not count down to exactly zero at the end
of the meta-block. This makes it possible
for compressors to sometimes choose a block
count which can be represented with fewer bits
than the exact block count.
(2) Remove the restriction that prefix code
descriptions with exactly one non-zero
length symbol in the code length alphabet
must have 1 bit depth. This is because
bit depth 1 requires the most bits to encode.
(3) Allow any copy length value in the last
command where the copy part is ignored.
This makes it possible for a compressor
to choose a copy length which can be
represented with the fewest bits.
In addition to the changes above, this commit also
has a wording clarification in the overview section
where the use of the 'context ID' expression is
changed to be consistent with the rest of the
specification, i.e. that it is a function of the
last two literals or the copy length.