Commit Graph

458 Commits

Author SHA1 Message Date
Zoltan Szabadka
95ddb48a11 Fix some VS compilation errors in the encoder.
- Use std::numeric_limits<double>::infinity() instead of 1.0 / 0.0
- Use FastLog2() instead of log2() in cost model
2015-06-29 14:20:25 +02:00
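The first item of the commit above swaps a division by a literal zero for the standard library's infinity constant. A minimal sketch of that pattern follows; `InfiniteCost` and `MinCost` are hypothetical names for illustration and are not taken from the encoder. The assumption is that the encoder used infinity as a "no cost known yet" sentinel, which the commit message suggests but does not spell out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Portable "no cost known yet" sentinel. Writing 1.0 / 0.0 instead is a
// constant division by zero, which some compilers refuse to build.
inline double InfiniteCost() {
  return std::numeric_limits<double>::infinity();
}

// Hypothetical helper: smallest entry of costs[0..n), or infinity when n == 0.
inline double MinCost(const double* costs, std::size_t n) {
  double best = InfiniteCost();
  for (std::size_t i = 0; i < n; ++i) {
    if (costs[i] < best) best = costs[i];
  }
  return best;
}
```

Any finite cost compares smaller than the sentinel, so the first candidate always replaces it.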
lvandeve
9fa3016e66 Merge pull request #121 from lvandeve/master
Brotli Bug Fixes
2015-06-26 17:38:49 +02:00
Lode Vandevenne
bad0f4edf1 Brotli Bug Fixes 2015-06-26 17:37:00 +02:00
szabadka
67d26e0d46 Merge pull request #119 from szabadka/master
Deprecate greedy_block_split and enable_context_modeling brotli params.
2015-06-12 16:55:09 +02:00
Zoltan Szabadka
618287b373 Deprecate greedy_block_split and enable_context_modeling brotli params.
These affected only quality 11, and now it does not make sense
to disable block splitting or context modeling because most of
the time is spent in zopfli anyway.

Now all speed vs size compromises are controlled by the quality param.
2015-06-12 16:50:49 +02:00
szabadka
981ca6e561 Merge pull request #118 from szabadka/master
Use a static hash table to look up dictionary words and transforms.
2015-06-12 16:47:56 +02:00
Zoltan Szabadka
66098830a2 Use a static hash table to look up dictionary words and transforms.
This is used for quality 11; for qualities <= 9 we already
have a simpler hash table.

The static data size is 252 kB, and this removes the
need to initialize a huge hash map at startup, which was
the reason why transforms had to be disabled by default.
In comparison, the static dictionary itself is 120 kB.
This supports every transform except kOmitFirstN.
2015-06-12 16:45:17 +02:00
szabadka
0fd2df4f4d Merge pull request #117 from szabadka/master
Add "zopfli"-style backward reference search to brotli.
2015-06-12 16:37:17 +02:00
Zoltan Szabadka
b3d3723f62 Add "zopfli"-style backward reference search to brotli.
This commit adapts the backward reference search algorithm
from the zopfli project (see https://github.com/google/zopfli)
to brotli.

This slower backward reference search is run only in quality 11
and it runs two iterations of entropy cost modeling and
shortest path search.

As a result, the original backward reference search function can
be simplified a bit, since we can remove some heuristics that were
replaced with the zopfli-style search.
2015-06-12 16:25:41 +02:00
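The "shortest path search" mentioned in the commit above can be pictured as a dynamic program over input positions: each position is a node, a literal is an edge to the next position, and a backward reference is an edge that skips ahead by its length. The sketch below is a simplified illustration under stated assumptions, not the encoder's code. `Match` and `ShortestPathCost` are hypothetical names, and the costs are fixed numbers passed in by the caller, whereas the commit describes two iterations that re-estimate costs from an entropy model.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical candidate: copy `length` bytes at position `pos` for `cost` bits.
struct Match {
  std::size_t pos;
  std::size_t length;
  double cost;
};

// Minimum number of bits needed to cover n input bytes, where each byte is
// either emitted as a literal (literal_cost bits) or covered by a match.
double ShortestPathCost(std::size_t n, double literal_cost,
                        const std::vector<Match>& matches) {
  std::vector<double> best(n + 1, std::numeric_limits<double>::infinity());
  best[0] = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    // Edge i -> i + 1: emit one literal.
    if (best[i] + literal_cost < best[i + 1]) {
      best[i + 1] = best[i] + literal_cost;
    }
    // Edges i -> i + length: take a match that starts at this position.
    for (const Match& m : matches) {
      if (m.pos != i || m.length == 0 || i + m.length > n) continue;
      if (best[i] + m.cost < best[i + m.length]) {
        best[i + m.length] = best[i] + m.cost;
      }
    }
  }
  return best[n];
}
```

With 10 bytes at 8 bits per literal and one 6-byte match at position 2 costing 20 bits, the cheapest path is two literals, the match, and two more literals. A match that costs more than the literals it replaces is simply never chosen.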
szabadka
94b337699f Merge pull request #116 from szabadka/master
Change the static dictionary hash table to take into account word frequency when there are hash collisions.
2015-06-12 16:16:04 +02:00
Zoltan Szabadka
835a77469e Change the static dictionary hash table to take into
account word frequency when there are hash collisions.
2015-06-12 16:14:06 +02:00
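The commit above does not say how word frequency is used when two dictionary words hash to the same slot. One plausible reading, shown here purely as an assumption, is that the more frequent word keeps the slot. `FrequencyAwareTable` and its members are hypothetical names, and the table is a plain one-entry-per-bucket design chosen to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// One word per bucket; on a collision the more frequent word wins.
// Assumption: this mirrors the idea in the commit title, not its code.
class FrequencyAwareTable {
 public:
  // `buckets` must be greater than zero.
  explicit FrequencyAwareTable(std::size_t buckets) : slots_(buckets) {}

  void Insert(const std::string& word, std::uint32_t frequency) {
    Slot& slot = slots_[Bucket(word)];
    if (!slot.used || frequency > slot.frequency) {
      slot.word = word;
      slot.frequency = frequency;
      slot.used = true;
    }
  }

  bool Contains(const std::string& word) const {
    const Slot& slot = slots_[Bucket(word)];
    return slot.used && slot.word == word;
  }

 private:
  struct Slot {
    std::string word;
    std::uint32_t frequency = 0;
    bool used = false;
  };

  std::size_t Bucket(const std::string& word) const {
    return std::hash<std::string>()(word) % slots_.size();
  }

  std::vector<Slot> slots_;
};
```

The trade-off is that rare words can become unfindable, which is acceptable when the table only serves as a match-finding heuristic and a miss costs compression ratio rather than correctness.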
szabadka
a0d0ecfead Merge pull request #115 from szabadka/master
Bug fixes for the brotli encoder.
2015-06-12 16:12:38 +02:00
Zoltan Szabadka
65f3fc55f5 Bug fixes for the brotli encoder.
* Fix an out-of-bounds access to depth_histo in the
  bit cost calculation function.

* Change type of distance symbol to uint16_t in block
  splitter, because if all postfix bits are used, there
  can be 520 distance symbols.

* Save the distance cache between meta-blocks at the
  correct place. This fixes a roundtrip failure that
  can occur when there is an uncompressed metablock
  between two compressed metablocks.

* Fix a bug when setting lgwin to 24 in the encoder parameters.
  It ended up making metablocks larger than 24 bits in size.

* Fix out-of-bounds memory accesses in parallel encoder.
  CreateBackwardReferences can read up to 4 bytes past end of
  input if the end of input is before mask.

* Add missing header for memcpy() in port.h
2015-06-12 16:11:50 +02:00
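The figure of 520 distance symbols in the second item above can be reproduced with a short calculation. The sketch assumes the distance alphabet size formula from the Brotli format specification, 16 + NDIRECT + (48 << NPOSTFIX), with the maximum values NPOSTFIX = 3 and NDIRECT = 120. The commit does not state the previous symbol type, so the 8-bit comparison is an assumption, and `DistanceAlphabetSize` and `SymbolsFitIn` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Distance alphabet size, assuming the formula from the Brotli format
// specification: 16 + NDIRECT + (48 << NPOSTFIX).
constexpr std::uint32_t DistanceAlphabetSize(std::uint32_t npostfix,
                                             std::uint32_t ndirect) {
  return 16u + ndirect + (48u << npostfix);
}

// True when symbol indices 0 .. num_symbols - 1 all fit in type T.
template <typename T>
constexpr bool SymbolsFitIn(std::uint32_t num_symbols) {
  return num_symbols == 0 ||
         num_symbols - 1 <=
             static_cast<std::uint32_t>(std::numeric_limits<T>::max());
}

static_assert(DistanceAlphabetSize(3, 120) == 520,
              "maximum postfix bits and direct codes give 520 symbols");
```

Under these assumptions an 8-bit symbol type tops out at index 255, so the largest alphabet needs at least a 16-bit type.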
szabadka
a13ea018f5 Merge pull request #114 from szabadka/master
Brotli custom LZ77 dictionary support.
2015-06-12 15:45:41 +02:00
Zoltan Szabadka
b43df8f699 Brotli custom LZ77 dictionary support.
Adds functions to prepend such a dictionary to the
encoder and decoder, and adjusts their internal
parameters so that it is treated as an earlier part of
the input. This dictionary is just a prefilled LZ77
window; it is not related to the built-in transformable
brotli dictionary.
2015-06-12 15:43:54 +02:00
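The effect of a prefilled LZ77 window, as described in the commit above, is that the very first bytes of the input can already be encoded as backward references into the dictionary. The brute-force sketch below illustrates only that idea; `BackRef` and `FindLongestMatch` are hypothetical names, and the real encoder uses hash-based match finding rather than a linear scan.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: copy `length` bytes from `distance` bytes back.
struct BackRef {
  std::size_t distance;
  std::size_t length;
};

// Longest match for input[pos..] within the history, where the history is
// the custom dictionary followed by input[0..pos). Brute force for clarity.
BackRef FindLongestMatch(const std::string& dict, const std::string& input,
                         std::size_t pos) {
  const std::string window = dict + input;  // dictionary acts as earlier input
  const std::size_t cur = dict.size() + pos;
  BackRef best = {0, 0};
  for (std::size_t start = 0; start < cur; ++start) {
    std::size_t len = 0;
    while (cur + len < window.size() &&
           window[start + len] == window[cur + len]) {
      ++len;
    }
    if (len > best.length) {
      best.distance = cur - start;
      best.length = len;
    }
  }
  return best;
}
```

With the dictionary "hello world" and the input "hello", the match at position 0 has length 5 and distance 11. With an empty dictionary the same position has no history and no match.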
szabadka
631184d40c Merge pull request #113 from szabadka/master
Speedups to brotli quality 11.
2015-06-12 15:32:31 +02:00
Zoltan Szabadka
667f70adcb Speedups to brotli quality 11.
* Cluster at most 64 histograms at a time in the first
  round of clustering.

* Use a faster histogram cost estimation function.

* Don't compute the log2(total) multiple times in the
  block splitter.
2015-06-12 15:29:06 +02:00
szabadka
af09ee7344 Merge pull request #112 from szabadka/master
Speedups and fixes to the decoder.
2015-06-12 15:14:13 +02:00
Zoltan Szabadka
641bc15882 Speedups and fixes to the decoder.
* Read data by 4-byte runs.
  This resolves unaligned read (Bus error) on arm-android.

* Get rid of malloc/free in BrotliBuildHuffmanTable.

* Tweak order of instructions when reading Huffman codes.
2015-06-12 15:12:23 +02:00
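The bus error mentioned in the first item above is the typical result of dereferencing a wider pointer at an address that is not suitably aligned. One common alignment-safe way to fetch 32 bits, shown here as a general technique and not as the decoder's actual change, is to assemble the value from individual bytes. `LoadLE32` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Reads four bytes starting at p as a little-endian 32-bit value.
// Only byte-sized loads are issued, so p needs no particular alignment,
// and the result does not depend on the host byte order.
inline std::uint32_t LoadLE32(const std::uint8_t* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}
```

The caller must still guarantee that four bytes are readable at `p`.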
lvandeve
e0510a828e Update README.md 2015-06-12 14:31:47 +02:00
szabadka
996ec28993 Merge pull request #111 from szabadka/master
Create -04 version of the internet draft.
2015-05-11 17:22:04 +02:00
Zoltan Szabadka
ea35936816 Change the expiration date and title of the -04 draft. 2015-05-11 17:04:13 +02:00
Zoltan Szabadka
14ea2b5805 Create -04 version of the draft. 2015-05-11 17:03:35 +02:00
szabadka
6ee61e78c8 Merge pull request #110 from anthrotype/test_quality
[roundtrip_test.py] repeat test at different quality (1, 6, 9, 11)
2015-05-11 15:15:55 +02:00
Cosimo Lupo
e356b9bc2f [roundtrip_test.py] repeat test at different quality (1, 6, 9, 11) 2015-05-11 14:12:37 +01:00
szabadka
682facef7b Merge pull request #109 from szabadka/master
Expose the quality parameter to the bro.cc tool.
2015-05-11 14:15:05 +02:00
Zoltan Szabadka
8d83839ac2 Expose the quality parameter to the bro.cc tool. 2015-05-11 14:14:05 +02:00
szabadka
463ceda563 Merge pull request #108 from szabadka/master
Use the same hasher for text and font mode.
2015-05-11 14:12:24 +02:00
Zoltan Szabadka
6622355a9a Use the same hasher for text and font mode.
We use 4-byte hashing in both and look for length 3
matches separately.
2015-05-11 14:11:07 +02:00
szabadka
6bb4316280 Merge pull request #107 from szabadka/master
Fix broken quality 0, make it same as quality 1.
2015-05-11 13:52:54 +02:00
Zoltan Szabadka
cc8d64dfec Fix broken quality 0, make it same as quality 1. 2015-05-11 13:51:47 +02:00
szabadka
cc211b92f1 Merge pull request #105 from anthrotype/newparams
[python] expose new encoder parameters as kwargs of brotli.compress
2015-05-11 13:36:43 +02:00
Cosimo Lupo
c93c0dab92 [bro.py] use brotli.MODE_GENERIC as default compression mode;
remove additional low-level parameters
2015-05-11 11:10:48 +01:00
Cosimo Lupo
aa6f7d8f0c [brotlimodule] add MODE_GENERIC constant 2015-05-11 11:09:36 +01:00
Cosimo Lupo
b7e8291788 [bro.py] remove debug print 2015-05-11 10:39:29 +01:00
Cosimo Lupo
4106a406d0 [bro.py] use new optional encoder parameters when compressing;
modified the help string to include the new parameters.
2015-05-11 10:39:28 +01:00
Cosimo Lupo
32c44ec87d [bro.py] use argparse instead of getopt 2015-05-11 10:39:26 +01:00
Cosimo Lupo
3351bb08e3 [brotlimodule] apply uniform docstring style 2015-05-11 10:39:24 +01:00
Cosimo Lupo
6d935db75c [brotlimodule] add quality, lgwin and lgblock parameters 2015-05-11 10:39:23 +01:00
Cosimo Lupo
dbcb32614a [brotlimodule] add enable_context_modeling parameter (defaults to True) 2015-05-11 10:39:21 +01:00
Cosimo Lupo
4c1d06931e [brotlimodule] add new keyword params docstring of brotli.compress 2015-05-11 10:39:20 +01:00
Cosimo Lupo
6264bea2e4 [brotlimodule] add greedy_block_split parameter (defaults to False);
renamed variables: transform -> enable_transforms, dictionary -> enable_dictionary
2015-05-11 10:39:18 +01:00
Cosimo Lupo
b2eba122c8 [brotlimodule] add enable_dictionary parameter (defaults to True) 2015-05-11 10:39:17 +01:00
Cosimo Lupo
89c74d6859 [brotlimodule] use keyword arguments for mode and enable_transforms;
update brotli.compress docstring accordingly
2015-05-11 10:39:15 +01:00
szabadka
621cd0cf04 Merge pull request #106 from szabadka/master
Add a MODE_GENERIC compression mode to the interface.
2015-05-11 11:34:39 +02:00
Zoltan Szabadka
aa853f3cbc Add a MODE_GENERIC compression mode to the interface.
With this, users can distinguish between not knowing
what the input is (default) and knowing that it is text,
and thus can be relied on to force some UTF-8-specific settings.
2015-05-11 11:33:19 +02:00
szabadka
288f70d7ea Merge pull request #104 from anthrotype/py3split
[python] fix compatibility_test.py with Python 3
2015-05-08 11:11:22 +02:00
Cosimo Lupo
e6913b2e78 [python] use built-in split instead of 'string' module for py23
In python3, the 'string' module no longer has a 'split' function.
2015-05-08 10:06:18 +01:00
szabadka
4e94277e9d Merge pull request #103 from szabadka/master
Handle multiple compressed files per original in the test.
2015-05-07 20:46:40 +02:00
Zoltan Szabadka
10a2f3745a Handle multiple compressed files per original in the test.
Add some more test cases that decompress to the empty
file or a one byte long file. These test cases have
examples for the updated stream header and meta-block
header formats.
2015-05-07 20:43:01 +02:00