Commit Graph

4567 Commits

Author SHA1 Message Date
Yann Collet
0a0a212934 zstd_opt: changed cost formula
There was a flaw in the formula
which compared literal cost with match cost :
at a given position,
a non-null literal suite is going to be part of next sequence,
while if position ends a previous match, to immediately start another match,
next sequence will have a litlength of zero.
A litlength of zero has a non-null cost.
It follows that literals cost should be compared to match cost + litlength==0.

Not doing so gave a structural advantage to matches, which would be selected more often.
I believe that's what led to the creation of the strange heuristic which added a complex cost to matches.
The heuristic was actually compensating.
It was probably created through multiple trials, settling for best outcome on a given scenario (I suspect silesia.tar).
The problem with this heuristic is that it's hard to understand,
and unfortunately, any future change in the parser would impact the way it should be calculated and its effects.

The "proper" formula makes it possible to remove this heuristic.

Now, the problem is : in a head to head comparison, it's sometimes better, sometimes worse.
Note that all differences are small (< 0.01 ratio).
In general, the newer formula is better for smaller files (for example, calgary.tar and enwik7).
I suspect that's because starting statistics are pretty poor (another area of improvement).
However, for silesia.tar specifically, it's worse at level 22 (while being better at level 17, so even compression level has an impact ...).

It's a pity that zstd -22 gets worse on silesia.tar.
That being said, I like that the new code gets rid of strange variables,
which were introducing complexity for any future evolution (faster variants being in mind).
Therefore, in spite of this detrimental side effect, I tend to be in favor of it.
2017-11-28 14:07:03 -08:00
Yann Collet
b71405dc51 removed a bunch of code related to cached literal price
optState was used both to evaluate price
and to cache cost of previously calculated literals.
This created a strong dependency, forcing parser to request cost in a strict order.
This limitation is forbids future parser with skipping capabilities.

After this patch, caching literals price still exists,
but is now explicit, in a stack structure.
2017-11-28 12:32:24 -08:00
Yann Collet
03f30d9dcb separate rawLiterals, fullLiterals and match costs
removed one SET_PRICE() macro invocation
2017-11-28 12:14:46 -08:00
Yann Collet
be73f8a749
Merge pull request #893 from felixhandte/fix-lz4-compression-bug
Fix LZ4 Compression Bug
2017-11-28 11:34:22 -08:00
W. Felix Handte
baff9dd15e Fix LZ4 Compression Buffer Overflow
Fixes issue where, when `zstd --format=lz4` is fed an input larger than 128KB,
the read overruns the input buffer. This changes Zstd to use LZ4 with chained
64KB blocks. This is technically a breaking change in that some third party
LZ4 implementations may not support linked blocks. However, progress should not
be allowed to be stopped by such petty concerns as backwards compatibility!
2017-11-28 12:07:26 -05:00
W. Felix Handte
62c746dcf9 Add Test on LZ4 Format Input Buffer Overrun 2017-11-28 12:06:48 -05:00
Yann Collet
eee87cd6f2 btopt: minor refactor : removed one SET_PRICE() macro invocation
direct assignment makes operation cleaner.
Also allows some (very minor) optimization (non-measurable)
2017-11-27 17:18:57 -08:00
Yann Collet
e9d1987fd7 btopt: minor speed optimization
matchPrice is always right at beginning
2017-11-27 17:01:51 -08:00
Yann Collet
743b23878e install: changed variable MANDIR into MAN1DIR
MANDIR still exists, and is now the parent of MAN1DIR
2017-11-27 13:47:35 -08:00
Yann Collet
bd88f633ac zstreamtest : in -T#s, s considered a suffix meaning "seconds"
avoid unintentionnally triggering `seedset`,
so that seed gets automatically determined when not set.
2017-11-27 12:15:23 -08:00
Yann Collet
2fd765498a updated man page
following patch #931 by @scottchiefbaker
2017-11-24 17:20:54 -08:00
Yann Collet
c857ee850a minor update 2017-11-24 16:44:28 -08:00
Yann Collet
73a6c6ca5a
Merge pull request #931 from scottchiefbaker/documentation
Include information about the benchmark output/methodology
2017-11-24 00:01:34 -08:00
Scott Baker
31a191b178 Include information about the benchmark output/methodology
Addresses #930
2017-11-22 20:34:25 -08:00
Yann Collet
4451b213c0
Merge pull request #924 from facebook/opt2
faster btopt variant
2017-11-21 16:34:37 -08:00
Yann Collet
f8d5c478af fixed comment, reported by @gyscos 2017-11-21 10:36:14 -08:00
Yann Collet
4154aec679 fixed comment, as suggested by @terrelln 2017-11-21 10:26:17 -08:00
Yann Collet
851383e7a4
Merge pull request #925 from dusta/patch-1
Missing dot
2017-11-20 14:34:03 -08:00
Dusta
287db17c8c
Missing dot 2017-11-20 21:35:28 +01:00
Yann Collet
899f2a29f6 strategy ZSTD_btopt pinned to (0) variant (faster one) 2017-11-20 11:53:20 -08:00
Yann Collet
3f457264d1 slightly improved compression speed 2017-11-19 14:40:21 -08:00
Yann Collet
42c1e64270 slightly improved ratio at -22
merging of repcode search into btsearch introduced a small compression ratio regressio at max level :
1.3.2 : 52728769
after repMerge patch : 52760789 (+32020)

A few minor changes have produced this difference.
They can be hard to spot.

This patch buys back about half of the difference,
by no longer inserting position at hc3 when a long match is found there.
It feels strangely counter-intuitive, but works :
after this patch : 52742555 (-18234)
2017-11-19 14:00:55 -08:00
Yann Collet
99435dbbab minor : search early-out on sufficient_len for hc3 and rep
very very small speed and ratio increases
2017-11-19 12:58:04 -08:00
Yann Collet
d100670045 btopt0 : a bit faster and weaker 2017-11-19 10:38:02 -08:00
Yann Collet
e6da37c430 created (hidden) new strategy btopt0
about ~+10% faster but losing ~0.01 compression ratio
(note : amplitude vary a lot depending on files, but direction remains the same)
2017-11-19 10:21:21 -08:00
Yann Collet
e717a5b0dd zstd_opt: minor speed optimization
Calculate reference log2sums only once per serie of sequence
(as opposed to once per sequence)

Also: improved code comments
2017-11-18 16:24:02 -08:00
Yann Collet
daebc7fe26 bench: slightly adjusted display format
adapt accuracy depending on value.
makes it possible to have higher accuracy for small value,
notably small compression speed.
This capability is expected to be useful while modifying optimal parser.
2017-11-18 15:54:32 -08:00
Yann Collet
fccb46fbe0 minor spelling fixes 2017-11-18 11:28:00 -08:00
Yann Collet
d11661c3ec fix ZSTD_COMPRESSBOUND() macro
It was using macro `KB`, which is not defined in `zstd.h`.
2017-11-18 11:16:39 -08:00
Yann Collet
22ff60b3aa
Merge pull request #922 from terrelln/signal
[zstd] Fix rare bug with signal handler
2017-11-17 18:50:02 -08:00
Yann Collet
9708c46879
Merge pull request #921 from facebook/ubfix
Fixes some arithmetic pointer operations
2017-11-17 17:11:18 -08:00
Nick Terrell
a6052af0e8 [zstd] Fix rare bug with signal handler 2017-11-17 16:38:56 -08:00
Yann Collet
a4a20a4b2f fix un-initialized memory warning
harmless, but cleaner
2017-11-17 15:51:52 -08:00
Yann Collet
c8b3e08535
Merge pull request #920 from facebook/benchSeparate
Bench multiple files
2017-11-17 13:25:12 -08:00
Yann Collet
23767e950a fix one UB pointer arithmetic in encoder
Instead of calculating distance between 2 memory objects, which is UB,
we extract the offset from object 1, and transfer it into object 2.
2017-11-17 13:24:51 -08:00
Yann Collet
cdade555ee fixed one UB pointer arithmetic 2017-11-17 11:40:08 -08:00
Yann Collet
6e8573a90a
Merge pull request #916 from facebook/optMergeRep
Optimal parser part 2 : Merge normal and _extDict parsers
2017-11-17 11:29:47 -08:00
Yann Collet
49a445e65d fixed circle-ci script 2017-11-17 02:14:16 -08:00
Yann Collet
5b957ba899 minor interface adjustments 2017-11-17 01:21:40 -08:00
Yann Collet
d898fb7ba6 bench: added cli command -S to benchmark multiple files separately
Currently, all files are joined by default,
they are compressed separately but benchmarked together,
providing a single final result.

Benchmarking files separately make it possible to accurately measure difference for each file.
This is expected to be useful while tuning optimal parser.
2017-11-17 00:22:55 -08:00
Yann Collet
8accfa7fcc bench: realTime is a global parameter
like most parameters not directly related to compression
2017-11-17 00:02:37 -08:00
Yann Collet
11e58d9ba4 fixed minor warning
warning: void function returning a value
(even if the return value is void)
2017-11-16 15:21:30 -08:00
Yann Collet
15768cabb5 fixed some complex scenarios
Fixed : multithreading to compress some small data with dictionary
Fixed : ZSTD_initCStream_usingCDict()
Improved streaming memory usage when pledgedSrcSize is known.
2017-11-16 15:18:18 -08:00
Yann Collet
05dffe43a7 Fixed Btree update
ZSTD_updateTree() expected to be followed by a Bt match finder, which would update zc->nextToUpdate.
With the new optimal match finder, it's not necessarily the case : a match might be found during repcode or hash3, and stops there because it reaches sufficient_len, without even entering the binary tree.
Previous policy was to nonetheless update zc->nextToUpdate, but the current position would not be inserted, creating "holes" in the btree, aka positions that will no longer be searched.
Now, when current position is not inserted, zc->nextToUpdate is not update, expecting ZSTD_updateTree() to fill the tree later on.

Solution selected is that ZSTD_updateTree() takes care of properly setting zc->nextToUpdate,
so that it no longer depends on a future function to do this job.

It took time to get there, as the issue started with a memory sanitizer error.
The pb would have been easier to spot with a proper `assert()`.
So this patch add a few of them.

Additionnally, I discovered that `make test` does not enable `assert()` during CLI tests.
This patch enables them.

Unfortunately, these `assert()` triggered other (unrelated) bugs during CLI tests, mostly within zstdmt.
So this patch also fixes them.

- Changed packed structure for gcc memory access : memory sanitizer would complain that a read "might" reach out-of-bound position on the ground that the `union` is larger than the type accessed.
  Now, to avoid this issue, each type is independent.
- ZSTD_CCtxParams_setParameter() : @return provides the value of parameter, clamped/fixed appropriately.
- ZSTDMT : changed constant name to ZSTDMT_JOBSIZE_MIN
- ZSTDMT : multithreading is automatically disabled when srcSize <= ZSTDMT_JOBSIZE_MIN, since only one thread will be used in this case (saves memory and runtime).
- ZSTDMT : nbThreads is automatically clamped on setting the value.
2017-11-16 12:18:56 -08:00
Yann Collet
dfc14579f5 removed wrong assertion 2017-11-15 15:35:56 -08:00
Yann Collet
c55e35b2fc removed a few specialized traces 2017-11-15 15:04:53 -08:00
Yann Collet
61c2d70c86 shortened repcode match finder implementation 2017-11-15 14:37:40 -08:00
Yann Collet
d7e9805028 fixed corruption issue 2017-11-15 13:44:24 -08:00
Yann Collet
046ea53bef still fighting data corruption
due to messed up tree.
Seems to happen when reaching end of buffer.
2017-11-15 11:29:24 -08:00
Yann Collet
4202b2e8a6 merged rep search into btMatchSearch
but there is a tree corruption somewhere ...
bug hunt ongoing
2017-11-14 20:38:52 -08:00