Commit Graph

6888 Commits

Author SHA1 Message Date
Nick Terrell
5dc0a1d659 HINT_INLINE ZSTD_storeSeq()
Clang on Mac wasn't inlining `ZSTD_storeSeq()` in level 1, which was
causing a 5% performance regression. This fixes it.
2019-09-20 16:39:27 -07:00
Nick Terrell
44c65da97e Remove literals overread in ZSTD_storeSeq() for ~neutral perf 2019-09-20 12:23:25 -07:00
Nick Terrell
fde217df04 Fix bounds check in ZSTD_storeSeq() 2019-09-20 08:25:12 -07:00
Nick Terrell
67b1f5fc72 Fix too strict assert 2019-09-20 01:23:35 -07:00
Nick Terrell
e068bd01df [tests] Fix decodecorpus 2019-09-20 01:09:47 -07:00
Nick Terrell
ddab2a94e8 Pass iend into ZSTD_storeSeq() to allow ZSTD_wildcopy() 2019-09-20 00:56:20 -07:00
Nick Terrell
cdad7fa512 Widen ZSTD_wildcopy to 32 bytes 2019-09-20 00:52:15 -07:00
Nick Terrell
efd37a64ea Optimize decompression and fix wildcopy overread
* Bump `WILDCOPY_OVERLENGTH` to 16 to fix the wildcopy overread.
* Optimize `ZSTD_wildcopy()` by removing unnecessary branches and
  unrolling the loop.
* Extract `ZSTD_overlapCopy8()` into its own function.
* Add `ZSTD_safecopy()` for `ZSTD_execSequenceEnd()`. It is
  optimized for single long sequences, since that is the important
  case that can end up in `ZSTD_execSequenceEnd()`. Without this
  optimization, decompressing a block with 1 long match goes
  from 5.7 GB/s to 800 MB/s.
* Refactor `ZSTD_execSequenceEnd()`.
* Increase the literal copy shortcut to 16.
* Add a shortcut for offset >= 16.
* Simplify `ZSTD_execSequence()` by pushing more cases into
  `ZSTD_execSequenceEnd()`.
* Delete `ZSTD_execSequenceLong()` since it is exactly the
  same as `ZSTD_execSequence()`.

clang-8 seeds +17.5% on silesia and +21.8% on enwik8.
gcc-9 sees +12% on silesia and +15.5% on enwik8.

TODO: More detailed measurements, and on more datasets.

Crdit to OSS-Fuzz for finding the wildcopy overread.
2019-09-19 21:07:14 -07:00
Nick Terrell
0e76000dee
Merge pull request #1801 from terrelln/int-max
[test] Test the bounds of ZSTD_c_srcSizeHint
2019-09-19 11:10:13 -07:00
Yann Collet
3cac061db5
Merge pull request #1802 from bimbashrestha/rle_block_bound_fix_pt2
Adding 4 blocks to FSE_BLOCKBOUND() in lib/common (different from las…
2019-09-18 16:32:37 -07:00
Bimba Shrestha
6e9f6813bb adding bit container size 2019-09-18 13:49:45 -07:00
Bimba Shrestha
f9b6abb896 Adding 4 blocks to FSE_BLOCKBOUND() in lib/common (different from last week) 2019-09-18 13:29:05 -07:00
Yann Collet
bfff5b30a4
Merge pull request #1756 from mgrice/dev
Improvements in zstd decode performance
2019-09-18 11:35:50 -07:00
Nick Terrell
51990246c3 [test] Test the bounds of ZSTD_c_srcSizeHint 2019-09-18 11:05:08 -07:00
Yann Collet
5329de1f1e
Merge pull request #1798 from facebook/refac_fast
minor refactor of ZSTD_fast
2019-09-17 14:54:23 -07:00
Yann Collet
243200e5bf minor refactor of ZSTD_fast
- reduced variables lifetime
- more accurate code comments
2019-09-17 14:02:57 -07:00
Felix Handte
dd2838eeb4
Merge pull request #1783 from felixhandte/mtime-nsec
Set Mod Time Nanoseconds
2019-09-17 13:30:21 -04:00
Felix Handte
2164a130f3
Merge pull request #1780 from felixhandte/workspace-efficiency-3
Avoid Clearing Tables Even When Changing CParams
2019-09-16 14:37:05 -04:00
W. Felix Handte
72ea79cacd Don't Include sanitizer/msan_interface.h, Since Not All Platforms Provide It
Instead, explicitly declare the functions we use.
2019-09-16 12:08:03 -04:00
Nick Terrell
282ac22b8a
Merge pull request #1791 from terrelln/doc-up
[libzstd] Improve advanced API docs
2019-09-15 14:50:55 -07:00
Nick Terrell
fbeaf6989e [libzstd] Improve advanced API docs 2019-09-15 12:41:24 -07:00
Nick Terrell
f2941db4a9
Merge pull request #1789 from terrelln/larger-fuzz
[fuzz] Fix leak in block_round_trip
2019-09-13 14:13:34 -07:00
Nick Terrell
d721fcf3ee [fuzz] Fix leak in block_round_trip 2019-09-13 10:32:38 -07:00
Yann Collet
09b1844d9b
Merge pull request #1784 from bimbashrestha/fse_block_bound_err
Rearranging assert and allowing 4 extra for FSE_BLOCKBOUND()
2019-09-12 19:09:27 -07:00
Nick Terrell
23c5df1052
Merge pull request #1785 from terrelln/larger-fuzz
[fuzz] Generate seed data up to 256KB
2019-09-12 17:21:10 -07:00
Bimba Shrestha
fe9af338ed Added assert to BIT_flushBits() 2019-09-12 15:35:27 -07:00
Nick Terrell
7c4578160e [fuzz] Generate seed data up to 256KB 2019-09-12 15:02:01 -07:00
Bimba Shrestha
43da5bf27e Rearranging assert and allowing 4 extra for FSE_BLOCKBOUND() 2019-09-12 14:43:50 -07:00
Nick Terrell
8c1b6f74dd
Merge pull request #1781 from darxsys/improvDataGen
Improve data generation
2019-09-12 14:27:58 -07:00
W. Felix Handte
e1ec8004cc Formatting and Clean Up 2019-09-12 16:27:05 -04:00
Dario Pavlovic
51e9d29a51 Merge branch 'improvDataGen' of github.com:darxsys/zstd into improvDataGen 2019-09-12 13:11:02 -07:00
Dario Pavlovic
cd8588077e It's time for all of rng seed code to go. Goodbye 2019-09-12 13:10:34 -07:00
Dario Pavlovic
47bb4c6a23
Update tests/fuzz/fuzz_data_producer.h 2019-09-12 12:45:28 -07:00
Dario Pavlovic
92c58c4d5d Use range instead of the generic uint32 method to use less bytes when generating necessary numbers. 2019-09-12 12:40:12 -07:00
Yann Collet
e0fb7e1eaf ignore dictionary artifacts 2019-09-12 09:39:15 -07:00
W. Felix Handte
5a9baae9cf Set M-Time Nanoseconds 2019-09-12 11:50:33 -04:00
Felix Handte
6ae1ec96bc
Merge pull request #1708 from neheb/dev
zstd: Don't use utime on Linux
2019-09-12 11:44:31 -04:00
W. Felix Handte
20c69077d1 Shrink Table Valid End During Alloc Alignment / Phase Change 2019-09-11 17:14:59 -04:00
W. Felix Handte
51d90668ba Add Assertions to Confirm that Workspace Pointers are Correctly Ordered 2019-09-11 17:14:59 -04:00
W. Felix Handte
a10c191613 __msan_poison() Workspace When Preparing for Re-Use 2019-09-11 17:14:45 -04:00
W. Felix Handte
194c542598 Fix Memory Leak in Test 2019-09-11 14:25:30 -04:00
W. Felix Handte
ff67c62458 Fix Compilation Error (uint32_t -> size_t) 2019-09-11 13:59:09 -04:00
W. Felix Handte
5707c8a9d5 Speed Up Test a Little 2019-09-11 13:23:59 -04:00
W. Felix Handte
ed4c2c60c3 Add Fuzzer Test Case for Index Reduction 2019-09-11 13:17:19 -04:00
W. Felix Handte
7c57e2b9ca Zero h3size When h3log is 0
This led to a nasty edgecase, where index reduction for modes that don't use
the h3 table would have a degenerate table (size 4) allocated and marked clean,
but which would not be re-indexed.
2019-09-11 13:14:26 -04:00
Dario Pavlovic
b5b24c2a0d Combining fuzz_data_producer restrict calls into a single function 2019-09-11 10:09:29 -07:00
W. Felix Handte
bc020eec92 Also Shrink Clean Table Area When Reducing Indices 2019-09-11 11:40:57 -04:00
W. Felix Handte
1999b2ed9b Update DEBUGLOG Statements 2019-09-11 11:21:00 -04:00
W. Felix Handte
13e29a56de Shrink Clean Table Area When Copying Table Contents into Context
The source matchState is potentially at a lower current index, which means
that any extra table space not overwritten by the copy may now contain
invalid indices. The simple solution is to unconditionally shrink the valid
table area to just the area overwritten.
2019-09-11 11:18:45 -04:00
Dario Pavlovic
23cc2d8510 All tests should give some portion of data to the producer and use the rest. 2019-09-10 16:52:38 -07:00