AuroraMiddleware/lz4 - lz4 - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Chenxi Mao	64b5917736	FAST_DEC_LOOP: only do the offset check under a specific condition. While performance-testing FAST_DEC_LOOP, I found that the offset check runs far more often than in v1.8.3. The difference in check counts shows up with lzbench on the dickens test case: v1.8.3: 34959; v1.9.x: 1055885. Investigating the code explains the difference. v1.8.3 skips the check if the condition at https://github.com/lz4/lz4/blob/v1.8.3/lib/lz4.c#L1463 is true and the condition at https://github.com/lz4/lz4/blob/v1.8.3/lib/lz4.c#L1478 is also true; otherwise the offset check is invoked. In v1.9.3, the offset check is invoked in every loop iteration, which leads to a slowdown. The fix is to move this check under the specific condition to avoid useless checks. After this change, the call count is the same as in v1.8.3.	2019-05-31 08:36:13 +08:00
George Prekas	605d811e6c	enable LZ4_FAST_DEC_LOOP build macro on aarch64/GCC by default	2019-05-07 08:36:06 -05:00
Yann Collet	ba99eac4d0	several minor style changes recommended by clang-tidy	2019-04-24 10:03:02 -07:00
Yann Collet	ae199124e5	fixed read-after input in LZ4_decompress_safe()	2019-04-18 18:50:51 -07:00
Yann Collet	5acfb15df0	re-enable FORCE_INLINE, which was disabled for tests	2019-04-17 15:33:37 -07:00
Yann Collet	25d96f1e4d	fix out-of-bound read within LZ4_decompress_fast() and deprecate LZ4_decompress_fast(), with deprecation warnings enabled by default. Note that, as a consequence of the fix, LZ4_decompress_fast is now __slower__ than LZ4_decompress_safe(). That's because, since it doesn't know the input buffer size, it must progress more cautiously into the input buffer to ensure no out-of-bound read.	2019-04-17 15:01:53 -07:00
Norm Green	1848ea5cbd	Fix AIX errors/warnings	2019-04-17 09:20:09 -07:00
Yann Collet	920c988669	simplified output_directive	2019-04-15 14:13:10 -07:00
Yann Collet	55f6f0dd74	fix comma for pedantic	2019-04-15 11:22:25 -07:00
Yann Collet	474c17cdc4	unified limitedOutput_directive between lz4.c and lz4hc.c. It was left in a strange state after the "amalgamation" patch. Now only 3 directives remain, same name across both implementations, single definition place. Might allow some light simplification due to reduced nb of states possible.	2019-04-15 11:09:56 -07:00
Yann Collet	481a37fe47	fixed lz4frame with linked blocks: when one block was not compressible, it would tag the context as `dirty`, resulting in compression automatically bailing out of all future blocks, making the rest of the frame uncompressible.	2019-04-15 10:28:36 -07:00
Yann Collet	f8b7605034	fixed minor Visual warnings: since Visual 2017, it worries about potential overflows, which are actually impossible. Replaced (c * a) by (c ? a : 0). Will likely replace a * by a cmov. Probably harmless for performance.	2019-04-12 16:49:01 -07:00
Yann Collet	8d76c8a44a	introduce LZ4_DISTANCE_MAX build macro: makes it possible to generate LZ4-compressed blocks with a controlled maximum offset (necessarily <= 65535). This could be useful for compatibility with decoders using a very limited memory budget (<64 KB). Answer #154	2019-04-11 14:15:33 -07:00
Yann Collet	14c71dfa9c	modified LZ4_initStreamHC() to look like LZ4_initStream(): it is now a pure initializer, for statically allocated states. It can initialize any memory area, and because of this, requires size.	2019-04-09 13:55:42 -07:00
Yann Collet	5ef4f3ce91	check some more initialization result ensure it's not NULL.	2019-04-08 16:51:22 -07:00
Yann Collet	111df0fa45	removed LZ4_stream_t alignment test on Visual: it fails in x86 32-bit mode. Visual reports an alignment of 8 bytes (even with alignof()) but actually only aligns LZ4_stream_t on 4 bytes. The alignment check then fails, resulting in missed initialization.	2019-04-08 16:47:21 -07:00
Yann Collet	c198a39a66	LZ4_initStream() checks alignment restriction updated associated documentation	2019-04-08 12:49:54 -07:00
Yann Collet	2ece0d8380	created LZ4_initStream() - promoted LZ4_resetStream_fast() to stable - moved LZ4_resetStream() into deprecate, but without triggering a compiler warning - update all sources to no longer rely on LZ4_resetStream() note : LZ4_initStream() proposal is slightly different : it's able to initialize any buffer, provided that it's large enough. To this end, it accepts a void*, and returns an LZ4_stream_t*.	2019-04-05 12:56:26 -07:00
Yann Collet	f2755c9887	minor comments and reformatting	2019-04-03 08:59:29 -07:00
Yann Collet	753076bfa4	fixed minor conversion warnings	2019-04-02 17:16:43 -07:00
Yann Collet	2589c4424f	created LZ4_FAST_DEC_LOOP build macro	2019-04-02 16:22:11 -07:00
Yann Collet	7d9d00f4df	fixed a few minor conversion warnings	2019-04-02 16:06:37 -07:00
Yann Collet	d85bdb4ff2	Merge pull request #645 from djwatson/optimize_decompress_generic Optimize decompress generic	2019-02-11 16:58:53 -08:00
Dave Watson	5d7d1166cb	decompress_generic: Limit fastpath to x86 New fastpath currently shows a regression on qualcomm arm chips. Restrict it to x86 for now	2019-02-11 11:44:51 -08:00
Dave Watson	75fb878a90	decompress_generic: Add fastpath for small offsets For small offsets of size 1, 2, 4 and 8, we can set a single uint64_t, and then use it to do a memset() variation. In particular, this makes the somewhat-common RLE (offset 1) about 2-4x faster than the previous implementation - we avoid not only the load blocked by store, but also avoid the loads entirely.	2019-02-08 13:57:23 -08:00
Dave Watson	faac110e20	decompress_generic: Unroll loops a bit more Generally we want our wildcopy loops to look like the memcpy loops from our libc, but without the final byte copy checks. We can unroll a bit to make long copies even faster. The only catch is that this affects the value of FASTLOOP_SAFE_DISTANCE.	2019-02-08 13:57:23 -08:00
Dave Watson	1fbaf84306	decompress_generic: remove msan write This store is also causing load-blocked-by-store issues, remove it. The msan warning will have to be fixed another way if it is still an issue.	2019-02-08 13:57:23 -08:00
Dave Watson	28b824921d	decompress_generic: re-add fastpath This is the remainder of the original 'shortcut'. If true, we can avoid the loop in LZ4_wildCopy, and directly copy instead.	2019-02-08 13:57:23 -08:00
Dave Watson	232f1e261f	decompress_generic: drop partial copy check in fast loop We've already checked that we are more than FASTLOOP_SAFE_DISTANCE away from the end, so this branch can never be true, we will have already jumped to the second decode loop.	2019-02-08 13:57:23 -08:00
Dave Watson	59332a3026	decompress_generic: Optimize literal copies Use LZ4_wildCopy16 for variable-length literals. For literal counts that fit in the flag byte, copy directly. We can also omit oend checks for roughly the same reason as the previous shortcut: We check once that both match length and literal length fit in FASTLOOP_SAFE_DISTANCE, including wildcopy distance.	2019-02-08 13:57:23 -08:00
Dave Watson	5dfa7d422b	decompress_generic: optimize match copy Add an LZ4_wildCopy16, that will wildcopy, potentially smashing up to 16 bytes, and use it for match copy. On x64, this avoids many blocked loads due to store forwarding, similar to issue #411.	2019-02-08 13:57:23 -08:00
Dave Watson	28356e02ad	decompress_generic: Add a loop fastpath Copy the main loop, and change checks such that op is always less than oend-SAFE_DISTANCE. Currently these are added for the literal copy length check, and for the match copy length check. Otherwise the first loop is exactly the same as the second. Follow on diffs will optimize the first copy loop based on this new requirement. I also tried instead making a separate inlineable function for the copy loop (similar to existing partialDecode flags, etc), but I think the changes might be significant enough to warrant doubling the code, instead of pulling out common functionality to separate functions. This is the basic transformation that will allow several following optimisations.	2019-02-08 13:57:19 -08:00
Dave Watson	4da336062e	decompress_generic: Refactor variable length fields Make a helper function to read variable lengths for literals and match length.	2019-02-08 13:42:42 -08:00
Jeremy Maitin-Shepard	26e7635a0e	Eliminate optimize attribute warning with clang on PPC64LE	2019-02-04 12:22:56 -08:00
W. Felix Handte	4e3accccb2	Fix Dict Size Test in `LZ4_compress_fast_continue()` Dictionaries don't need to be > 4 bytes, they need to be >= 4 bytes. This test was overly conservative. Also removes the test in `LZ4_attach_dictionary()`.	2018-12-05 11:24:33 -08:00
W. Felix Handte	535636ff5c	Don't Attach Very Small Dictionaries Fixes a mismatch in behavior between loading into the context (via `LZ4_loadDict()`) a very small (<= 4 bytes) non-contiguous dictionary, versus attaching it with `LZ4_attach_dictionary()`. Before this patch, this divergence could be reproduced by running `make -C tests fuzzer MOREFLAGS="-m32"` and then `tests/fuzzer -v -s1239 -t3146`. Making sure these two paths behave exactly identically is an easy way to test the correctness of the attach path, so it's desirable that this remain an unpolluted, high signal test.	2018-12-04 14:05:11 -08:00
Bing Xu	17f5071e72	Enable amalgamation of lz4hc.c and lz4.c	2018-11-15 22:24:25 -08:00
Oleg Khabinov	28eb88d988	Some followups and renamings	2018-10-01 15:19:45 -07:00
Oleg Khabinov	f2ae385c2f	Rename initCheck to dirtyContext and use it in LZ4_resetStream_fast() to check if full reset is needed.	2018-09-28 14:55:05 -07:00
Yann Collet	cb917827f9	Merge pull request #578 from lz4/support128bit Support for 128bit pointers like AS400	2018-09-26 13:57:09 -07:00
Yann Collet	b2215f2a89	tried to clean another bunch of cppcheck warnings. The "funny" thing with cppcheck is that no 2 versions give the same list of warnings. On Mac, I'm using v1.81, which had all warnings fixed. On Travis CI, it's v1.61, and it complains about a dozen more/different things. On Linux, it's v1.72, and it finds a completely different list of a half dozen warnings. Some of these seem to be bugs/limitations in cppcheck itself. The TravisCI version v1.61 seems unable to understand %zu correctly, and seems to assume it means %u.	2018-09-19 12:12:49 -07:00
Yann Collet	8bea19d57c	fixed minor cppcheck warnings in lib	2018-09-18 15:51:26 -07:00
Yann Collet	6381d828fd	increase size of LZ4 contexts for 128-bit systems	2018-09-17 17:31:57 -07:00
Yann Collet	6103b4c9b4	use byU32 mode for any pointer > 32-bit including 128-bit, like IBM AS-400	2018-09-14 15:27:48 -07:00
Yann Collet	6d32240b2e	clarify constant MFLIMIT and separate it from MATCH_SAFEGUARD_DISTANCE. While both constants have the same value, they do not serve the same purpose, hence should not be confused.	2018-09-11 10:00:13 -07:00
Yann Collet	b87a8e9e62	fixed minor warning in fuzzer.c added a few more comments and assert()	2018-09-10 16:48:41 -07:00
Yann Collet	63fc6fbf7e	restored nullifying output to counter possible (offset==0)	2018-09-10 16:22:16 -07:00
Yann Collet	32272f9866	removed temporary debug traces	2018-09-10 15:51:53 -07:00
Yann Collet	e22bb80074	fixed fuzzer test and removed one blind copy, since there is no more guarantee that at least 4 bytes are still available in output buffer	2018-09-07 18:22:01 -07:00
Yann Collet	bf614d3c51	first sketch for a byte-accurate partial decoder	2018-09-07 15:44:19 -07:00
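
The small-offset fastpath described in commit 75fb878a90 (build one 64-bit value for offsets 1, 2, 4 and 8, then store it repeatedly like a memset) can be sketched roughly as follows. This is an illustration only, not the code from lz4.c: the function name `fill_small_offset` is hypothetical, and unlike the wildcopy-style loops mentioned in the log, this sketch never writes past the requested length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of LZ4): expand a match whose offset is
 * 1, 2, 4 or 8 by building one 8-byte pattern and storing it repeatedly.
 * `op` points at the first byte to write; the `offset` bytes before it
 * must already hold decoded data. */
static void fill_small_offset(uint8_t *op, size_t offset, size_t length)
{
    assert(offset == 1 || offset == 2 || offset == 4 || offset == 8);

    const uint8_t *match = op - offset;
    uint8_t pattern[8];
    /* Replicate the last `offset` bytes until the pattern is 8 bytes wide. */
    for (size_t i = 0; i < sizeof pattern; i++)
        pattern[i] = match[i % offset];

    uint64_t v;
    memcpy(&v, pattern, sizeof v);   /* the single uint64_t to store */

    uint8_t *const end = op + length;
    while ((size_t)(end - op) >= sizeof v) {
        memcpy(op, &v, sizeof v);    /* typically compiles to one 8-byte store */
        op += sizeof v;
    }
    /* Tail: 8 is a multiple of the offset, so the pattern phase is unchanged. */
    memcpy(op, &v, (size_t)(end - op));
}
```

Because the pattern is built once from the source bytes, the copy loop only stores and never reloads what it has just written, which is the effect the commit message attributes to the fastpath.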


270 Commits
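
As an illustration of the kind of helper that commit 4da336062e describes (reading variable-length fields for literal and match lengths), here is a minimal sketch based on the LZ4 block format, where a 4-bit length of 15 is extended by extra bytes until one is below 255. The name `read_long_length` and its signature are assumptions made for this example, not the function in lz4.c, and the sketch does not guard against `size_t` overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not LZ4's actual function): the caller invokes it
 * only when the 4-bit length field equals 15. Each extra byte is added to
 * *length; a byte below 255 ends the field.
 * Returns 0 on success, -1 if the input ends before the field does. */
static int read_long_length(const uint8_t **ip, const uint8_t *iend,
                            size_t *length)
{
    assert(ip != NULL && *ip != NULL && length != NULL);

    uint8_t b;
    do {
        if (*ip >= iend)
            return -1;          /* truncated input */
        b = *(*ip)++;
        *length += b;
    } while (b == 255);
    return 0;
}
```

A caller would seed `*length` with the 4-bit value from the token (15) before the call, so the same helper serves both literal and match lengths.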