Commit Graph

54 Commits

Author SHA1 Message Date
Nick Terrell
ac58c8d720 Fix copyright and license lines
* All copyright lines now have -2020 instead of -present
* All copyright lines include "Facebook, Inc"
* All licenses are now standardized

The copyright in `threading.{h,c}` is not changed because it comes from
zstdmt.

The copyright and license of `divsufsort.{h,c}` is not changed.
2020-03-26 17:02:06 -07:00
Ed Maste
b81d7cc6a0 remove extraneous doubled ;s 2019-08-15 21:17:06 -04:00
Yann Collet
59a7116cc2 benchfn dependencies reduced to only timefn
benchfn used to rely on mem.h, and util,
which in turn relied on platform.h.
Using benchfn outside of zstd required to bring all these dependencies.

Now, dependency is reduced to timefn only.
This required to create a separate timefn from util,
and rewrite benchfn and timefn to no longer need mem.h.

Separating timefn from util has a wide effect accross the code base,
as usage of time functions is widespread.
A lot of build scripts had to be updated to also include timefn.
2019-04-10 12:37:03 -07:00
Yann Collet
ededcfca57 fix confusion between unsigned <-> U32
as suggested in #1441.

generally U32 and unsigned are the same thing,
except when they are not ...

case : 32-bit compilation for MIPS (uint32_t == unsigned long)

A vast majority of transformation consists in transforming U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).

Among a few issues this patches solves :
- some parameters were declared with type `unsigned` in *.h,
  but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
  but the caller user a pointer to U32 instead.

These fixes are useful.

However, the bulk of changes is about %u formating,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.

As a consequence, the amount of code changed is larger than I would expect for such a patch.

Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.

So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.

I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributor will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.

Thoughts ?
2018-12-21 18:09:41 -08:00
Nick Terrell
f2d6db45cd [zstd] Add -Wmissing-prototypes 2018-09-27 15:24:48 -07:00
Jennifer Liu
9d6ed9def3 Merge fastCover into DictBuilder (#1274)
* Minor fix

* Run non-optimize FASTCOVER 5 times in benchmark

* Merge fastCover into dictBuilder

* Fix mixed declaration issue

* Add fastcover to symbol.c

* Add fastCover.c and cover.h to build

* Change fastCover.c to fastcover.c

* Update benchmark to run FASTCOVER in dictBuilder

* Undo spliting fastcover_param into cover_param and f

* Remove convert param functions

* Assign f to parameter

* Add zdict.h to Makefile in lib

* Add cover.h to BUCK

* Cast 1 to U64 before shifting

* Remove trimming of zero freq head and tail in selectSegment and rebenchmark

* Remove f as a separate parameter of tryParam

* Read 8 bytes when d is 6

* Add trimming off zero frequency head and tail

* Use best functions from COVER and remove trimming part(which leads to worse compression ratio after previous bugs were fixed)

* Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary

* Change nbDmer to always read 8 bytes even when d=6

* Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER

* Update comments and benchmarking result

* Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover

* Add dictType enum and fix bug about passing zParam when converting to coverParam

* Combine finalize and skip into a single parameter

* Update acceleration parameters and benchmark on 3 sample sets

* Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets

* Initialize variables outside of for loop in benchmark.c

* Update benchmark result for hg-manifest

* Remove cover.h from install-includes

* Add explanation of f

* Set default compression level for trainFromBuffer to 3

* Add assertion of fastCoverParams in DiB_trainFromFiles

* Add checkTotalCompressedSize function + some minor fixes

* Add test for multithreading fastCovr

* Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unnlock to end of COVER_best_finish

* Free segmentFreqs

* Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment

* Add FASTCOVER_MEMMULT

* Minor fix

* Update benchmarking result
2018-08-23 12:06:20 -07:00
Yann Collet
42a02ab745 fixed minor warnings issued by scan-build 2018-08-15 14:36:02 -07:00
Jennifer Liu
8afcb8eea7 Update documentation 2018-07-01 19:59:37 -07:00
Nick Terrell
dab8cfa3c7 Combine definitions of SEC_TO_MICRO 2017-11-30 19:40:53 -08:00
Nick Terrell
9a2f6f477b Use util.h for timing 2017-11-30 14:57:25 -08:00
Yann Collet
18b795374a UTIL_getFileSize() returns UTIL_FILESIZE_UNKNOWN on failure
UTIL_getFileSize() used to return zero on failure.
This made it impossible to distinguish a failure from a genuine empty file.
Both cases where coalesced.

Adding UTIL_FILESIZE_UNKNOWN constant has many consequences on user code,
since in many places, the `0` was assumed to mean "error".
This is no longer the case, and the error code must be actively checked.
2017-10-17 16:14:25 -07:00
Yann Collet
1722055799 add comment on using -B# to split input file for dictionary training 2017-09-15 16:23:50 -07:00
Yann Collet
c68d17f2da ensures that sampleSizes table is large enough
as recommended by @terrelln
2017-09-15 15:31:31 -07:00
Yann Collet
25a60488dd fixed 64-to-32 conversion warnings 2017-09-15 11:55:13 -07:00
Yann Collet
a9694231ca fixed minor conversion warning 2017-09-15 10:16:26 -07:00
Yann Collet
086b9597d9 added ability to split input files for dictionary training
using command -B#
This is the same behavior as benchmark module,
which can also split input into arbitrary size blocks, using -B#.
2017-09-14 16:45:10 -07:00
Yann Collet
77c137b3ae minor comment refactor 2017-09-14 15:12:57 -07:00
Yann Collet
3128e03be6 updated license header
to clarify dual-license meaning as "or"
2017-09-08 00:09:23 -07:00
Yann Collet
32fb407c9d updated a bunch of headers
for the new license
2017-08-18 16:52:05 -07:00
Nick Terrell
5b7fd7c422 [zdict] Make COVER the default algorithm 2017-06-26 21:09:22 -07:00
Sean Purcell
42bac7fa84 Change ifndef's to undef's 2017-04-13 15:35:05 -07:00
Sean Purcell
f876f1200c Fix compilation on macOS 2017-04-13 12:33:45 -07:00
Sean Purcell
042ba122ae Change g_displayLevel to int and fix DISPLAYUPDATE flush 2017-03-23 11:21:59 -07:00
Nick Terrell
c220d4c74d Use COVER_MEMMULT when training with COVER. 2017-01-09 16:49:04 -08:00
Nick Terrell
3a1fefcf00 Simplify COVER parameters 2017-01-02 17:51:38 -08:00
Nick Terrell
df8415c502 Add COVER to the zstd cli 2017-01-02 14:43:08 -08:00
Przemyslaw Skibinski
7a8a03c20d util.h: restore BSD license for Facebook Open-Source 2016-12-21 15:08:44 +01:00
Przemyslaw Skibinski
e679741b18 _CRT_SECURE_NO_WARNINGS moved to util.h 2016-12-21 13:47:11 +01:00
Przemyslaw Skibinski
2f6ccee6af platform.h: removed Compiler Options 2016-12-21 13:23:34 +01:00
Przemyslaw Skibinski
16ae6563a2 executables use new util.h and platform.h 2016-12-21 09:06:14 +01:00
Przemyslaw Skibinski
f8046b8e72 Merge remote-tracking branch 'refs/remotes/facebook/dev' into v112
# Conflicts:
#	appveyor.yml
2016-12-19 08:20:26 +01:00
Yann Collet
1496c3dc47 Fix : size estimation when some samples are very large 2016-12-18 11:58:23 +01:00
Yann Collet
d46ecb58a5 added dll compilation tests 2016-12-17 16:28:12 +01:00
Przemyslaw Skibinski
b866e72826 tools use platform.h 2016-12-16 14:24:01 +01:00
Yann Collet
4ded9e591c added boilerplate 2016-08-30 11:06:28 -07:00
Yann Collet
49d105cfcf better warning and error messages in case of dictionary training failure (#292) 2016-08-18 15:02:11 +02:00
Yann Collet
dd25a27702 added tutorial warning messages for dictBuilder 2016-07-27 12:43:09 +02:00
Yann Collet
a3d03a3973 added <errno.h> dependency 2016-07-06 16:27:17 +02:00
Yann Collet
bcb5f77efa dictBuilder manages better samples of null size 0 and large size > 128 KB 2016-07-06 15:41:03 +02:00
Yann Collet
290aaa7521 Added : ability to manually select the dictionary ID of a newly created dictionary 2016-05-30 21:18:52 +02:00
inikep
3733797fcd bench.c: experimental -r (operate recursively on directories) for Windows and _POSIX_C_SOURCE >= 200112L 2016-05-10 14:22:55 +02:00
inikep
ed9a08538c Merge remote-tracking branch 'refs/remotes/Cyan4973/dev' into dev
# Conflicts:
#	lib/common/util.h
#	programs/paramgrill.c
#	visual/2013/fullbench/fullbench.vcxproj.filters
#	visual/2013/fuzzer/fuzzer.vcxproj.filters
2016-05-10 13:20:01 +02:00
Yann Collet
f6ca09b5ff Reduced console display on loading lots of files with zstd --train. Reported by @KrzysFR, see #177 2016-05-09 04:44:45 +02:00
inikep
13c8424ea0 code cleaning 2016-05-05 13:58:56 +02:00
inikep
9c22e57bfb Compiler Options moved to util.h 2016-05-05 11:53:42 +02:00
inikep
bab4317961 util.h must the the first include to #define _POSIX_C_SOURCE 2016-04-29 15:19:40 +02:00
inikep
55d047aa92 getTotalFileSize moved to common/util.h 2016-04-28 16:50:13 +02:00
inikep
d5ff2c3d9a ordering of #include 2016-04-28 14:40:45 +02:00
inikep
69fcd7c0ae getFileSize moved to common/util.h 2016-04-28 12:23:33 +02:00
inikep
23a0889301 separation of lib/ into common/, compress/, decompress/, dictBuilder/, legacy/ 2016-04-22 12:43:18 +02:00