Commit Graph

462 Commits

Author SHA1 Message Date
Stella Lau
273c17b350 Experiment with 64-bit hash and checksum 2017-07-24 10:19:50 -07:00
Paul Cruz
483d936b87 reduced competition for completion mutex by separating mutex use based on which values is updated 2017-07-23 14:09:16 -07:00
Paul Cruz
880f08d104 change how completion is measured in compression thread 2017-07-23 10:18:54 -07:00
Paul Cruz
08d9e42ec6 removed useless measurements 2017-07-21 18:02:55 -07:00
Paul Cruz
95bef759b3 switched over to model where reading only waits on compression thread 2017-07-21 17:49:39 -07:00
Paul Cruz
6455ec482c taking the maximum of the completion level reads in order to determine which one was waiting more 2017-07-21 16:05:01 -07:00
Paul Cruz
05fe8dd47c updating debug statements 2017-07-21 14:06:24 -07:00
Paul Cruz
db109f8fef measure multiple completion levels during each wait 2017-07-21 13:38:24 -07:00
Paul Cruz
721c6a8b97 added bounding to compression level change 2017-07-21 09:30:24 -07:00
Paul Cruz
e929d3b787 added priority decision making for adapt compression level 2017-07-21 09:26:35 -07:00
Paul Cruz
9259c7afa4 semi working version that stabilizes 2017-07-20 18:45:33 -07:00
Paul Cruz
82e488770c fixed bug where writeSize could be zero 2017-07-20 16:38:02 -07:00
Paul Cruz
a19916425d reworked adaptCompressionLevel to only account for completion information 2017-07-20 16:19:16 -07:00
Nick Terrell
7d3ac0710d [linux] Update patches for v3 2017-07-20 13:33:55 -07:00
Paul Cruz
7ab758a640 changed how completion is actually sampled 2017-07-20 10:53:51 -07:00
Stella Lau
13a01ffb27 Fix off-by-one in size calculations 2017-07-19 17:24:09 -07:00
Stella Lau
2427a154cb Minor refactoring 2017-07-19 16:56:28 -07:00
Paul Cruz
dcf609f835 make adaptCompressionLevel oscillate less 2017-07-19 16:36:33 -07:00
Paul Cruz
2a22c7915e call ZSTD_compressBegin() once 2017-07-19 16:00:54 -07:00
Paul Cruz
6767abe652 fixing error when file size is multiple of job size (in which case, the srcSize of the last job is 0) 2017-07-19 14:54:15 -07:00
Stella Lau
030264ca51 Experiment with integrating ZSTD_count with findBestMatch 2017-07-19 14:14:26 -07:00
Paul Cruz
42382c1216 added some debug statements, adjusted end condition 2017-07-19 13:30:07 -07:00
Paul Cruz
5a85c57e30 set up new calculations compression completion progress 2017-07-19 11:47:17 -07:00
Paul Cruz
f1ac518b59 split compression into smaller blocks 2017-07-19 11:23:40 -07:00
Paul Cruz
338951cd48 moved compression adapt to avoid warning 2017-07-19 10:23:46 -07:00
Paul Cruz
4497ecf297 change compression level only right before actually performing compression. When waiting, only update waiting statistics. 2017-07-19 10:14:00 -07:00
Paul Cruz
e11bf55d0b added mechanism for measuring how much of a job has been created 2017-07-19 10:10:47 -07:00
Paul Cruz
559ea4ff25 split up read process into smaller chunks 2017-07-19 09:59:17 -07:00
Paul Cruz
6119cd2164 added additional print for help menu 2017-07-19 09:43:17 -07:00
Stella Lau
4352e09cb0 Avoid recounting match lengths with ZSTD_count 2017-07-18 18:35:25 -07:00
Stella Lau
1fa223859f Switch to using ZSTD_count instead of function pointer 2017-07-18 18:05:10 -07:00
Paul Cruz
3d7f1afadd changed createCCtx() to split into initialization and creation 2017-07-18 17:32:36 -07:00
Paul Cruz
2c4e4ddc50 added mutex for stats struct 2017-07-18 15:55:58 -07:00
Paul Cruz
ad66faf16a added progress check for filewriting, put important shared data behind mutex when being read from/written to 2017-07-18 15:23:11 -07:00
Stella Lau
19258f51c1 Make the meaning of LDM_MEMORY_USAGE consistent across tables 2017-07-18 14:25:39 -07:00
Paul Cruz
a34bc30237 setting up basic readme 2017-07-18 13:31:02 -07:00
Paul Cruz
29c36cf051 rename completion variable, split up fwrite operations in order to track progress 2017-07-18 13:30:29 -07:00
Paul Cruz
ae47eab2fd changed test cases to use -s setting on the diffs 2017-07-18 12:58:50 -07:00
Stella Lau
fc41a87964 Experiment with using a lag when hashing 2017-07-17 18:13:09 -07:00
Paul Cruz
5af04c57b0 change parameters for compression level adapt 2017-07-17 17:59:50 -07:00
Paul Cruz
b3c9e02bb6 added signal to other threads whenever error occurs 2017-07-17 15:34:58 -07:00
Stella Lau
a00e406231 Remove version archive 2017-07-17 15:17:32 -07:00
Stella Lau
15a041adbf Add function to get valid entries only from table 2017-07-17 15:16:58 -07:00
Paul Cruz
6be22f1f84 swap buffers instead of copying memory over 2017-07-17 14:39:10 -07:00
Paul Cruz
708238e07e open file outside of adaptCCtx, pass to the output thread 2017-07-17 14:01:13 -07:00
Stella Lau
4bb42b02c1 Add basic chaining table 2017-07-17 11:53:54 -07:00
Paul Cruz
044e40db5a removed freeCCtx() calls from createCCtx() so that it is not called twice during errors 2017-07-17 11:19:23 -07:00
Paul Cruz
50ce4eaeb6 added error detection for pthread initialization, added compression completion measurement, fixed const values 2017-07-17 10:12:44 -07:00
Stella Lau
ca300ce6e0 Decouple hash table from compression function 2017-07-14 17:17:00 -07:00
Paul Cruz
1ab3f06f00 updated tests to use different seeds when executing different tests 2017-07-14 16:29:29 -07:00
Stella Lau
6e443b4960 Move hash table access for own functions 2017-07-14 14:27:55 -07:00
Stella Lau
2d8e6c6608 Add more statistics 2017-07-14 12:31:01 -07:00
Stella Lau
55f960e8db Add percentages to offset histogram 2017-07-14 11:00:20 -07:00
Stella Lau
4db7f12ef3 Add offset histogram 2017-07-14 10:52:03 -07:00
Paul Cruz
0c8b9436b7 removed goto statements for the most part 2017-07-13 16:38:20 -07:00
Stella Lau
175a6c6029 [ldm] Minor refactoring 2017-07-13 16:16:31 -07:00
Stella Lau
361c06df75 Add min/max offset to stats 2017-07-13 15:29:41 -07:00
Paul Cruz
65a4ce2635 added tests for forced compression level 2017-07-13 14:57:24 -07:00
Paul Cruz
0d9665cef5 added additional tests for performance, allowed force compression level for testing purposes 2017-07-13 14:46:54 -07:00
Stella Lau
2b3c7e4199 [ldm] Make some functions shared 2017-07-13 14:39:35 -07:00
Paul Cruz
9165e97fc6 added some tests for correctness, time, and compression ratio 2017-07-13 13:50:23 -07:00
Stella Lau
9306feb8fa [ldm] Switch to using lib/common/mem.h and move typedefs to ldm.h 2017-07-13 13:44:48 -07:00
Stella Lau
50421d9474 [ldm] Remove old main files 2017-07-13 11:45:00 -07:00
Stella Lau
68c4560701 [ldm] Add TODO and comment for segfaulting in compress function 2017-07-13 10:38:19 -07:00
Paul Cruz
766663f1f1 added altering dictionary size depending on compression level 2017-07-13 10:15:27 -07:00
Stella Lau
92bed4a7e0 [ldm] Add CHAR_OFFSET in hash function and extend header size 2017-07-12 18:47:26 -07:00
Paul Cruz
7c886db0a8 changed to stderr 2017-07-12 17:28:53 -07:00
Paul Cruz
b5b18cf664 changed to malloc, added comment about adaptive compression level, and changed ternary operators 2017-07-12 17:10:58 -07:00
Paul Cruz
954d999abf fixed up freeCCtx() removed BYTE since it wasn't being used 2017-07-12 16:50:43 -07:00
Paul Cruz
3c16edd26a added copyright header, removed clean from makefile 2017-07-12 16:40:24 -07:00
Stella Lau
8de82b6eb0 [ldm] Clean up versions 2017-07-12 16:31:31 -07:00
Paul Cruz
74d3a6f5ae passes tests with adaptive compression level 2017-07-12 16:18:41 -07:00
Paul Cruz
5353d350ae working with fixed compression level and fixed dictionary size 2017-07-12 16:02:20 -07:00
Stella Lau
8ff8cdb15b [ldm] Clean up code 2017-07-12 15:12:07 -07:00
Paul Cruz
356ddb649f working with flush job->src.size and fixed cLevel 2017-07-12 12:21:21 -07:00
Stella Lau
3a48ffd4fd Fix sumToHash to use hash space more efficiently 2017-07-12 10:53:19 -07:00
Stella Lau
e0d4162464 Minor fix for non-rolling hash 2017-07-12 09:50:24 -07:00
Stella Lau
50502519fb Switch to using rolling hash only 2017-07-12 09:47:00 -07:00
Stella Lau
583dda17a8 Update rolling hash 2017-07-11 18:13:26 -07:00
Paul Cruz
0a401852c4 added debug statement 2017-07-11 16:50:50 -07:00
Paul Cruz
72a183efad changed dictionary size, added debugging statements 2017-07-11 15:49:52 -07:00
Paul Cruz
7c54e09347 updated DEBUG statements 2017-07-11 15:15:41 -07:00
Paul Cruz
a3c077b8c6 added error message, updated copying dictionary into the input buffer 2017-07-11 15:00:52 -07:00
Paul Cruz
34afb9b23e changed to using ZSTD_compressBegin_usingDict() and fixed strange issue with ZSTD_compressContinue() 2017-07-11 11:50:00 -07:00
Paul Cruz
7ec5928626 fixed an error where -c argument wasn't working for single files 2017-07-11 10:23:25 -07:00
Stella Lau
f6c5d07fe2 Save v3 2017-07-11 09:23:44 -07:00
Stella Lau
6c3673f4c3 Add rolling hash 2017-07-10 22:27:43 -07:00
Paul Cruz
f918545491 made some progress on improving compression ratio, but problems exist with speed limits, and for some reason higher compression levels are really slow 2017-07-10 18:16:42 -07:00
Paul Cruz
01fc7c4244 changed how the detection of the last job works 2017-07-10 16:27:58 -07:00
Paul Cruz
c36552ef8a dst buffer should use ZSTD_compressBound to determine how much space it needs 2017-07-10 16:10:19 -07:00
Paul Cruz
7aa36df6df fixed memory leak that was happening when creating jobs 2017-07-10 16:03:09 -07:00
Stella Lau
ef2b728316 Clean up and refactor compress function 2017-07-10 15:48:47 -07:00
Paul Cruz
e410d63d45 made input buffer an internal part of the compression context 2017-07-10 15:37:14 -07:00
Stella Lau
e4155b11d7 Add warning flags to makefile and clean up code to remove warnings 2017-07-10 13:08:19 -07:00
Stella Lau
10a71d9f1c Add compression context 2017-07-10 12:38:27 -07:00
Paul Cruz
cc7f8e4d71 small changes 2017-07-10 11:10:11 -07:00
Paul Cruz
7e09b508ff changed name 2017-07-10 11:05:37 -07:00
Paul Cruz
ed72ea5438 removed single from Makefile 2017-07-10 10:58:03 -07:00
Paul Cruz
ced3ec5714 removed scripts 2017-07-10 10:53:02 -07:00
Paul Cruz
82f0d64bee removed single.c 2017-07-10 10:51:50 -07:00
Paul Cruz
62ebbabd32 updated error checking in each thread 2017-07-10 09:36:22 -07:00
Stella Lau
ae9cf235d6 Add LDM_DCtx 2017-07-10 07:38:09 -07:00
Stella Lau
5432214ee3 Minor refactoring 2017-07-10 06:50:49 -07:00
Stella Lau
b94b468e84 Merge branch 'ldm' of https://github.com/stellamplau/zstd into ldm 2017-07-10 06:32:46 -07:00
Stella Lau
474e06ac5b Minor refactoring 2017-07-10 06:32:29 -07:00
Stella Lau
eb280cd568 Add folder for old versions 2017-07-10 06:32:05 -07:00
Stella Lau
719ccdc5a5 Update mainfile 2017-07-09 22:45:54 -07:00
Stella Lau
acdeb9f302 Add compression statistics 2017-07-07 17:09:28 -07:00
Paul Cruz
c3ae23d459 added ability to compress without specifying out filename 2017-07-07 17:07:05 -07:00
Paul Cruz
7163ffafde playing around with adapt param 2017-07-07 15:56:00 -07:00
Paul Cruz
1c9d6b2c6b rewrote time elapsed with UTIL 2017-07-07 15:42:20 -07:00
Paul Cruz
c0c236a28b changed to using compressCCtx 2017-07-07 15:13:40 -07:00
Stella Lau
4076be09ec [ldm] Update to hash every position 2017-07-07 14:52:40 -07:00
Stella Lau
7945f9ee47 Fix offset overflow bug 2017-07-07 14:14:01 -07:00
Paul Cruz
11fc0f4119 changed completed -> compressed 2017-07-07 13:55:38 -07:00
Paul Cruz
09d7c6a994 changed completed variables to compressed for clarity 2017-07-07 13:18:55 -07:00
Stella Lau
f791fc27e3 Add header with compress and decompress size 2017-07-07 12:44:29 -07:00
Paul Cruz
8c0eb62920 removed unnecessary comments, uncommented DEBUGLOG for later use 2017-07-07 11:47:16 -07:00
Paul Cruz
70a4153bd3 added ability to force output to stdout, wrote an additional test for this functionality 2017-07-07 11:32:14 -07:00
Paul Cruz
532f439961 cleaned up code for arguments a bit 2017-07-07 10:58:43 -07:00
Paul Cruz
f7e6b358d0 added tests that check to ensure stdout is working 2017-07-07 10:29:06 -07:00
Paul Cruz
4679132f59 updated avg compression rate, also hiding progress bar behind a flag now 2017-07-07 10:25:38 -07:00
Paul Cruz
00bc5df4e0 added compression rate to status bar 2017-07-07 09:35:39 -07:00
Paul Cruz
f351848b76 added data amount 2017-07-06 20:40:00 -07:00
Paul Cruz
2939301023 fixed problem with progress bar not persisting, added time elapsed 2017-07-06 20:30:20 -07:00
Paul Cruz
57ec0232a8 added help menu 2017-07-06 18:09:10 -07:00
Paul Cruz
b6cc084716 added really simple progress update in the corner 2017-07-06 17:48:18 -07:00
Stella Lau
3bbfa1249e Update compressor and decompressor 2017-07-06 16:47:08 -07:00
Paul Cruz
ff9f2cd057 added some basic logic for altering compression level 2017-07-06 16:06:53 -07:00
Stella Lau
b96ad327a4 Add simple compress and decompress functions 2017-07-06 15:23:15 -07:00
Paul Cruz
a407ccc215 added ability to congregate statistics into single print statement rather than using debug 2017-07-06 13:09:17 -07:00
Igor Vuk
e6e25c9507 Fix typos in README.md 2017-07-06 20:43:14 +02:00
Paul Cruz
f57849b9c6 added ability to set initial compression level 2017-07-06 11:05:51 -07:00
Paul Cruz
592a0d9495 changed to work with std out 2017-07-06 10:49:26 -07:00
Paul Cruz
94fe291b83 small changes 2017-07-06 10:29:16 -07:00
Stella Lau
8aa34a7608 Switch to mmapping files 2017-07-06 07:30:49 -07:00
Paul Cruz
79d4657ce5 small changes 2017-07-05 17:44:36 -07:00
Paul Cruz
6f3ad1b22e fixed the problem with pipeline tests by changing how jobs move through the threads 2017-07-05 17:24:21 -07:00
Paul Cruz
cc714f3bd3 added print statements and debuglog 2017-07-05 16:54:34 -07:00
Paul Cruz
3f52ca94bf added more tests, changed makefile 2017-07-05 14:36:09 -07:00
Paul Cruz
faeb6e0b1b added filenameTable for multiple files 2017-07-05 14:19:56 -07:00
Stella Lau
88f3d8641e Initial long distance matcher commit 2017-07-05 13:57:07 -07:00
Paul Cruz
f0b9a153f3 added tests to run.sh 2017-07-05 13:23:34 -07:00
Paul Cruz
b42108386a added some basic parsing for args 2017-07-05 12:20:16 -07:00
Paul Cruz
898c1a5b46 removed references to file size computation and file size function 2017-07-05 11:54:21 -07:00
Paul Cruz
a2680e5b96 removed calculation of file size and replaced with limited number of available jobs 2017-07-05 11:52:55 -07:00
Paul Cruz
dd8a591d5d moved main logic for job creation into a separate function 2017-07-05 10:48:04 -07:00
Paul Cruz
9ccd55f3a8 free ctx fields when error occurs during creation 2017-07-05 10:20:56 -07:00
Paul Cruz
5df4cb0530 renamed files 2017-07-05 09:57:50 -07:00
Paul Cruz
c9f49198b8 fixed TODOs 2017-07-05 09:49:27 -07:00