/*
 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
 * All rights reserved.
 *
 * This source code is licensed under both the BSD-style license (found in the
 * LICENSE file in the root directory of this source tree) and the GPLv2 (found
 * in the COPYING file in the root directory of this source tree).
 * You may select, at your option, one of the above-listed licenses.
 */

#ifndef ZSTDMT_COMPRESS_H
#define ZSTDMT_COMPRESS_H

#if defined (__cplusplus)
extern "C" {
#endif

/* Note : This is an internal API.
 *        Some methods are still exposed (ZSTDLIB_API),
 *        because it used to be the only way to invoke MT compression.
 *        Now, it's recommended to use ZSTD_compress_generic() instead.
 *        These methods will stop being exposed in a future version */

/* ===   Dependencies   === */
#include <stddef.h>                /* size_t */
#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_parameters */
#include "zstd.h"                  /* ZSTD_inBuffer, ZSTD_outBuffer, ZSTDLIB_API */

/* ===   Memory management   === */
typedef struct ZSTDMT_CCtx_s ZSTDMT_CCtx;
ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbThreads);
ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx_advanced(unsigned nbThreads,
                                                    ZSTD_customMem cMem);
ZSTDLIB_API size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* mtctx);
ZSTDLIB_API size_t ZSTDMT_sizeof_CCtx(ZSTDMT_CCtx* mtctx);


/* ===   Simple buffer-to-buffer one-pass function   === */

ZSTDLIB_API size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx,
                                       void* dst, size_t dstCapacity,
                                 const void* src, size_t srcSize,
                                       int compressionLevel);
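
/* Usage sketch (illustrative only, compiled out) :
 * One possible way to call the one-pass function from a caller's .c file.
 * The helper name example_compressMT(), the thread count (2) and the
 * compression level (3) are assumptions made for this example;
 * they are not part of this API. */
#if 0
#include <stdlib.h>   /* malloc, free */

/* Hypothetical helper : compresses src into a newly allocated buffer.
 * @return : the buffer (to be released with free()), or NULL on failure.
 *           On success, *cSizePtr receives the compressed size. */
static void* example_compressMT(const void* src, size_t srcSize, size_t* cSizePtr)
{
    size_t const dstCapacity = ZSTD_compressBound(srcSize);
    void* const dst = malloc(dstCapacity);
    ZSTDMT_CCtx* const mtctx = ZSTDMT_createCCtx(2);
    size_t cSize;

    if (dst == NULL || mtctx == NULL) {
        free(dst);
        if (mtctx != NULL) ZSTDMT_freeCCtx(mtctx);
        return NULL;
    }

    cSize = ZSTDMT_compressCCtx(mtctx, dst, dstCapacity, src, srcSize, 3);
    ZSTDMT_freeCCtx(mtctx);

    if (ZSTD_isError(cSize)) { free(dst); return NULL; }
    *cSizePtr = cSize;
    return dst;
}
#endif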


/* ===   Streaming functions   === */

ZSTDLIB_API size_t ZSTDMT_initCStream(ZSTDMT_CCtx* mtctx, int compressionLevel);
ZSTDLIB_API size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize);  /**< if pledgedSrcSize is not known at reset time, use ZSTD_CONTENTSIZE_UNKNOWN. Note: for compatibility with older programs, 0 means the same as ZSTD_CONTENTSIZE_UNKNOWN, but it may change in the future, to mean "empty" */

ZSTDLIB_API size_t ZSTDMT_compressStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output, ZSTD_inBuffer* input);

ZSTDLIB_API size_t ZSTDMT_flushStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output);   /**< @return : 0 == all flushed; >0 : still some data to be flushed; or an error code (ZSTD_isError()) */
ZSTDLIB_API size_t ZSTDMT_endStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output);     /**< @return : 0 == all flushed; >0 : still some data to be flushed; or an error code (ZSTD_isError()) */
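
/* Usage sketch (illustrative only, compiled out) :
 * A minimal streaming session built from the functions above :
 * init, feed all input, then call ZSTDMT_endStream() until it returns 0.
 * The helper name example_streamMT() and the compression level (3) are
 * assumptions made for this example. The sketch writes into a single
 * destination buffer and gives up if that buffer fills up;
 * a real caller would drain `output` and continue instead. */
#if 0
/* Hypothetical helper.
 * @return : 0 on success (*cSizePtr receives the compressed size), 1 on failure. */
static int example_streamMT(ZSTDMT_CCtx* mtctx,
                            void* dst, size_t dstCapacity,
                            const void* src, size_t srcSize,
                            size_t* cSizePtr)
{
    ZSTD_inBuffer input = { src, srcSize, 0 };
    ZSTD_outBuffer output = { dst, dstCapacity, 0 };
    size_t remaining;

    if (ZSTD_isError(ZSTDMT_initCStream(mtctx, 3))) return 1;

    /* Feed input until it is entirely consumed */
    while (input.pos < input.size) {
        size_t const r = ZSTDMT_compressStream(mtctx, &output, &input);
        if (ZSTD_isError(r)) return 1;
        if (input.pos < input.size && output.pos == output.size) return 1;   /* dst too small */
    }

    /* Close the frame : 0 means everything has been flushed */
    do {
        remaining = ZSTDMT_endStream(mtctx, &output);
        if (ZSTD_isError(remaining)) return 1;
        if (remaining > 0 && output.pos == output.size) return 1;   /* dst too small */
    } while (remaining > 0);

    *cSizePtr = output.pos;
    return 0;
}
#endif
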
/* === Advanced functions and parameters === */

#ifndef ZSTDMT_JOBSIZE_MIN
#  define ZSTDMT_JOBSIZE_MIN (1U << 20)   /* 1 MB - Minimum size of each compression job */
#endif

ZSTDLIB_API size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx,
                                            void* dst, size_t dstCapacity,
                                      const void* src, size_t srcSize,
                                      const ZSTD_CDict* cdict,
                                            ZSTD_parameters const params,
                                            unsigned overlapLog);

ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx,
                                         const void* dict, size_t dictSize,   /* dict can be released after init, a local copy is preserved within mtctx */
                                               ZSTD_parameters params,
                                               unsigned long long pledgedSrcSize);  /* pledgedSrcSize is optional and can be zero == unknown */

ZSTDLIB_API size_t ZSTDMT_initCStream_usingCDict(ZSTDMT_CCtx* mtctx,
                                           const ZSTD_CDict* cdict,
                                                 ZSTD_frameParameters fparams,
                                                 unsigned long long pledgedSrcSize);  /* note : zero means empty */

/* ZSTDMT_parameter :
 * List of parameters that can be set using ZSTDMT_setMTCtxParameter() */
typedef enum {
    ZSTDMT_p_jobSize,           /* Size of each job; jobs are compressed in parallel. By default, this value is dynamically determined depending on compression parameters. Can be set explicitly here. */
    ZSTDMT_p_overlapSectionLog  /* Each job may reload a part of previous job to enhance compression ratio; 0 == no overlap, 6(default) == use 1/8th of window, >=9 == use full window */
} ZSTDMT_parameter;

/* ZSTDMT_setMTCtxParameter() :
 * Sets individual parameters, one at a time, from the list of enums defined in ZSTDMT_parameter.
 * The function must typically be called after ZSTDMT_createCCtx() but __before ZSTDMT_init*() !__
 * Parameters not explicitly reset by ZSTDMT_init*() remain the same in consecutive compression sessions.
 * @return : 0, or an error code (which can be tested using ZSTD_isError()) */
ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned value);
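
/* Usage sketch (illustrative only, compiled out) :
 * Shows the ordering described above : parameters are set on an existing
 * mtctx first, and the streaming session is initialized afterwards.
 * The helper name example_configureMT(), the 4 MB job size and the
 * compression level (3) are assumptions made for this example. */
#if 0
/* Hypothetical helper.
 * @return : 0, or an error code (which can be tested using ZSTD_isError()) */
static size_t example_configureMT(ZSTDMT_CCtx* mtctx)
{
    /* Explicit job size instead of the dynamically determined default */
    size_t const jobErr = ZSTDMT_setMTCtxParameter(mtctx, ZSTDMT_p_jobSize, 4U << 20);
    if (ZSTD_isError(jobErr)) return jobErr;

    /* 9 == each job reloads the full window of the previous job */
    {   size_t const ovErr = ZSTDMT_setMTCtxParameter(mtctx, ZSTDMT_p_overlapSectionLog, 9);
        if (ZSTD_isError(ovErr)) return ovErr;
    }

    /* Parameters are in place : the session can now be initialized */
    return ZSTDMT_initCStream(mtctx, 3);
}
#endif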


/*! ZSTDMT_compressStream_generic() :
 *  Combines ZSTDMT_compressStream() with ZSTDMT_flushStream() or ZSTDMT_endStream()
 *  depending on flush directive.
 * @return : minimum amount of data still to be flushed
 *           0 if fully flushed
 *           or an error code */
ZSTDLIB_API size_t ZSTDMT_compressStream_generic(ZSTDMT_CCtx* mtctx,
                                                ZSTD_outBuffer* output,
                                                ZSTD_inBuffer* input,
                                                ZSTD_EndDirective endOp);


/* ===   Private definitions; never ever use directly  === */

size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, ZSTDMT_parameter parameter, unsigned value);

/* ZSTDMT_CCtxParam_setNbThreads()
 * Set nbThreads, and clamp it correctly,
* also reset jobSize and overlapLog */
size_t ZSTDMT_CCtxParam_setNbThreads(ZSTD_CCtx_params* params, unsigned nbThreads);
/* ZSTDMT_getNbThreads():
 * @return nb threads currently active in mtctx.
 * mtctx must be valid */
size_t ZSTDMT_getNbThreads(const ZSTDMT_CCtx* mtctx);

/*! ZSTDMT_initCStream_internal() :
 *  Private use only. Init streaming operation.
 *  expects params to be valid.
 *  must receive dict, or cdict, or none, but not both.
 *  @return : 0, or an error code */
size_t ZSTDMT_initCStream_internal(ZSTDMT_CCtx* zcs,
                    const void* dict, size_t dictSize, ZSTD_dictMode_e dictMode,
                    const ZSTD_CDict* cdict,
                    ZSTD_CCtx_params params, unsigned long long pledgedSrcSize);


#if defined (__cplusplus)
}
#endif

#endif   /* ZSTDMT_COMPRESS_H */