Zstandard wrapper for zlib
================================

The main objective of creating a zstd wrapper for [zlib](https://zlib.net/) is to allow a quick and smooth transition to zstd for projects already using zlib.

#### Required files

To build the zstd wrapper for zlib, the following files are required:
- zlib.h
- a static or dynamic zlib library
- zlibWrapper/zstd_zlibwrapper.h
- zlibWrapper/zstd_zlibwrapper.c
- zlibWrapper/gz*.c files (gzclose.c, gzlib.c, gzread.c, gzwrite.c)
- zlibWrapper/gz*.h files (gzcompatibility.h, gzguts.h)
- a static or dynamic zstd library

The first two files are required by all projects using zlib and are not included with the zstd distribution.
The remaining files are supplied with the zstd distribution.


#### Embedding the zstd wrapper within your project

Let's assume that your project that uses zlib is compiled with:
```gcc project.o -lz```

To compile the zstd wrapper with your project, do the following:
- change every `#include "zlib.h"` to `#include "zstd_zlibwrapper.h"`
- compile your project with `zstd_zlibwrapper.c`, `gz*.c` and a static or dynamic zstd library

The linking should be changed to:
```gcc project.o zstd_zlibwrapper.o gz*.c -lz -lzstd```


#### Enabling zstd compression within your project

After embedding the zstd wrapper within your project, zstd compression is turned off by default.
Your project should work as before with zlib. There are two options to enable zstd compression:
- compiling with `-DZWRAP_USE_ZSTD=1` (or using `#define ZWRAP_USE_ZSTD 1` before `#include "zstd_zlibwrapper.h"`)
- calling the `void ZWRAP_useZSTDcompression(int turn_on)` function (declared in `zstd_zlibwrapper.h`)

During decompression, zlib and zstd streams are automatically detected and decompressed using the proper library.
This behavior can be changed using `ZWRAP_setDecompressionType(ZWRAP_FORCE_ZLIB)`, which makes zlib decompression slightly faster.


#### Example
We have taken the file `test/example.c` from [the zlib library distribution](https://zlib.net/) and copied it to [zlibWrapper/examples/example.c](examples/example.c).
After compilation and execution it shows the following results:
```
zlib version 1.2.8 = 0x1280, compile flags = 0x65
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!
large_inflate(): OK
after inflateSync(): hello, hello!
inflate with dictionary: hello, hello!
```
Then we changed `#include "zlib.h"` to `#include "zstd_zlibwrapper.h"`, compiled the [example.c](examples/example.c) file
with `-DZWRAP_USE_ZSTD=1` and linked it with the additional `zstd_zlibwrapper.o gz*.c -lzstd`.
We had to turn off the `test_flush` and `test_sync` functions, which use currently unsupported features.
After running, it shows the following results:
```
zlib version 1.2.8 = 0x1280, compile flags = 0x65
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!
large_inflate(): OK
inflate with dictionary: hello, hello!
```
The script used for compilation can be found at [zlibWrapper/Makefile](Makefile).


#### The measurement of performance of Zstandard wrapper for zlib

The zstd distribution contains a tool called `zwrapbench` which can measure the speed and ratio of zlib, zstd, and the wrapper.
The benchmark is conducted using the given filenames, or synthetic data if no filenames are provided.
The files are read into memory and processed independently.
This makes the benchmark more precise as it eliminates I/O overhead.
Multiple filenames can be supplied as separate parameters or with wildcards; names of directories can be used as parameters with the `-r` option.
One can select compression levels starting from `-b` and ending with `-e`. The `-i` parameter selects the minimal time used for each of the tested levels.
With the `-B` option, bigger files can be divided into smaller, independently compressed blocks.
The benchmark tool can be compiled with `make zwrapbench` using [zlibWrapper/Makefile](Makefile).


#### Improving speed of streaming compression

During streaming compression, the compressor never knows how much data it will have to compress.
Zstandard compression can be improved by providing the size of the source data to the compressor. By default, the streaming compressor assumes that the data is bigger than 256 KB, which can hurt compression speed on smaller data.
The zstd wrapper provides the `ZWRAP_setPledgedSrcSize()` function, which changes the pledged source size for a given compression stream.
The function changes zstd compression parameters, which may improve compression speed and/or ratio.
It should be called just after `deflateInit()` or `deflateReset()` and before `deflate()` or `deflateSetDictionary()`. The function is only helpful when data is compressed in blocks. There is no change when `deflateInit()` or `deflateReset()` is immediately followed by `deflate(strm, Z_FINISH)`,
as this case is automatically detected.


#### Reusing contexts

The ordinary zlib compression of two files/streams allocates two contexts:
- for the 1st file calls `deflateInit`, `deflate`, `...`, `deflate`, `deflateEnd`
- for the 2nd file calls `deflateInit`, `deflate`, `...`, `deflate`, `deflateEnd`

The speed of compression can be improved by reusing a single context with the following steps:
- initialize the context with `deflateInit`
- for the 1st file call `deflate`, `...`, `deflate`
- for the 2nd file call `deflateReset`, `deflate`, `...`, `deflate`
- free the context with `deflateEnd`

To check the difference, we ran experiments using `zwrapbench -ri6b6` with zstd and zlib compression (both at level 6).
The input data was the decompressed git repository downloaded from https://github.com/git/git/archive/master.zip, which contains 2979 files.
The table below shows that reusing contexts has a minor influence on zlib but gives an improvement for zstd.
In our example (the last 2 lines) it gives 4% better compression speed and 5% better decompression speed.

| Compression type                                  | Compression | Decompress.| Compr. size | Ratio |
| ------------------------------------------------- | ------------| -----------| ----------- | ----- |
| zlib 1.2.8                                        |  30.51 MB/s | 219.3 MB/s |     6819783 | 3.459 |
| zlib 1.2.8 not reusing a context                  |  30.22 MB/s | 218.1 MB/s |     6819783 | 3.459 |
| zlib 1.2.8 with zlibWrapper and reusing a context |  30.40 MB/s | 218.9 MB/s |     6819783 | 3.459 |
| zlib 1.2.8 with zlibWrapper not reusing a context |  30.28 MB/s | 218.1 MB/s |     6819783 | 3.459 |
| zstd 1.1.0 using ZSTD_CCtx                        |  68.35 MB/s | 430.9 MB/s |     6868521 | 3.435 |
| zstd 1.1.0 using ZSTD_CStream                     |  66.63 MB/s | 422.3 MB/s |     6868521 | 3.435 |
| zstd 1.1.0 with zlibWrapper and reusing a context |  54.01 MB/s | 403.2 MB/s |     6763482 | 3.488 |
| zstd 1.1.0 with zlibWrapper not reusing a context |  51.59 MB/s | 383.7 MB/s |     6763482 | 3.488 |


#### Compatibility issues
After enabling zstd compression, not all native zlib functions are supported. Unsupported methods put an error message into `strm->msg` and return `Z_STREAM_ERROR`.

Supported methods:
- deflateInit
- deflate (with the exception of Z_FULL_FLUSH, Z_BLOCK, and Z_TREES)
- deflateSetDictionary
- deflateEnd
- deflateReset
- deflateBound
- inflateInit
- inflate
- inflateSetDictionary
- inflateReset
- inflateReset2
- compress
- compress2
- compressBound
- uncompress
- gzip file access functions

Ignored methods (they do nothing):
- deflateParams

Unsupported methods:
- deflateCopy
- deflateTune
- deflatePending
- deflatePrime
- deflateSetHeader
- inflateGetDictionary
- inflateCopy
- inflateSync
- inflatePrime
- inflateMark
- inflateGetHeader
- inflateBackInit
- inflateBack
- inflateBackEnd