Programs and scripts for automated testing of Zstandard
=======================================================

This directory contains the following programs and scripts:
- `datagen` : Synthetic, parameterizable data generator, for tests
- `fullbench` : Precisely measures the speed of each zstd inner function
- `fuzzer` : Test tool, to check zstd integrity on target platform
- `paramgrill` : parameter tester for zstd
- `test-zstd-speed.py` : script for testing zstd speed difference between commits
- `test-zstd-versions.py` : compatibility test between zstd versions stored on GitHub (v0.1+)
- `zstreamtest` : Fuzzer test tool for zstd streaming API
- `legacy` : Test tool to test decoding of legacy zstd frames
- `decodecorpus` : Tool to generate valid Zstandard frames, for verifying decoder implementations


#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory into which the zstd repository is cloned.
Then all tagged (released) versions of zstd are compiled.
In the following step, interoperability between zstd versions is checked.

#### `automated_benchmarking.py` - script for benchmarking zstd PRs against dev

This script benchmarks facebook:dev and the changes from pull requests made to zstd, and compares
them against facebook:dev to detect regressions. It currently runs on a dedicated
desktop machine for every pull request made to the zstd repo, but it can also
be run on any machine via the command line interface.

There are three modes of usage for this script:
- `fastmode` runs a minimal single build comparison (between facebook:dev and facebook:release)
- `onetime` pulls all the current pull requests from the zstd repo and compares facebook:dev to each of them once
- `continuous` continuously gets pull requests from the zstd repo and runs benchmarks against facebook:dev

```
Example usage: python automated_benchmarking.py
```

```
usage: automated_benchmarking.py [-h] [--directory DIRECTORY]
                                 [--levels LEVELS] [--iterations ITERATIONS]
                                 [--emails EMAILS] [--frequency FREQUENCY]
                                 [--mode MODE] [--dict DICT]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with files to benchmark
  --levels LEVELS       levels to test e.g. ('1,2,3')
  --iterations ITERATIONS
                        number of benchmark iterations to run
  --emails EMAILS       email addresses of people who will be alerted upon
                        regression. Only for continuous mode
  --frequency FREQUENCY
                        specifies the number of seconds to wait before each
                        successive check for new PRs in continuous mode
  --mode MODE           'fastmode', 'onetime', 'current', or 'continuous' (see
                        README.md for details)
  --dict DICT           filename of dictionary to use (when set, this
                        dictionary will be used to compress the files provided
                        inside --directory)
```

#### `test-zstd-speed.py` - script for testing zstd speed difference between commits

DEPRECATED

This script creates a `speedTest` directory into which the zstd repository is cloned.
Then it compiles all branches of zstd and performs a speed benchmark for a given list of files (the `testFileNames` parameter).
After `sleepTime` seconds (an optional parameter, default 300), the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark for this commit is performed.
The results of the speed benchmark are compared to the previous results.
If the compression or decompression speed for one of the zstd levels is lower than `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also lower than `lowerLimit`, a warning e-mail is sent to the recipients from the list (the `emails` parameter).
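
The compare, re-run, then warn logic described above can be sketched roughly as follows. This is not the script's actual code: the names `slow_levels`, `should_warn`, and `run_benchmark` are hypothetical, and the sketch assumes that `lowerLimit` is applied as a ratio of the new speed to the previous speed for each compression level.

```python
from typing import Callable, Dict, List

# Hypothetical representation: compression level -> measured speed (e.g. MB/s).
Speeds = Dict[int, float]


def slow_levels(previous: Speeds, current: Speeds, lower_limit: float = 0.98) -> List[int]:
    """Return the levels whose current/previous speed ratio is below lower_limit.

    Assumption: lowerLimit is a ratio relative to the previous result.
    """
    return sorted(
        level
        for level, prev in previous.items()
        if level in current and prev > 0 and current[level] / prev < lower_limit
    )


def should_warn(previous: Speeds,
                run_benchmark: Callable[[], Speeds],
                lower_limit: float = 0.98) -> bool:
    """Benchmark once; on a slowdown, benchmark again and warn only if it repeats."""
    if not slow_levels(previous, run_benchmark(), lower_limit):
        return False  # first run is fast enough, no second run needed
    return bool(slow_levels(previous, run_benchmark(), lower_limit))
```

Re-running before warning filters out one-off slowdowns caused by noise on the test machine, which is also why the remarks below recommend a "stable" system.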

Additional remarks:
- To be sure that speed results are accurate, the script should be run on a "stable" target system with no other jobs running in parallel
- Using the script with virtual machines can lead to large variations in speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available, it sends e-mails without attachments using `mail`; if neither is available, it only prints a warning


Example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "[email protected]" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing
Command line tool to generate test .zst files.

This tool will generate .zst files with checksums,
as well as optionally output the corresponding correct uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
will generate 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
will choose a random seed, and for 1 minute,
generate random test frames and ensure that the
zstd library correctly decompresses them in both simple and streaming modes.

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters on file given constraints

Full list of arguments
```
 -T#         : set level 1 speed objective
 -B#         : cut input into blocks of size # (default : single block)
 -S          : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd=     : Single run, parameter selection syntax same as zstdcli with more parameters
                   (Added forceAttachDictionary / fadt)
                   When invoked with --optimize, this represents the sample to exceed.
 --optimize= : find parameters to maximize compression ratio given parameters
                   Can use all --zstd= commands to constrain the type of solution found in addition to the following constraints
    cSpeed=    : Minimum compression speed
    dSpeed=    : Minimum decompression speed
    cMem=      : Maximum compression memory
    lvl=       : Searches for solutions which are strictly better than that compression lvl in ratio and cSpeed,
    stc=       : When invoked with lvl=, represents percentage slack in ratio/cSpeed allowed for a solution to be considered (Default 100%)
               : In normal operation, represents percentage slack in choosing viable starting strategy selection in choosing the default parameters
                   (Lower value will begin with stronger strategies) (Default 90%)
    speedRatio=   (accepts decimals)
               : determines value of gains in speed vs gains in ratio
                   when determining overall winner (default 5 (1% ratio = 5% speed)).
    tries=     : Maximum number of random restarts on a single strategy before switching (Default 5)
                   Higher values will make optimizer run longer, more chances to find better solution.
    memLog     : Limits the log of the size of each memotable (1 per strategy). Will use hash tables when state space is larger than max size.
                   Setting memLog = 0 turns off memoization
 --display=  : specify which parameters are included in the output
                   can use all --zstd parameter names and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
                   (Default: display all params available)
 -P#         : generated sample compressibility (when no file is provided)
 -t#         : Caps runtime of operation in seconds (default: 99999 seconds (about 27 hours))
 -v          : Prints Benchmarking output
 -D          : Next argument dictionary file
 -s          : Benchmark all files separately
 -q          : Quiet, repeat for more quiet
                 -q Prints parameters + results whenever a new best is found
                 -qq Only prints parameters whenever a new best is found, prints final parameters + results
                 -qqq Only print final parameters + results
                 -qqqq Only prints final parameter set in the form --zstd=
 -v          : Verbose, cancels quiet, repeat for more volume
                 -v Prints all candidate parameters and results

```
 Any inputs afterwards are treated as files to benchmark.
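
When scripting single runs, the `-S` argument can be assembled from named parameters. The helper below is a small sketch and not part of `paramgrill`: `single_run_arg` and `SINGLE_RUN_LETTERS` are hypothetical names, the letter mapping is copied from the `-S` entry in the list above, and it assumes the letter/value pairs are simply concatenated as in the `-Sl3w10h12` example.

```python
# Letter for each parameter, as listed under -S in the argument list above.
SINGLE_RUN_LETTERS = {
    "windowLog": "w",
    "hashLog": "h",
    "chainLog": "c",
    "searchLog": "s",
    "minMatch": "l",
    "targetLength": "t",
    "strategy": "S",
    "level": "L",
}


def single_run_arg(**params: int) -> str:
    """Build a -S argument such as -Sl3w10h12 from named parameters.

    Assumption: pairs are concatenated in the order given, with no separators.
    """
    parts = []
    for name, value in params.items():
        if name not in SINGLE_RUN_LETTERS:
            raise ValueError(f"unknown parameter: {name}")
        parts.append(f"{SINGLE_RUN_LETTERS[name]}{int(value)}")
    return "-S" + "".join(parts)


print(single_run_arg(minMatch=3, windowLog=10, hashLog=12))  # prints -Sl3w10h12
```

Rejecting unknown names early avoids passing a malformed argument to the tool.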