Programs and scripts for automated testing of Zstandard
=======================================================

This directory contains the following programs and scripts:
- `datagen` : Synthetic and parametrizable data generator, for tests
- `fullbench` : Precisely measures the speed of each zstd inner function
- `fuzzer` : Test tool, to check zstd integrity on target platform
- `paramgrill` : parameter tester for zstd
- `test-zstd-speed.py` : script for testing zstd speed difference between commits
- `test-zstd-versions.py` : compatibility test between zstd versions stored on GitHub (v0.1+)
- `zstreamtest` : Fuzzer test tool for zstd streaming API
- `legacy` : Test tool to test decoding of legacy zstd frames
- `decodecorpus` : Tool to generate valid Zstandard frames, for verifying decoder implementations


#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory into which the zstd repository is cloned.
Then all tagged (released) versions of zstd are compiled.
In the following step, interoperability between zstd versions is checked.

#### `automated_benchmarking.py` - script for benchmarking zstd PRs to dev

This script benchmarks facebook:dev and changes from pull requests made to zstd and compares
them against facebook:dev to detect regressions.
This script currently runs on a dedicated
desktop machine for every pull request that is made to the zstd repo but can also
be run on any machine via the command line interface.

There are three modes of usage for this script:
- `fastmode` runs a minimal single build comparison (between facebook:dev and facebook:release)
- `onetime` pulls all the current pull requests from the zstd repo and compares facebook:dev to each of them once
- `continuous` continuously gets pull requests from the zstd repo and runs benchmarks against facebook:dev

```
Example usage: python automated_benchmarking.py
```

```
usage: automated_benchmarking.py [-h] [--directory DIRECTORY]
                                 [--levels LEVELS] [--iterations ITERATIONS]
                                 [--emails EMAILS] [--frequency FREQUENCY]
                                 [--mode MODE] [--dict DICT]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with files to benchmark
  --levels LEVELS       levels to test e.g. ('1,2,3')
  --iterations ITERATIONS
                        number of benchmark iterations to run
  --emails EMAILS       email addresses of people who will be alerted upon
                        regression. Only for continuous mode
  --frequency FREQUENCY
                        specifies the number of seconds to wait before each
                        successive check for new PRs in continuous mode
  --mode MODE           'fastmode', 'onetime', 'current', or 'continuous' (see
                        README.md for details)
  --dict DICT           filename of dictionary to use (when set, this
                        dictionary will be used to compress the files provided
                        inside --directory)
```

#### `test-zstd-speed.py` - script for testing zstd speed difference between commits

DEPRECATED

This script creates a `speedTest` directory into which the zstd repository is cloned.
Then it compiles all branches of zstd and performs a speed benchmark for a given list of files (the `testFileNames` parameter).
After `sleepTime` seconds (an optional parameter, default 300), the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark for this commit is performed.
The results of the speed benchmark are compared to the previous results.
If compression or decompression speed for one of the zstd levels is lower than `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also lower than `lowerLimit`, a warning e-mail is sent to the recipients on the list (the `emails` parameter).
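
The "benchmark, re-run, then warn" rule described above can be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes `lowerLimit` is a ratio applied to the previous speed, and `is_regression`, `should_send_warning` and `run_benchmark` are hypothetical names.

```python
def is_regression(previous_speed, current_speed, lower_limit=0.98):
    """Return True when the current speed is below lower_limit * previous speed.

    Assumption: lowerLimit is interpreted as a ratio of the previous result.
    """
    return current_speed < previous_speed * lower_limit


def should_send_warning(previous_speed, run_benchmark, lower_limit=0.98):
    """Benchmark once; if the result looks slow, benchmark again before alerting.

    run_benchmark is a hypothetical zero-argument callable returning a speed.
    A warning is warranted only when both runs fall below the limit.
    """
    first = run_benchmark()
    if not is_regression(previous_speed, first, lower_limit):
        return False
    second = run_benchmark()
    return is_regression(previous_speed, second, lower_limit)
```

Requiring two consecutive slow runs filters out one-off noise, which matters on machines where load varies between measurements.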

Additional remarks:
- To be sure that speed results are accurate, the script should be run on a "stable" target system with no other jobs running in parallel
- Using the script with virtual machines can lead to large variations in speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available, it sends e-mails without attachments using `mail`; if neither is available, it only prints a warning


Example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "[email protected]" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing
Command line tool to generate test .zst files.

This tool generates .zst files with checksums,
and can optionally output the corresponding correct uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
will generate 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
will choose a random seed and, for 1 minute,
generate random test frames and ensure that the
zstd library correctly decompresses them in both simple and streaming modes.
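
When checking another decoder against the generated corpus, the comparison with the original files could be scripted along these lines. This is a sketch under stated assumptions: it assumes each original file sits next to its frame under the same name without the `.zst` suffix (verify this against the actual output), and `find_mismatches` and `zstd_cli_decompress` are hypothetical helpers. The default decompressor calls the `zstd` command line tool; pass the decoder under test as `decompress` instead.

```python
import subprocess
from pathlib import Path


def zstd_cli_decompress(path):
    """Decompress one file to memory using the zstd command line tool."""
    result = subprocess.run(
        ["zstd", "-d", "-c", str(path)], capture_output=True, check=True
    )
    return result.stdout


def find_mismatches(directory, decompress=zstd_cli_decompress):
    """Return the names of .zst files whose decoded bytes differ from the original.

    Assumption: the original for "name.zst" is stored as "name" in the same
    directory; adjust the pairing if the layout differs.
    """
    mismatches = []
    for frame in sorted(Path(directory).glob("*.zst")):
        original = frame.with_suffix("")  # drop the trailing ".zst"
        if decompress(frame) != original.read_bytes():
            mismatches.append(frame.name)
    return mismatches
```

For the example above, `find_mismatches("testfiles")` would return an empty list if every frame decodes to its original.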

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters on file given constraints

Full list of arguments
```
 -T#          : set level 1 speed objective
 -B#          : cut input into blocks of size # (default : single block)
 -S           : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd=      : Single run, parameter selection syntax same as zstdcli with more parameters
                    (Added forceAttachDictionary / fadt)
                    When invoked with --optimize, this represents the sample to exceed.
 --optimize=  : find parameters to maximize compression ratio given parameters
                    Can use all --zstd= commands to constrain the type of solution found in addition to the following constraints
    cSpeed=   : Minimum compression speed
    dSpeed=   : Minimum decompression speed
    cMem=     : Maximum compression memory
    lvl=      : Searches for solutions which are strictly better than that compression lvl in ratio and cSpeed,
    stc=      : When invoked with lvl=, represents percentage slack in ratio/cSpeed allowed for a solution to be considered (Default 100%)
              : In normal operation, represents percentage slack in choosing viable starting strategy selection in choosing the default parameters
                    (Lower value will begin with stronger strategies) (Default 90%)
    speedRatio=   (accepts decimals)
              : determines value of gains in speed vs gains in ratio
                    when determining overall winner (default 5 (1% ratio = 5% speed)).
    tries=    : Maximum number of random restarts on a single strategy before switching (Default 5)
                    Higher values will make optimizer run longer, more chances to find better solution.
    memLog    : Limits the log of the size of each memotable (1 per strategy). Will use hash tables when state space is larger than max size.
                    Setting memLog = 0 turns off memoization
 --display=   : specify which parameters are included in the output
                    can use all --zstd parameter names and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
                    (Default: display all params available)
 -P#          : generated sample compressibility (when no file is provided)
 -t#          : Caps runtime of operation in seconds (default: 99999 seconds (about 27 hours))
 -v           : Prints Benchmarking output
 -D           : Next argument dictionary file
 -s           : Benchmark all files separately
 -q           : Quiet, repeat for more quiet
                  -q Prints parameters + results whenever a new best is found
                  -qq Only prints parameters whenever a new best is found, prints final parameters + results
                  -qqq Only print final parameters + results
                  -qqqq Only prints final parameter set in the form --zstd=
 -v           : Verbose, cancels quiet, repeat for more volume
                  -v Prints all candidate parameters and results

```
Any inputs afterwards are treated as files to benchmark.
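
Since `-qqqq` prints only the final parameter set in the `--zstd=` form, a wrapper script may want to read that line back into a dictionary. A minimal sketch, assuming the string is a comma-separated list of `name=value` pairs with integer values, as in the zstd command line syntax; `parse_zstd_params` is a hypothetical helper, and the sample string in the usage note is illustrative rather than real paramgrill output.

```python
def parse_zstd_params(argument):
    """Parse a '--zstd=name=value,...' string into a dict of integers.

    Assumption: every value is an integer and pairs are comma-separated.
    """
    prefix = "--zstd="
    if not argument.startswith(prefix):
        raise ValueError("expected an argument starting with '--zstd='")
    params = {}
    for item in argument[len(prefix):].split(","):
        if not item:
            continue  # tolerate a trailing comma
        name, _, value = item.partition("=")
        params[name.strip()] = int(value)
    return params
```

For instance, `parse_zstd_params("--zstd=windowLog=23,hashLog=22,strategy=6")` yields a dictionary mapping `windowLog`, `hashLog` and `strategy` to their integer values.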