Programs and scripts for automated testing of Zstandard
=======================================================

This directory contains the following programs and scripts:
- `datagen` : Synthetic and parameterizable data generator, for tests
- `fullbench`  : Precisely measures the speed of each zstd inner function
- `fuzzer`  : Test tool, to check zstd integrity on target platform
- `paramgrill` : parameter tester for zstd
- `test-zstd-speed.py` : script for testing zstd speed difference between commits
- `test-zstd-versions.py` : compatibility test between zstd versions stored on GitHub (v0.1+)
- `zstreamtest` : Fuzzer test tool for zstd streaming API
- `legacy` : Test tool to test decoding of legacy zstd frames
- `decodecorpus` : Tool to generate valid Zstandard frames, for verifying decoder implementations


#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory into which the zstd repository is cloned.
It then compiles all tagged (released) versions of zstd.
Finally, it checks interoperability between those versions.

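The interoperability step can be pictured as a round-trip matrix over versions. The sketch below is only an illustration, not the script's actual logic: `check_interop`, `compress`, and `decompress` are hypothetical names, and checking every (compressor, decompressor) pair is an assumption about what "interoperability" covers.

```python
from itertools import product


def check_interop(versions, compress, decompress, sample):
    """Return the (compressor, decompressor) version pairs that fail to round-trip.

    `compress(version, data)` and `decompress(version, frame)` are hypothetical
    callables supplied by the caller, e.g. thin wrappers around per-version binaries.
    """
    failures = []
    for enc, dec in product(versions, repeat=2):
        try:
            ok = decompress(dec, compress(enc, sample)) == sample
        except Exception:
            # A decoder that rejects the frame counts as a failed pair.
            ok = False
        if not ok:
            failures.append((enc, dec))
    return failures
```

A caller would pass the list of tagged versions plus two small wrappers, then report any pairs returned.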
#### `automated_benchmarking.py` - script for benchmarking zstd PRs to dev

This script benchmarks facebook:dev and changes from pull requests made to zstd, and compares
them against facebook:dev to detect regressions. It currently runs on a dedicated
desktop machine for every pull request made to the zstd repo, but it can also
be run on any machine via the command line interface.

There are three modes of usage for this script:
- `fastmode` runs a minimal single build comparison (between facebook:dev and facebook:release)
- `onetime` pulls all the current pull requests from the zstd repo and compares facebook:dev to each of them once
- `continuous` continuously gets pull requests from the zstd repo and runs benchmarks against facebook:dev

Example usage:
```
python automated_benchmarking.py
```

```
usage: automated_benchmarking.py [-h] [--directory DIRECTORY]
                                 [--levels LEVELS] [--iterations ITERATIONS]
                                 [--emails EMAILS] [--frequency FREQUENCY]
                                 [--mode MODE] [--dict DICT]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with files to benchmark
  --levels LEVELS       levels to test e.g. ('1,2,3')
  --iterations ITERATIONS
                        number of benchmark iterations to run
  --emails EMAILS       email addresses of people who will be alerted upon
                        regression. Only for continuous mode
  --frequency FREQUENCY
                        specifies the number of seconds to wait before each
                        successive check for new PRs in continuous mode
  --mode MODE           'fastmode', 'onetime', 'current', or 'continuous' (see
                        README.md for details)
  --dict DICT           filename of dictionary to use (when set, this
                        dictionary will be used to compress the files provided
                        inside --directory)
```

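When driving the script from another tool, it can help to assemble the argument list programmatically. The helper below is a minimal sketch that uses only the flags shown in the usage text above; `build_benchmark_command` is a hypothetical name, and the `--emails` value is passed through verbatim because the help text does not specify its separator.

```python
VALID_MODES = ("fastmode", "onetime", "current", "continuous")


def build_benchmark_command(mode, directory=None, levels=None,
                            iterations=None, frequency=None,
                            emails=None, dictionary=None):
    """Build an argv list for automated_benchmarking.py from the documented flags."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python", "automated_benchmarking.py", "--mode", mode]
    if directory is not None:
        cmd += ["--directory", directory]
    if levels is not None:
        # The help text shows levels as a comma-separated string, e.g. '1,2,3'.
        cmd += ["--levels", ",".join(str(level) for level in levels)]
    if iterations is not None:
        cmd += ["--iterations", str(iterations)]
    if frequency is not None:
        cmd += ["--frequency", str(frequency)]
    if emails is not None:
        cmd += ["--emails", emails]
    if dictionary is not None:
        cmd += ["--dict", dictionary]
    return cmd
```

The resulting list can be handed to `subprocess.run` without shell quoting concerns.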
#### `test-zstd-speed.py` - script for testing zstd speed difference between commits

DEPRECATED

This script creates a `speedTest` directory into which the zstd repository is cloned.
It then compiles all branches of zstd and runs a speed benchmark on a given list of files (the `testFileNames` parameter).
After `sleepTime` seconds (an optional parameter, default 300), the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark is run for it.
The results of the speed benchmark are compared to the previous results.
If the compression or decompression speed for any zstd level is lower than `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also lower than `lowerLimit`, a warning e-mail is sent to the recipients on the list (the `emails` parameter).

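The two-step check described above can be sketched as follows. This is an illustration under stated assumptions, not the script's code: it assumes `lowerLimit` is a ratio of new speed to previous speed, and `should_send_warning`, `baseline`, and `run_benchmark` are hypothetical names.

```python
def should_send_warning(baseline, run_benchmark, lower_limit=0.98):
    """Return True only if two consecutive benchmark runs both look slower.

    `baseline` maps level -> (compression_speed, decompression_speed) from the
    previous commit; `run_benchmark()` returns a mapping of the same shape.
    """
    def slow_levels(result):
        # A level is "slow" if either speed drops below lower_limit * baseline.
        return [
            level
            for level, (c_speed, d_speed) in result.items()
            if c_speed < baseline[level][0] * lower_limit
            or d_speed < baseline[level][1] * lower_limit
        ]

    if not slow_levels(run_benchmark()):
        return False
    # First run looked slow: restart the benchmark before alerting anyone.
    return bool(slow_levels(run_benchmark()))
```
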
Additional remarks:
- To ensure accurate speed results, run the script on a "stable" target system with no other jobs running in parallel
- Running the script in a virtual machine can lead to large variations in speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available, it sends e-mails without attachments using `mail`; if neither is available, it only prints a warning


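The load-average gate mentioned in the remarks can be sketched like this. It is a hedged illustration only: using the 1-minute load average from `os.getloadavg()` and a fixed polling interval are assumptions, and `wait_for_quiet_system` is a hypothetical name.

```python
import os
import time


def wait_for_quiet_system(max_load_avg=0.75, poll_seconds=30,
                          get_load=None, sleep=time.sleep):
    """Block until the load average is below max_load_avg; return the number of checks."""
    if get_load is None:
        # Assumption: use the 1-minute load average (Unix only).
        get_load = lambda: os.getloadavg()[0]
    checks = 0
    while True:
        checks += 1
        if get_load() < max_load_avg:
            return checks
        sleep(poll_seconds)
```
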
Example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "user@example.com" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing
Command line tool to generate test .zst files.

This tool generates .zst files with checksums
and can optionally output the corresponding uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
generates 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
chooses a random seed and, for 1 minute,
generates random test frames and ensures that the
zstd library correctly decompresses them in both simple and streaming modes.

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters on file given constraints

Full list of arguments
```
 -T#          : set level 1 speed objective
 -B#          : cut input into blocks of size # (default : single block)
 -S           : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd=      : Single run, parameter selection syntax same as zstdcli with more parameters
                    (Added forceAttachDictionary / fadt)
                    When invoked with --optimize, this represents the sample to exceed.
 --optimize=  : find parameters to maximize compression ratio given parameters
                    Can use all --zstd= commands to constrain the type of solution found in addition to the following constraints
    cSpeed=   : Minimum compression speed
    dSpeed=   : Minimum decompression speed
    cMem=     : Maximum compression memory
    lvl=      : Searches for solutions which are strictly better than that compression lvl in ratio and cSpeed,
    stc=      : When invoked with lvl=, represents percentage slack in ratio/cSpeed allowed for a solution to be considered (Default 100%)
              : In normal operation, represents percentage slack in choosing viable starting strategy selection in choosing the default parameters
                    (Lower value will begin with stronger strategies) (Default 90%)
    speedRatio=   (accepts decimals)
              : determines value of gains in speed vs gains in ratio
                    when determining overall winner (default 5 (1% ratio = 5% speed)).
    tries=    : Maximum number of random restarts on a single strategy before switching (Default 5)
                    Higher values will make optimizer run longer, more chances to find better solution.
    memLog    : Limits the log of the size of each memotable (1 per strategy). Will use hash tables when state space is larger than max size.
                    Setting memLog = 0 turns off memoization
 --display=   : specify which parameters are included in the output
                    can use all --zstd parameter names and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
                    (Default: display all params available)
 -P#          : generated sample compressibility (when no file is provided)
 -t#          : Caps runtime of operation in seconds (default: 99999 seconds (about 27 hours))
 -v           : Prints Benchmarking output
 -D           : Next argument dictionary file
 -s           : Benchmark all files separately
 -q           : Quiet, repeat for more quiet
                  -q Prints parameters + results whenever a new best is found
                  -qq Only prints parameters whenever a new best is found, prints final parameters + results
                  -qqq Only print final parameters + results
                  -qqqq Only prints final parameter set in the form --zstd=
 -v           : Verbose, cancels quiet, repeat for more volume
                  -v Prints all candidate parameters and results

```
Any inputs afterwards are treated as files to benchmark.
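
The compact `-S` syntax packs one letter per parameter, as in the `-Sl3w10h12` example above. The helper below is a small sketch for assembling such a flag from named parameters; the letter table is copied from the argument list, while `build_single_run_flag` is a hypothetical name and not part of `paramgrill`.

```python
# Letter codes taken from the -S entry in the argument list above.
SINGLE_RUN_LETTERS = {
    "windowLog": "w",
    "hashLog": "h",
    "chainLog": "c",
    "searchLog": "s",
    "minMatch": "l",
    "targetLength": "t",
    "strategy": "S",
    "level": "L",
}


def build_single_run_flag(params):
    """Turn {"minMatch": 3, "windowLog": 10} into a flag such as "-Sl3w10"."""
    parts = []
    for name, value in params.items():
        if name not in SINGLE_RUN_LETTERS:
            raise ValueError(f"unknown paramgrill parameter: {name!r}")
        parts.append(f"{SINGLE_RUN_LETTERS[name]}{int(value)}")
    return "-S" + "".join(parts)
```

Parameters are emitted in the order given, so passing `minMatch`, `windowLog`, and `hashLog` with values 3, 10, and 12 reproduces the example flag.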