# Contributing to Zstandard
We want to make contributing to this project as easy and transparent as
possible.

## Our Development Process
New versions are being developed in the "dev" branch,
or in their own feature branch.
When they are deemed ready for a release, they are merged into "release".

As a consequence, all contributions must stage first through "dev"
or their own feature branch.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `dev`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Workflow
Zstd uses a branch-based workflow for making changes to the codebase. Typically, zstd
will use a new branch per sizable topic. For smaller changes, it is okay to lump multiple
related changes into a branch.

Our contribution process works in three main stages:
1. Local development
    * Update:
        * Clone your fork of zstd if you have not already
        ```
        git clone https://github.com/<username>/zstd
        cd zstd
        ```
        * Update your local dev branch
        ```
        git pull https://github.com/facebook/zstd dev
        git push origin dev
        ```
    * Topic and development:
        * Make a new branch on your fork about the topic you're developing for
        ```
        # branch names should be concise but sufficiently informative
        git checkout -b <branch-name>
        git push origin <branch-name>
        ```
        * Make commits and push
        ```
        # make some changes
        git add -u && git commit -m <message>
        git push origin <branch-name>
        ```
        * Note: run local tests to ensure that your changes didn't break existing functionality
            * Quick check
            ```
            make shortest
            ```
            * Longer check
            ```
            make test
            ```
2. Code Review and CI tests
    * Ensure CI tests pass:
        * Before sharing anything with the community, create a pull request in your own fork against the dev branch
        and make sure that all GitHub Actions CI tests pass. See the Continuous Integration section below for more information.
        * Ensure that static analysis passes on your development machine. See the Static Analysis section
        below to see how to do this.
    * Create a pull request:
        * When you are ready to share your changes with the community, create a pull request from your branch
        to facebook:dev. You can do this very easily by clicking 'Create Pull Request' on your fork's home
        page.
        * From there, select the branch where you made changes as your source branch and facebook:dev
        as the destination.
        * Examine the diff presented between the two branches to make sure there is nothing unexpected.
    * Write a good pull request description:
        * While there is no strict template that our contributors follow, we would like them to
        sufficiently summarize and motivate the changes they are proposing. We recommend that all pull requests,
        at least indirectly, address the following points.
            * Is this pull request important and why?
            * Is it addressing an issue? If so, what issue? (provide links for convenience please)
            * Is this a new feature? If so, why is it useful and/or necessary?
            * Are there background references and documents that reviewers should be aware of to properly assess this change?
        * Note: make sure to point out any design and architectural decisions that you made and the rationale behind them.
        * Note: if you have been working with a specific user and would like them to review your work, make sure you mention them using (@<username>)
    * Submit the pull request and iterate with feedback.
3. Merge and Release
    * Getting approval:
        * You will have to iterate on your changes with feedback from other collaborators to reach a point
        where your pull request can be safely merged.
        * To avoid too many comments on style and convention, make sure that you have a
        look at our style section below before creating a pull request.
        * Eventually, someone from the zstd team will approve your pull request and, not long after, merge it into
        the dev branch.
    * Housekeeping:
        * Most PRs are linked with one or more GitHub issues. If this is the case for your PR, make sure
        the corresponding issue is mentioned. If your change 'fixes' or completely addresses the
        issue at hand, then please indicate this in a comment by requesting that the issue be closed.
        * Just because your changes have been merged does not mean the topic or larger issue is complete. Remember
        that the change must make it to an official zstd release for it to be meaningful. We recommend
        that contributors track the activity on their pull request and corresponding issue(s) page(s) until
        their change makes it to the next release of zstd. Users will often discover bugs in your code or
        suggest ways to refine and improve your initial changes even after the pull request is merged.

## Static Analysis
Static analysis is a process for examining the correctness or validity of a program without actually
executing it. It usually helps us find many simple bugs. Zstd uses clang's `scan-build` tool for
static analysis. You can install it by following the instructions for your OS on https://clang-analyzer.llvm.org/scan-build.

Once installed, you can ensure that our static analysis tests pass on your local development machine
by running:
```
make staticAnalyze
```

In general, you can use `scan-build` to statically analyze any build script. For example, to statically analyze
just `contrib/largeNbDicts` and nothing else, you can run:

```
scan-build make -C contrib/largeNbDicts largeNbDicts
```

### Pitfalls of static analysis
`scan-build` is part of our regular CI suite. Other static analyzers are not.

It can be useful to look at additional static analyzers once in a while (and we do), but it's not a good idea to multiply the number of analyzers run continuously at each commit and PR. The reasons are:

- Static analyzers are full of false positives. The signal-to-noise ratio is actually pretty low.
- A good CI policy is "zero-warning tolerance". That means that all issues must be solved, including false positives. This quickly becomes a tedious workload.
- Multiple static analyzers will feature multiple kinds of false positives, sometimes applying to the same code but in different ways, leading to:
   + tortuous code, trying to please multiple constraints, hurting readability and therefore maintenance. Sometimes, such complexity introduces other, more subtle bugs that are simply out of scope of the analyzers.
   + sometimes, these constraints are mutually exclusive: if one tries to satisfy one analyzer, the other will complain, and they can't both be happy at the same time.
- As if that was not enough, the list of false positives changes with each version. It's hard enough to follow one static analyzer, but with multiple ones, each on its own update agenda, this quickly becomes a massive velocity reducer.

This is different from running a static analyzer once in a while, looking at the output, and __cherry picking__ a few warnings that seem helpful, either because they detected a genuine risk of bug, or because they help express the code in a way which is more readable or more difficult to misuse. These kinds of reports can be useful, and are accepted.

## Continuous Integration
CI tests run every time a pull request (PR) is created or updated. The exact tests
that get run will depend on the destination branch you specify. Some tests take
longer to run than others. Currently, our CI is set up to run a short
series of tests when creating a PR to the dev branch and a longer series of tests
when creating a PR to the release branch. You can look in the configuration files
of the respective CI platform for more information on what gets run when.

Most people will just want to create a PR with the destination set to the dev
branch of their own fork of zstd. You can then find the status of the tests on the PR's page. You can also
re-run tests and cancel running tests from the PR page or from the respective CI's dashboard.

Almost all of zstd's CI runs on GitHub Actions (configured at `.github/workflows`), which will automatically run on PRs to your
own fork. A small number of tests run on other services (e.g. Travis CI, Circle CI, AppVeyor).
These require work to set up on your fork, and (at least for Travis CI) cost money.
Therefore, if the PR on your fork passes GitHub Actions, feel free to submit a PR
against the main repo.

### Third-party CI
A small number of tests cannot run on GitHub Actions, or have yet to be migrated.
For these, we use a variety of third-party services (listed below). It is not necessary to set
these up on your fork in order to contribute to zstd; however, we do link to instructions for those
who want earlier signal.

| Service   | Purpose                                                                                                    | Setup Links                                                                                                                                            | Config Path            |
|-----------|------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------|
| Travis CI | Used for testing on non-x86 architectures such as PowerPC                                                  | https://docs.travis-ci.com/user/tutorial/#to-get-started-with-travis-ci-using-github <br> https://github.com/marketplace/travis-ci                     | `.travis.yml`          |
| AppVeyor  | Used for some Windows testing (e.g. cygwin, mingw)                                                         | https://www.appveyor.com/blog/2018/10/02/github-apps-integration/ <br> https://github.com/marketplace/appveyor                                         | `appveyor.yml`         |
| Cirrus CI | Used for testing on FreeBSD                                                                                | https://github.com/marketplace/cirrus-ci/                                                                                                              | `.cirrus.yml`          |
| Circle CI | Historically was used to provide faster signal,<br/> but we may be able to migrate these to GitHub Actions | https://circleci.com/docs/2.0/getting-started/#setting-up-circleci <br> https://youtu.be/Js3hMUsSZ2c <br> https://circleci.com/docs/2.0/enable-checks/ | `.circleci/config.yml` |

Note: the instructions linked above mostly cover how to set up a repository with CI from scratch.
The general idea should be the same for setting up CI on your fork of zstd, but you may have to
follow slightly different steps. In particular, please ignore any instructions related to setting up
config files (since zstd already has configs for each of these services).

## Performance
Performance is extremely important for zstd and we only merge pull requests whose performance
landscape and corresponding trade-offs have been adequately analyzed, reproduced, and presented.
This high bar for performance means that every PR which has the potential to
impact performance takes a very long time for us to properly review. That being said, we
always welcome contributions to improve performance (or worsen performance for the trade-off of
something else). Please keep the following in mind before submitting a performance-related PR:

1. Zstd isn't as old as gzip but it has been around for some time now and its evolution is
very well documented via past GitHub issues and pull requests. It may be the case that your
particular performance optimization has already been considered in the past. Please take some
time to search through old issues and pull requests using keywords specific to your
would-be PR. Of course, just because a topic has already been discussed (and perhaps rejected
on some grounds) in the past doesn't mean it isn't worth bringing up again. But even in that case,
it will be helpful for you to have context from that topic's history before contributing.
2. The distinction between noise and actual performance gains can unfortunately be very subtle,
especially when microbenchmarking extremely small wins or losses. The only remedy to getting
something subtle merged is extensive benchmarking. You will be doing us a great favor if you
take the time to run extensive, long-duration, and potentially cross-(os, platform, process, etc)
benchmarks on your end before submitting a PR. Of course, you will not be able to benchmark
your changes on every single processor and OS out there (and neither will we) but do the best
you can. We've added some things to think about when benchmarking below in the Benchmarking
Performance section which might be helpful for you.
3. Optimizing performance for a certain OS, processor vendor, compiler, or network system is a perfectly
legitimate thing to do as long as it does not harm the overall performance health of Zstd.
This is a hard balance to strike but please keep in mind other aspects of Zstd when
submitting changes that are clang-specific, windows-specific, etc.

## Benchmarking Performance
Performance microbenchmarking is a tricky subject but also essential for Zstd. We value empirical
testing over theoretical speculation. This guide is not perfect but for most scenarios, it
is a good place to start.

### Stability
Unfortunately, the most important aspect in being able to benchmark reliably is to have a stable
benchmarking machine. A virtual machine, a machine with shared resources, or your laptop
will typically not be stable enough to obtain reliable benchmark results. If you can get your
hands on a desktop, this is usually a better scenario.

Of course, benchmarking can be done on non-hyper-stable machines as well. You will just have to
do a little more work to ensure that you are in fact measuring the changes you've made and not
noise. Here are some things you can do to make your benchmarks more stable:

1. The simplest thing you can do to drastically improve the stability of your benchmark is
to run it multiple times and then aggregate the results of those runs. As a general rule of
thumb, the smaller the change you are trying to measure, the more samples of benchmark runs
you will have to aggregate over to get reliable results. Here are some additional things to keep in
mind when running multiple trials:
    * How you aggregate your samples is important. You might be tempted to use the mean of your
    results. While this is certainly going to be a more stable number than a raw single sample
    benchmark number, you might have more luck by taking the median. The mean is not robust to
    outliers whereas the median is. Better still, you could simply take the fastest speed your
    benchmark achieved on each run since that is likely the fastest your process will be
    capable of running your code. In our experience, this (aggregating by just taking the sample
    with the fastest running time) has been the most stable approach.
    * The more samples you have, the more stable your benchmarks should be. You can verify
    your improved stability by looking at the size of your confidence intervals as you
    increase your sample count. These should get smaller and smaller, ideally ending up
    smaller than the performance win you are expecting.
    * Most processors will take some time to get `hot` when running anything. The observations
    you collect during that time period will be very different from the true performance number. Having
    a very large number of samples will help alleviate this problem slightly but you can also
    address it directly by simply not including the first `n` iterations of your benchmark in
    your aggregations. You can determine `n` by simply looking at the results from each iteration
    and then hand picking a good threshold after which the variance in results seems to stabilize.
2. You cannot really get reliable benchmarks if your host machine is simultaneously running
another cpu/memory-intensive application in the background. If you are running benchmarks on your
personal laptop for instance, you should close all applications (including your code editor and
browser) before running your benchmarks. You might also have invisible background applications
running. You can see what these are by looking at either Activity Monitor on Mac or Task Manager
on Windows. You will get more stable benchmark results if you end those processes as well.
    * If you have multiple cores, you can even run your benchmark on a reserved core to prevent
    pollution from other OS and user processes. There are a number of ways to do this depending
    on your OS:
        * On Linux boxes, you can use https://github.com/lpechacek/cpuset.
        * On Windows, you can "Set Processor Affinity" using https://www.thewindowsclub.com/processor-affinity-windows
        * On Mac, you can try to use their dedicated affinity API https://developer.apple.com/library/archive/releasenotes/Performance/RN-AffinityAPI/#//apple_ref/doc/uid/TP40006635-CH1-DontLinkElementID_2
3. To benchmark, you will likely end up writing a separate C/C++ program that will link libzstd.
Dynamically linking your library will introduce some added variation (not a large amount but
definitely some). Statically linking libzstd will be more stable. Static libraries should
be enabled by default when building zstd.
4. Use a profiler with a good high resolution timer. See the section below on profiling for
details on this.
5. Disable frequency scaling, turbo boost and address space randomization (this will vary by OS).
6. Try to avoid storage. On some systems you can use tmpfs. Putting the program, inputs and outputs on
tmpfs avoids touching a real storage system, which can have a pretty big variability.

Also check out LLVM's guide on benchmarking here: https://llvm.org/docs/Benchmarking.html

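The aggregation advice in the Stability list (drop the first `n` warm-up iterations, then prefer the fastest run or the median over the mean) can be sketched in C as follows. This is only an illustration: `bench_summary` and `summarize_runs` are hypothetical names, not part of zstd, and the sketch assumes you have already collected per-run times in seconds into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result type: three ways of aggregating the same samples. */
typedef struct {
    double fastest;   /* smallest running time: least disturbed by noise */
    double median;    /* robust to outliers */
    double mean;      /* sensitive to outliers, shown for comparison */
} bench_summary;

static int cmp_double(const void* a, const void* b)
{
    double const x = *(const double*)a;
    double const y = *(const double*)b;
    return (x > y) - (x < y);
}

/* Aggregates `count` running times (in seconds, in collection order),
 * ignoring the first `warmup` samples.
 * Note: sorts the retained part of `samples` in place. */
bench_summary summarize_runs(double* samples, size_t count, size_t warmup)
{
    bench_summary s;
    double* kept;
    size_t n;
    size_t i;
    double sum = 0.0;

    assert(samples != NULL);
    assert(warmup < count);   /* at least one sample must remain */

    kept = samples + warmup;
    n = count - warmup;
    qsort(kept, n, sizeof(double), cmp_double);

    for (i = 0; i < n; i++) sum += kept[i];

    s.fastest = kept[0];
    s.median = (n % 2) ? kept[n / 2]
                       : (kept[n / 2 - 1] + kept[n / 2]) / 2.0;
    s.mean = sum / (double)n;
    return s;
}
```

Calling `summarize_runs(samples, 7, 2)` on seven measurements skips the first two and summarizes the remaining five. If one of the retained runs is a slow outlier, it pulls the mean up while leaving the median and the fastest time unchanged, which is the reason the list above favors those two.
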
### Zstd benchmark
The fastest signal you can get regarding your performance changes is via the built-in zstd cli
bench option. You can run Zstd as you typically would for your scenario using some set of options
and then additionally also specify the `-b#` option. Doing this will run our benchmarking pipeline
for the options you have just provided. If you want to look at the internals of how this
benchmarking script works, you can check out `programs/benchzstd.c`.

For example: say you have made a change that you believe improves the speed of zstd level 1. The
very first thing you should use to assess whether you actually achieved any sort of improvement
is `zstd -b`. You might try to do something like this. Note: you can use the `-i` option to
specify a running time for your benchmark in seconds (default is 3 seconds).
Usually, the longer the running time, the more stable your results will be.

```
$ git checkout <commit-before-your-change>
$ make && cp zstd zstd-old
$ git checkout <commit-after-your-change>
$ make && cp zstd zstd-new
$ ./zstd-old -i5 -b1 <your-test-data>
 1<your-test-data>         :      8990 ->      3992 (2.252), 302.6 MB/s , 626.4 MB/s
$ ./zstd-new -i5 -b1 <your-test-data>
 1<your-test-data>         :      8990 ->      3992 (2.252), 302.8 MB/s , 628.4 MB/s
```

Unless your performance win is large enough to be visible despite the intrinsic noise
on your computer, benchzstd alone will likely not be enough to validate the impact of your
changes. For example, the results of the example above indicate that effectively nothing
changed but there could be a small <3% improvement that the noise on the host machine
obscured. So unless you see a large performance win (10-15% consistently), using just
this method of evaluation will not be sufficient.

### Profiling
There are a number of great profilers out there. We're going to briefly mention how you can
profile your code using `instruments` on mac, `perf` on linux and `visual studio profiler`
on Windows.

Say you have an idea for a change that you think will provide some good performance gains
for level 1 compression on Zstd. Typically this means, you have identified a section of
code that you think can be made to run faster.

The first thing you will want to do is make sure that the piece of code is actually taking up
a notable amount of time to run. It is usually not worth optimizing something which accounts for less than
0.0001% of the total running time. Luckily, there are tools to help with this.
Profilers will let you see how much time your code spends inside a particular function.
If your target code snippet is only part of a function, it might be worth trying to
isolate that snippet by moving it to its own function (this is usually not necessary but
might be).

Most profilers (including the profilers discussed below) will generate a call graph of
functions for you. Your goal will be to find your function of interest in this call graph
and then inspect the time spent inside of it. You might also want to look at the annotated
assembly which most profilers will provide you with.

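Moving a snippet into its own function, as suggested above, only helps if the compiler does not inline it straight back into the caller. The sketch below shows one way to prevent that with GCC or Clang. `BENCH_NOINLINE` and `count_matching_bytes` are hypothetical names used for illustration, not part of zstd, and the `noinline` attribute is compiler-specific, so treat this as a temporary profiling aid rather than code to submit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: asks GCC/Clang not to inline the function,
 * so that it keeps its own entry in the profiler's call graph.
 * On other compilers it expands to nothing. */
#if defined(__GNUC__)
#  define BENCH_NOINLINE __attribute__((noinline))
#else
#  define BENCH_NOINLINE
#endif

/* The snippet under investigation, hoisted out of its original caller.
 * Returns the number of leading bytes that are equal in a and b. */
BENCH_NOINLINE size_t
count_matching_bytes(const unsigned char* a, const unsigned char* b, size_t len)
{
    size_t n = 0;
    assert(a != NULL && b != NULL);
    while (n < len && a[n] == b[n]) n++;
    return n;
}
```

Keep in mind that forcing a call boundary can itself change performance, so measure the final change with the snippet back in its normal place.
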
#### Instruments
We will once again consider the scenario where you think you've identified a piece of code
whose performance can be improved upon. Follow these steps to profile your code using
Instruments.

1. Open Instruments
2. Select `Time Profiler` from the list of standard templates
3. Close all other applications except for your instruments window and your terminal
4. Run your benchmarking script from your terminal window
    * You will want a benchmark that runs for at least a few seconds (5 seconds will
    usually be long enough). This way the profiler will have something to work with
    and you will have ample time to attach your profiler to this process.
    * I will just use benchzstd as my benchmarking script for this example:
```
$ zstd -b1 -i5 <my-data> # this will run for 5 seconds
```
5. Once you run your benchmarking script, switch back over to instruments and attach your
process to the time profiler. You can do this by:
    * Clicking on the `All Processes` drop down in the top left of the toolbar.
    * Selecting your process from the dropdown. In my case, it is just going to be labeled
    `zstd`
    * Hitting the bright red record circle button on the top left of the toolbar
6. Your profiler will now start collecting metrics from your benchmarking script. Once
you think you have collected enough samples (usually this is the case after 3 seconds of
recording), stop your profiler.
7. Make sure that in the toolbar of the bottom window, `profile` is selected.
8. You should be able to see your call graph.
    * If you don't see the call graph, or see an incomplete one, make sure you have compiled
    zstd and your benchmarking script using debug flags. On mac and linux, this just means
    you will have to supply the `-g` flag along with your build script. You might also
    have to provide the `-fno-omit-frame-pointer` flag
9. Dig down the graph to find your function call and then inspect it by double clicking
the list item. You will be able to see the annotated source code and the assembly side by
side.

#### Perf

This wiki has a pretty detailed tutorial on getting started working with perf so we'll
leave you to check that out if you're getting started:

https://perf.wiki.kernel.org/index.php/Tutorial

Some general notes on perf:
* Use `perf stat -r # <bench-program>` to quickly get some relevant timing and
counter statistics. Perf uses a high resolution timer and this is likely one
of the first things our team will run when assessing your PR.
* Perf has a long list of hardware counters that can be viewed with `perf list`.
When measuring optimizations, something worth trying is to make sure the hardware
counters you expect to be impacted by your change are in fact being so. For example,
if you expect the L1 cache misses to decrease with your change, you can look at the
counter `L1-dcache-load-misses`
* Perf hardware counters will not work on a virtual machine.

#### Visual Studio

TODO

*01826a49SYabin Cui## Issues
*01826a49SYabin CuiWe use GitHub issues to track public bugs. Please ensure your description is
*01826a49SYabin Cuiclear and has sufficient instructions to be able to reproduce the issue.
*01826a49SYabin Cui
*01826a49SYabin CuiFacebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
*01826a49SYabin Cuidisclosure of security bugs. In those cases, please go through the process
*01826a49SYabin Cuioutlined on that page and do not file a public issue.
*01826a49SYabin Cui
*01826a49SYabin Cui## Coding Style
*01826a49SYabin CuiIt's a pretty long topic, which is difficult to summarize in a single paragraph.
*01826a49SYabin CuiAs a rule of thumb, try to imitate the coding style of
*01826a49SYabin Cuisimilar lines of code around your contribution.
*01826a49SYabin CuiThe following is a non-exhaustive list of rules employed in the zstd code base:
*01826a49SYabin Cui
*01826a49SYabin Cui### C90
*01826a49SYabin CuiThis code base follows the strict C90 standard,
*01826a49SYabin Cuiwith 2 extensions : 64-bit `long long` types, and variadic macros.
*01826a49SYabin CuiThis rule is applied strictly to code within `lib/` and `programs/`.
*01826a49SYabin CuiSub-projects in `contrib/` are allowed to use other conventions.
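
As an illustration of what strict C90 tends to imply in practice, the sketch below declares all variables at the top of their block, uses only `/* */` comments, and relies on `unsigned long long`, one of the two tolerated extensions. The `EX_` prefix and the function name are hypothetical and not part of zstd.

```c
#include <assert.h>
#include <stddef.h>

/* EX_sumBytes() : hypothetical helper, not part of zstd.
 * @return : the sum of the `srcSize` bytes stored at `src`. */
static unsigned long long EX_sumBytes(const void* src, size_t srcSize)
{
    /* C90 : declarations come first, before any statement of the block */
    const unsigned char* const ip = (const unsigned char*)src;
    unsigned long long total = 0;   /* 64-bit `long long` : tolerated extension */
    size_t n;                       /* declared here, not inside the `for` statement */

    assert(src != NULL || srcSize == 0);
    for (n = 0; n < srcSize; n++) {
        total += ip[n];
    }
    return total;
}
```

Compiling such a file with a flag like `-std=c90` is one way to catch accidental use of later-standard features such as `//` comments or mixed declarations and statements.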
*01826a49SYabin Cui
*01826a49SYabin Cui### C++ direct compatibility : symbol mangling
*01826a49SYabin CuiAll public symbol declarations must be wrapped in `extern "C" { … }`,
*01826a49SYabin Cuiso that this project can be compiled as C++98 code,
*01826a49SYabin Cuiand linked into C++ applications.
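
A common way to write this wrapper is to guard it with `__cplusplus`, so that the same header is valid for both C and C++ compilers. The sketch below shows the pattern on a hypothetical declaration (`EX_isEmpty()` is not part of the zstd API).

```c
#include <stddef.h>

#if defined (__cplusplus)
extern "C" {
#endif

/* EX_isEmpty() : hypothetical public declaration, for illustration only.
 * @return : 1 if `size` is zero, 0 otherwise. */
int EX_isEmpty(size_t size);

#if defined (__cplusplus)
}
#endif

/* The definition would normally live in a separate .c file. */
int EX_isEmpty(size_t size)
{
    return size == 0;
}
```

When the header is included from C++ code, the guard prevents name mangling of `EX_isEmpty`, so the linker finds the same symbol whether the library was built as C or as C++.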
*01826a49SYabin Cui
*01826a49SYabin Cui### Minimal Frugal
*01826a49SYabin CuiThis design requirement is fundamental to preserve the portability of the code base.
*01826a49SYabin Cui#### Dependencies
*01826a49SYabin Cui- Reduce dependencies to the minimum possible level.
*01826a49SYabin Cui  Any dependency should be considered “bad” by default,
*01826a49SYabin Cui  and only tolerated because it provides a service in a better way than can be achieved locally.
*01826a49SYabin Cui  The only external dependencies this repository tolerates are
*01826a49SYabin Cui  standard C libraries, and in rare cases, system level headers.
*01826a49SYabin Cui- Within `lib/`, this policy is even more drastic.
*01826a49SYabin Cui  The only external dependencies allowed are `<assert.h>`, `<stdlib.h>`, `<string.h>`,
*01826a49SYabin Cui  and even then, not directly.
*01826a49SYabin Cui  In particular, no function shall ever allocate on the heap directly,
*01826a49SYabin Cui  and must instead use `ZSTD_malloc()` and equivalent.
*01826a49SYabin Cui  Other accepted non-symbol headers are `<stddef.h>` and `<limits.h>`.
*01826a49SYabin Cui- Within the project, there is a strict hierarchy of dependencies that must be respected.
*01826a49SYabin Cui  `programs/` is allowed to depend on `lib/`, but only its public API.
*01826a49SYabin Cui  Within `lib/`, `lib/common` doesn't depend on any other directory.
*01826a49SYabin Cui  `lib/compress` and `lib/decompress` shall not depend on each other.
*01826a49SYabin Cui  `lib/dictBuilder` can depend on `lib/common` and `lib/compress`, but not `lib/decompress`.
*01826a49SYabin Cui#### Resources
*01826a49SYabin Cui- Functions in `lib/` must use very little stack space,
*01826a49SYabin Cui  several dozen bytes at most.
*01826a49SYabin Cui  Everything larger must use the heap allocator,
*01826a49SYabin Cui  or require a scratch buffer to be emplaced manually.
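
To illustrate the scratch-buffer option, here is a minimal sketch in which a table of 256 counters is placed in caller-provided memory instead of a local array. All names (`EX_countDistinct()`, `EX_TABLE_SIZE`, the `workspace` parameter) are hypothetical and only meant to show the shape of such an interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_TABLE_SIZE 256   /* hypothetical : one counter per byte value */

/* EX_countDistinct() : hypothetical example, not part of zstd.
 * Counts how many distinct byte values appear in `src`.
 * `workspace` is a caller-provided scratch buffer of at least
 * EX_TABLE_SIZE * sizeof(unsigned) bytes, aligned for `unsigned`.
 * @return : nb of distinct byte values, or 0 if `workspace` is too small. */
static unsigned EX_countDistinct(const void* src, size_t srcSize,
                                 void* workspace, size_t wkspSize)
{
    const unsigned char* const ip = (const unsigned char*)src;
    unsigned* const count = (unsigned*)workspace;  /* table lives outside the stack */
    unsigned nbDistinct = 0;
    size_t n;

    if (wkspSize < EX_TABLE_SIZE * sizeof(unsigned)) return 0;
    assert(workspace != NULL);
    memset(count, 0, EX_TABLE_SIZE * sizeof(unsigned));
    for (n = 0; n < srcSize; n++) count[ip[n]]++;
    for (n = 0; n < EX_TABLE_SIZE; n++) nbDistinct += (count[n] != 0);
    return nbDistinct;
}
```

The function itself only keeps a few pointers and counters on the stack; the caller decides where the larger table is stored.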
*01826a49SYabin Cui
*01826a49SYabin Cui### Naming
*01826a49SYabin Cui* All public symbols are prefixed with `ZSTD_`
*01826a49SYabin Cui  + private symbols, with a scope limited to their own unit, are free of this restriction.
*01826a49SYabin Cui    However, since `libzstd` source code can be amalgamated,
*01826a49SYabin Cui    each symbol name must attempt to be (and remain) unique.
*01826a49SYabin Cui    Avoid overly generic names that could become grounds for future collisions.
*01826a49SYabin Cui    This generally implies usage of some form of prefix.
*01826a49SYabin Cui* For symbols (functions and variables), naming convention is `PREFIX_camelCase`.
*01826a49SYabin Cui  + In some advanced cases, one can also find :
*01826a49SYabin Cui    - `PREFIX_prefix2_camelCase`
*01826a49SYabin Cui    - `PREFIX_camelCase_extendedQualifier`
*01826a49SYabin Cui* Multi-word names generally consist of an action followed by an object:
*01826a49SYabin Cui  - for example : `ZSTD_createCCtx()`
*01826a49SYabin Cui* Prefer positive actions
*01826a49SYabin Cui  - `goBackward` rather than `notGoForward`
*01826a49SYabin Cui* Type names (`struct`, etc.) follow a similar convention,
*01826a49SYabin Cui  except that they are allowed and even invited to start with an uppercase letter.
*01826a49SYabin Cui  Example : `ZSTD_CCtx`, `ZSTD_CDict`
*01826a49SYabin Cui* Macro names are all capital letters.
*01826a49SYabin Cui  The same composition rules (`PREFIX_NAME_QUALIFIER`) apply.
*01826a49SYabin Cui* File names are all lowercase letters.
*01826a49SYabin Cui  The convention is `snake_case`.
*01826a49SYabin Cui  File names **must** be unique across the entire code base,
*01826a49SYabin Cui  even when they stand in clearly separated directories.
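
The naming rules above can be sketched on a small, made-up unit. The `EX_` prefix and every identifier below are hypothetical; they only illustrate the shapes `PREFIX_NAME_QUALIFIER` for macros, a leading uppercase letter for types, and `PREFIX_camelCase` for functions.

```c
#include <assert.h>
#include <stddef.h>

#define EX_BUFFER_SIZE_MAX 1024   /* macro : PREFIX_NAME_QUALIFIER, all capitals */

typedef struct {                  /* type : prefix, then a leading uppercase letter */
    size_t capacity;
    size_t pos;
} EX_Buffer;

/* function : PREFIX_camelCase, action (`reset`) followed by object (`Buffer`) */
static void EX_resetBuffer(EX_Buffer* buffer, size_t capacity)
{
    assert(buffer != NULL);
    assert(capacity <= EX_BUFFER_SIZE_MAX);
    buffer->capacity = capacity;
    buffer->pos = 0;
}
```

Following the file-name rule, such a unit would sit in a lowercase `snake_case` file whose name is not reused anywhere else in the tree.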
*01826a49SYabin Cui
*01826a49SYabin Cui### Qualifiers
*01826a49SYabin Cui* This code base is `const` friendly, if not `const` fanatical.
*01826a49SYabin Cui  Any variable that can be `const` (aka. read-only) **must** be `const`.
*01826a49SYabin Cui  Any pointer whose content will not be modified must be `const`.
*01826a49SYabin Cui  This property is then controlled at compiler level.
*01826a49SYabin Cui  `const` variables are an important signal to readers that this variable isn't modified.
*01826a49SYabin Cui  Conversely, non-const variables are a signal to readers to watch out for modifications later on in the function.
*01826a49SYabin Cui* If a function must be inlined, mention it explicitly,
*01826a49SYabin Cui  using the project's own portable macros, such as `FORCE_INLINE_ATTR`,
*01826a49SYabin Cui  defined in `lib/common/compiler.h`.
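
As a minimal sketch of the `const` rule, the hypothetical function below (not part of zstd) marks both the pointed-to data and the pointers themselves as `const` wherever they are never modified, leaving a single non-const variable that visibly changes.

```c
#include <assert.h>
#include <stddef.h>

/* EX_findByte() : hypothetical example, not part of zstd.
 * @return : position of the first occurrence of `target` in `src`,
 *           or `srcSize` if `target` is not found. */
static size_t EX_findByte(const void* src, size_t srcSize, unsigned char target)
{
    const unsigned char* const istart = (const unsigned char*)src;  /* never reassigned */
    const unsigned char* const iend = istart + srcSize;             /* never reassigned */
    const unsigned char* ip = istart;   /* non-const pointer : advances in the loop */

    assert(src != NULL);
    while (ip < iend && *ip != target) ip++;
    return (size_t)(ip - istart);
}
```

Here `ip` is the only variable a reader needs to track; the compiler rejects any accidental write through `src`, `istart`, or `iend`.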
*01826a49SYabin Cui
*01826a49SYabin Cui### Debugging
*01826a49SYabin Cui* **Assertions** are welcome, and should be used very liberally,
*01826a49SYabin Cui  to control any condition the code expects for its correct execution.
*01826a49SYabin Cui  These assertion checks will be run in debug builds, and disabled in production.
*01826a49SYabin Cui* For traces, this project provides its own debug macros,
*01826a49SYabin Cui  in particular `DEBUGLOG(level, ...)`, defined in `lib/common/debug.h`.
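
The sketch below shows how an assertion and a leveled trace might sit together in a function. `EX_DEBUGLOG()` and `EX_DEBUGLEVEL` are hypothetical stand-ins written for this example; they are not the project's `DEBUGLOG()` macro, whose actual definition should be read in `lib/common/debug.h`. The example uses `<stdio.h>`, so it is a standalone illustration rather than code suitable for `lib/`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* EX_DEBUGLOG() : hypothetical stand-in, for illustration only.
 * Prints the message when `level` <= EX_DEBUGLEVEL.
 * Relies on variadic macros, one of the tolerated C90 extensions. */
#ifndef EX_DEBUGLEVEL
#  define EX_DEBUGLEVEL 0
#endif
#define EX_DEBUGLOG(level, ...)                 \
    do {                                        \
        if ((level) <= EX_DEBUGLEVEL) {         \
            fprintf(stderr, __VA_ARGS__);       \
            fprintf(stderr, "\n");              \
        }                                       \
    } while (0)

/* EX_halve() : hypothetical example, expects an even `size`. */
static size_t EX_halve(size_t size)
{
    assert(size % 2 == 0);   /* contract check : compiled out when NDEBUG is defined */
    EX_DEBUGLOG(2, "EX_halve: size=%u", (unsigned)size);
    return size / 2;
}
```

With the default `EX_DEBUGLEVEL` of 0, the trace is silent; building with a higher level (for example `-DEX_DEBUGLEVEL=2`) would print it to `stderr`.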
*01826a49SYabin Cui
*01826a49SYabin Cui### Code documentation
*01826a49SYabin Cui* Avoid code documentation that merely repeats what the code is already stating.
*01826a49SYabin Cui  Whenever applicable, prefer employing the code as the primary way to convey explanations.
*01826a49SYabin Cui  Example 1 : `int nbTokens = n;` instead of `int i = n; /* i is a nb of tokens */`.
*01826a49SYabin Cui  Example 2 : `assert(size > 0);` instead of `/* here, size should be positive */`.
*01826a49SYabin Cui* At declaration level, the documentation explains how to use the function or variable
*01826a49SYabin Cui  and, when applicable, why it's needed or the scenarios where it can be useful.
*01826a49SYabin Cui* At implementation level, the documentation explains the general outline of the algorithm employed,
*01826a49SYabin Cui  and when applicable why this specific choice was preferred.
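
The split between declaration-level and implementation-level documentation could look like the following sketch. `EX_countTokens()` is a hypothetical function invented for this example, and the rationale given in its comments is illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* EX_countTokens() : hypothetical example, not part of zstd.
 * Counts the space-separated tokens in `src`.
 * Useful to size a token table before a second parsing pass.
 * @return : nb of tokens found (0 for an empty or all-space input). */
static size_t EX_countTokens(const char* src, size_t srcSize);

/* Implementation note : single pass over the input,
 * counting each transition from a space to a non-space character.
 * Preferred over splitting the input, since it needs no allocation. */
static size_t EX_countTokens(const char* src, size_t srcSize)
{
    size_t nbTokens = 0;
    int inToken = 0;
    size_t n;

    assert(src != NULL);
    for (n = 0; n < srcSize; n++) {
        const int isSpace = (src[n] == ' ');
        if (!isSpace && !inToken) nbTokens++;
        inToken = !isSpace;
    }
    return nbTokens;
}
```

The variable names (`nbTokens`, `inToken`, `isSpace`) and the assertion carry the line-level meaning, so the comments are left to describe usage and the overall approach.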
*01826a49SYabin Cui
*01826a49SYabin Cui### General layout
*01826a49SYabin Cui* 4 spaces for indentation rather than tabs
*01826a49SYabin Cui* Code documentation shall directly precede function declaration or implementation
*01826a49SYabin Cui* Each function implementation and its code documentation should be preceded and followed by an empty line
*01826a49SYabin Cui
*01826a49SYabin Cui
*01826a49SYabin Cui## License
*01826a49SYabin CuiBy contributing to Zstandard, you agree that your contributions will be licensed
*01826a49SYabin Cuiunder both the [LICENSE](LICENSE) file and the [COPYING](COPYING) file in the root directory of this source tree.