<?% config.freshness.owner = 'brandtr' %?>
<?% config.freshness.reviewed = '2021-04-15' %?>

# Video coding in WebRTC

## Introduction to layered video coding

[Video coding][video-coding-wiki] is the process of encoding a stream of
uncompressed video frames into a compressed bitstream, whose bitrate is lower
than that of the original stream.

### Block-based hybrid video coding

All video codecs in WebRTC are based on the block-based hybrid video coding
paradigm, which entails prediction of the original video frame using either
[information from previously encoded frames][motion-compensation-wiki] or
information from previously encoded portions of the current frame, subtraction
of the prediction from the original video, and
[transform][transform-coding-wiki] and [quantization][quantization-wiki] of the
resulting difference. The output of the quantization process, quantized
transform coefficients, is losslessly [entropy coded][entropy-coding-wiki] along
with other encoder parameters (e.g., those related to the prediction process).
A reconstruction is then constructed by inverse quantizing and inverse
transforming the quantized transform coefficients and adding the result to the
prediction. Finally, in-loop filtering is applied and the resulting
reconstruction is stored as a reference frame to be used to develop predictions
for future frames.

### Frame types

When an encoded frame depends on previously encoded frames (i.e., it has one or
more inter-frame dependencies), the prior frames must be available at the
receiver before the current frame can be decoded. In order for a receiver to
start decoding an encoded bitstream, a frame which has no prior dependencies is
required. Such a frame is called a "key frame". For real-time-communications
encoding, key frames typically compress less efficiently than "delta frames"
(i.e., frames whose predictions are derived from previously encoded frames).
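The dependency rule described above can be illustrated with a minimal sketch. The `Frame` type and the `decodable_frames` function are hypothetical names made up for this illustration and are not WebRTC APIs. The sketch only assumes that each encoded frame lists the ids of the frames it references.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical encoded frame: an id plus the ids of frames it references."""
    frame_id: int
    references: list = field(default_factory=list)

    @property
    def is_key_frame(self) -> bool:
        # A key frame has no inter-frame dependencies.
        return not self.references


def decodable_frames(received):
    """Return the ids of frames that can be decoded, in arrival order.

    A frame is decodable only if every frame it references was decoded before.
    """
    decoded = set()
    result = []
    for frame in received:
        if all(ref in decoded for ref in frame.references):
            decoded.add(frame.frame_id)
            result.append(frame.frame_id)
    return result


# A receiver that joins after key frame 0 was sent only sees delta frames
# until the next key frame (frame 3) arrives.
late_joiner = [Frame(1, [0]), Frame(2, [1]), Frame(3), Frame(4, [3])]
print(decodable_frames(late_joiner))  # prints [3, 4]
```

Frames 1 and 2 are undecodable here because their references never arrived, which is why a receiver needs a key frame before it can start decoding.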

### Single-layer coding

In 1:1 calls, the encoded bitstream has a single recipient. Using end-to-end
bandwidth estimation, the target bitrate can thus be well tailored for the
intended recipient. The number of key frames can be kept to a minimum and the
compressibility of the stream can be maximized. One way of achieving this is by
using "single-layer coding", where each delta frame only depends on the frame
that was most recently encoded.

### Scalable video coding

In multiway conferences, on the other hand, the encoded bitstream has multiple
recipients, each of whom may have a different downlink bandwidth. In order to
tailor the encoded bitstreams to a heterogeneous network of receivers,
[scalable video coding][svc-wiki] can be used. The idea is to introduce
structure into the dependency graph of the encoded bitstream, such that _layers_ of
the full stream can be decoded using only available lower layers. This structure
allows for a [selective forwarding unit][sfu-webrtc-glossary] to discard upper
layers of the bitstream in order to achieve the intended downlink
bandwidth.

There are multiple types of scalability:

* _Temporal scalability_: layers whose framerate (and bitrate) is lower than that of the upper layer(s)
* _Spatial scalability_: layers whose resolution (and bitrate) is lower than that of the upper layer(s)
* _Quality scalability_: layers whose bitrate is lower than that of the upper layer(s)

WebRTC supports temporal scalability for `VP8`, `VP9` and `AV1`, and spatial
scalability for `VP9` and `AV1`.

### Simulcast

Simulcast is another approach for multiway conferencing, where multiple
_independent_ bitstreams are produced by the encoder.
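From the point of view of a selective forwarding unit, the two approaches differ roughly as sketched below. All names (`LayeredFrame`, `SimulcastStream`, `drop_upper_layers`, `pick_simulcast_stream`) are hypothetical and not part of WebRTC. The temporal layer pattern and the bitrates are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayeredFrame:
    """Hypothetical frame of a scalable stream, tagged with its temporal layer."""
    frame_id: int
    temporal_id: int


@dataclass(frozen=True)
class SimulcastStream:
    """Hypothetical independent encoding of the same source."""
    name: str
    bitrate_bps: int


def drop_upper_layers(frames, max_temporal_id):
    """Scalable coding: forward only frames at or below the chosen layer."""
    return [f for f in frames if f.temporal_id <= max_temporal_id]


def pick_simulcast_stream(streams, downlink_bps):
    """Simulcast: forward the highest-bitrate stream that fits the downlink.

    Falls back to the lowest-bitrate stream if none fits.
    """
    fitting = [s for s in streams if s.bitrate_bps <= downlink_bps]
    if fitting:
        return max(fitting, key=lambda s: s.bitrate_bps)
    return min(streams, key=lambda s: s.bitrate_bps)


# Assumed three-layer temporal pattern: layer ids 0, 2, 1, 2 repeating.
frames = [LayeredFrame(i, tid) for i, tid in enumerate([0, 2, 1, 2, 0, 2, 1, 2])]
half_rate = drop_upper_layers(frames, max_temporal_id=1)
print([f.frame_id for f in half_rate])  # prints [0, 2, 4, 6]

# Illustrative bitrates, not values used by WebRTC.
streams = [
    SimulcastStream("low", 150_000),
    SimulcastStream("mid", 500_000),
    SimulcastStream("high", 1_500_000),
]
print(pick_simulcast_stream(streams, downlink_bps=800_000).name)  # prints mid
```

Dropping frames in the first case is only safe because a scalable structure guarantees that lower layers never reference upper ones. In the second case the streams are independent, so the choice is simply which stream to forward.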

In cases where multiple encodings of the same source are required (e.g., uplink
transmission in a multiway call), spatial scalability with inter-layer
prediction generally offers superior coding efficiency compared with simulcast.
When a single encoding is required (e.g., downlink transmission in any call),
simulcast generally provides better coding efficiency for the upper spatial
layers. The `K-SVC` concept, where spatial inter-layer dependencies are only
used to encode key frames, for which inter-layer prediction is typically
significantly more effective than it is for delta frames, can be seen as a
compromise between full spatial scalability and simulcast.

## Overview of implementation in `modules/video_coding`

Given the general introduction to video coding above, we now describe some
specifics of the [`modules/video_coding`][modules-video-coding] folder in WebRTC.

### Built-in software codecs in [`modules/video_coding/codecs`][modules-video-coding-codecs]

This folder contains WebRTC-specific classes that wrap software codec
implementations for different video coding standards:

* [libaom][libaom-src] for [AV1][av1-spec]
* [libvpx][libvpx-src] for [VP8][vp8-spec] and [VP9][vp9-spec]
* [OpenH264][openh264-src] for [H.264 constrained baseline profile][h264-spec]

Users of the library can also inject their own codecs, using the
[VideoEncoderFactory][video-encoder-factory-interface] and
[VideoDecoderFactory][video-decoder-factory-interface] interfaces. This is how
platform-supported codecs, such as hardware-backed codecs, are implemented.

### Video codec test framework in [`modules/video_coding/codecs/test`][modules-video-coding-codecs-test]

This folder contains a test framework that can be used to evaluate video quality
performance of different video codec implementations.
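As an illustration of what evaluating video quality can mean in practice, the sketch below computes peak signal-to-noise ratio (PSNR), a widely used objective metric. This is an assumption made for illustration: the text above does not state which metrics the test framework reports, and the `psnr` function is a hypothetical helper, not part of the framework.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


# Four 8-bit luma samples before encoding and after decoding (made-up values).
original = [100, 120, 140, 160]
reconstructed = [101, 119, 142, 158]
print(f"{psnr(original, reconstructed):.2f} dB")
```

A higher value means the decoded samples are closer to the original. Comparing such a metric at equal bitrates is one way to compare codec implementations.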

### SVC helper classes in [`modules/video_coding/svc`][modules-video-coding-svc]

* [`ScalabilityStructure*`][scalabilitystructure] - different
  [standardized scalability structures][scalability-structure-spec]
* [`ScalableVideoController`][scalablevideocontroller] - instructs the video encoder on how
  to create a scalable stream
* [`SvcRateAllocator`][svcrateallocator] - allocates bitrate to different spatial and temporal
  layers

### Utility classes in [`modules/video_coding/utility`][modules-video-coding-utility]

* [`FrameDropper`][framedropper] - drops incoming frames when the encoder systematically
  overshoots its target bitrate
* [`FramerateController`][frameratecontroller] - drops incoming frames to achieve a target framerate
* [`QpParser`][qpparser] - parses the quantization parameter from a bitstream
* [`QualityScaler`][qualityscaler] - signals when an encoder generates encoded frames whose
  quantization parameter is outside the window of acceptable values
* [`SimulcastRateAllocator`][simulcastrateallocator] - allocates bitrate to simulcast layers

### General helper classes in [`modules/video_coding`][modules-video-coding]

* [`FecControllerDefault`][feccontrollerdefault] - provides a default implementation for rate
  allocation to [forward error correction][fec-wiki]
* [`VideoCodecInitializer`][videocodecinitializer] - converts between different encoder configuration
  structs

### Receiver buffer classes in [`modules/video_coding`][modules-video-coding]

* [`PacketBuffer`][packetbuffer] - (re-)combines RTP packets into frames
* [`RtpFrameReferenceFinder`][rtpframereferencefinder] - determines dependencies between frames based on information in the RTP header, payload header and RTP extensions
* [`FrameBuffer`][framebuffer] - orders frames based on their dependencies to be fed to the decoder

[video-coding-wiki]:
https://en.wikipedia.org/wiki/Video_coding_format
[motion-compensation-wiki]: https://en.wikipedia.org/wiki/Motion_compensation
[transform-coding-wiki]: https://en.wikipedia.org/wiki/Transform_coding
[motion-vector-wiki]: https://en.wikipedia.org/wiki/Motion_vector
[mpeg-wiki]: https://en.wikipedia.org/wiki/Moving_Picture_Experts_Group
[svc-wiki]: https://en.wikipedia.org/wiki/Scalable_Video_Coding
[sfu-webrtc-glossary]: https://webrtcglossary.com/sfu/
[libvpx-src]: https://chromium.googlesource.com/webm/libvpx/
[libaom-src]: https://aomedia.googlesource.com/aom/
[openh264-src]: https://github.com/cisco/openh264
[vp8-spec]: https://tools.ietf.org/html/rfc6386
[vp9-spec]: https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf
[av1-spec]: https://aomediacodec.github.io/av1-spec/
[h264-spec]: https://www.itu.int/rec/T-REC-H.264-201906-I/en
[video-encoder-factory-interface]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/api/video_codecs/video_encoder_factory.h;l=27;drc=afadfb24a5e608da6ae102b20b0add53a083dcf3
[video-decoder-factory-interface]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/api/video_codecs/video_decoder_factory.h;l=27;drc=49c293f03d8f593aa3aca282577fcb14daa63207
[scalability-structure-spec]: https://w3c.github.io/webrtc-svc/#scalabilitymodes*
[fec-wiki]: https://en.wikipedia.org/wiki/Error_correction_code#Forward_error_correction
[entropy-coding-wiki]: https://en.wikipedia.org/wiki/Entropy_encoding
[modules-video-coding]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/
[modules-video-coding-codecs]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/codecs/
[modules-video-coding-codecs-test]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/codecs/test/
[modules-video-coding-svc]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/svc/
[modules-video-coding-utility]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/utility/
[scalabilitystructure]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/svc/create_scalability_structure.h?q=CreateScalabilityStructure
[scalablevideocontroller]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/svc/scalable_video_controller.h?q=ScalableVideoController
[svcrateallocator]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/svc/svc_rate_allocator.h?q=SvcRateAllocator
[framedropper]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/utility/frame_dropper.h?q=FrameDropper
[frameratecontroller]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/utility/framerate_controller.h?q=FramerateController
[qpparser]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/utility/qp_parser.h?q=QpParser
[qualityscaler]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/utility/quality_scaler.h?q=QualityScaler
[simulcastrateallocator]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/utility/simulcast_rate_allocator.h?q=SimulcastRateAllocator
[feccontrollerdefault]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/fec_controller_default.h?q=FecControllerDefault
[videocodecinitializer]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/include/video_codec_initializer.h?q=VideoCodecInitializer
[packetbuffer]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/packet_buffer.h?q=PacketBuffer
[rtpframereferencefinder]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/rtp_frame_reference_finder.h?q=RtpFrameReferenceFinder
[framebuffer]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/frame_buffer2.h?q=FrameBuffer
[quantization-wiki]: https://en.wikipedia.org/wiki/Quantization_(signal_processing)