1*b2055c35SXin Li<!-- 2*b2055c35SXin Li 3*b2055c35SXin LiAlthough you may be viewing an alternate representation, this document is 4*b2055c35SXin Lisourced in Markdown, a light-duty markup scheme, and is optimized for the 5*b2055c35SXin Li[kramdown](https://kramdown.gettalong.org/) transformer. 6*b2055c35SXin Li 7*b2055c35SXin LiSee the accompanying specs_generation.md. External link targets are referenced 8*b2055c35SXin Liat the end of this file. 9*b2055c35SXin Li 10*b2055c35SXin Li--> 11*b2055c35SXin Li 12*b2055c35SXin LiSpecification for WebP Lossless Bitstream 13*b2055c35SXin Li========================================= 14*b2055c35SXin Li 15*b2055c35SXin Li_Jyrki Alakuijala, Ph.D., Google, Inc., 2023-03-09_ 16*b2055c35SXin Li 17*b2055c35SXin LiAbstract 18*b2055c35SXin Li-------- 19*b2055c35SXin Li 20*b2055c35SXin LiWebP lossless is an image format for lossless compression of ARGB images. The 21*b2055c35SXin Lilossless format stores and restores the pixel values exactly, including the 22*b2055c35SXin Licolor values for fully transparent pixels. A universal algorithm for sequential 23*b2055c35SXin Lidata compression (LZ77), prefix coding, and a color cache are used for 24*b2055c35SXin Licompression of the bulk data. Decoding speeds faster than PNG have been 25*b2055c35SXin Lidemonstrated, as well as 25% denser compression than can be achieved using 26*b2055c35SXin Litoday's PNG format. 27*b2055c35SXin Li 28*b2055c35SXin Li* TOC placeholder 29*b2055c35SXin Li{:toc} 30*b2055c35SXin Li 31*b2055c35SXin Li1 Introduction 32*b2055c35SXin Li-------------- 33*b2055c35SXin Li 34*b2055c35SXin LiThis document describes the compressed data representation of a WebP lossless 35*b2055c35SXin Liimage. It is intended as a detailed reference for the WebP lossless encoder and 36*b2055c35SXin Lidecoder implementation. 37*b2055c35SXin Li 38*b2055c35SXin LiIn this document, we extensively use C programming language syntax to describe 39*b2055c35SXin Lithe bitstream and assume the existence of a function for reading bits, 40*b2055c35SXin Li`ReadBits(n)`. The bytes are read in the natural order of the stream containing 41*b2055c35SXin Lithem, and bits of each byte are read in least-significant-bit-first order. When 42*b2055c35SXin Limultiple bits are read at the same time, the integer is constructed from the 43*b2055c35SXin Lioriginal data in the original order. The most significant bits of the returned 44*b2055c35SXin Liinteger are also the most significant bits of the original data. Thus, the 45*b2055c35SXin Listatement 46*b2055c35SXin Li 47*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 48*b2055c35SXin Lib = ReadBits(2); 49*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 50*b2055c35SXin Li 51*b2055c35SXin Liis equivalent with the two statements below: 52*b2055c35SXin Li 53*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 54*b2055c35SXin Lib = ReadBits(1); 55*b2055c35SXin Lib |= ReadBits(1) << 1; 56*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 57*b2055c35SXin Li 58*b2055c35SXin LiWe assume that each color component, that is, alpha, red, blue, and green, is 59*b2055c35SXin Lirepresented using an 8-bit byte. We define the corresponding type as uint8. A 60*b2055c35SXin Liwhole ARGB pixel is represented by a type called uint32, which is an unsigned 61*b2055c35SXin Liinteger consisting of 32 bits. In the code showing the behavior of the 62*b2055c35SXin Litransforms, these values are codified in the following bits: alpha in bits 63*b2055c35SXin Li31..24, red in bits 23..16, green in bits 15..8, and blue in bits 7..0; however, 64*b2055c35SXin Liimplementations of the format are free to use another representation internally. 65*b2055c35SXin Li 66*b2055c35SXin LiBroadly, a WebP lossless image contains header data, transform information, and 67*b2055c35SXin Liactual image data. Headers contain the width and height of the image. A WebP 68*b2055c35SXin Lilossless image can go through four different types of transforms before being 69*b2055c35SXin Lientropy encoded. The transform information in the bitstream contains the data 70*b2055c35SXin Lirequired to apply the respective inverse transforms. 71*b2055c35SXin Li 72*b2055c35SXin Li2 Nomenclature 73*b2055c35SXin Li-------------- 74*b2055c35SXin Li 75*b2055c35SXin LiARGB 76*b2055c35SXin Li: A pixel value consisting of alpha, red, green, and blue values. 77*b2055c35SXin Li 78*b2055c35SXin LiARGB image 79*b2055c35SXin Li: A two-dimensional array containing ARGB pixels. 80*b2055c35SXin Li 81*b2055c35SXin Licolor cache 82*b2055c35SXin Li: A small hash-addressed array to store recently used colors to be able to 83*b2055c35SXin Li recall them with shorter codes. 84*b2055c35SXin Li 85*b2055c35SXin Licolor indexing image 86*b2055c35SXin Li: A one-dimensional image of colors that can be indexed using a small integer 87*b2055c35SXin Li (up to 256 within WebP lossless). 88*b2055c35SXin Li 89*b2055c35SXin Licolor transform image 90*b2055c35SXin Li: A two-dimensional subresolution image containing data about correlations of 91*b2055c35SXin Li color components. 92*b2055c35SXin Li 93*b2055c35SXin Lidistance mapping 94*b2055c35SXin Li: Changes LZ77 distances to have the smallest values for pixels in 95*b2055c35SXin Li two-dimensional proximity. 96*b2055c35SXin Li 97*b2055c35SXin Lientropy image 98*b2055c35SXin Li: A two-dimensional subresolution image indicating which entropy coding should 99*b2055c35SXin Li be used in a respective square in the image, that is, each pixel is a meta 100*b2055c35SXin Li prefix code. 101*b2055c35SXin Li 102*b2055c35SXin LiLZ77 103*b2055c35SXin Li: A dictionary-based sliding window compression algorithm that either emits 104*b2055c35SXin Li symbols or describes them as sequences of past symbols. 105*b2055c35SXin Li 106*b2055c35SXin Limeta prefix code 107*b2055c35SXin Li: A small integer (up to 16 bits) that indexes an element in the meta prefix 108*b2055c35SXin Li table. 109*b2055c35SXin Li 110*b2055c35SXin Lipredictor image 111*b2055c35SXin Li: A two-dimensional subresolution image indicating which spatial predictor is 112*b2055c35SXin Li used for a particular square in the image. 113*b2055c35SXin Li 114*b2055c35SXin Liprefix code 115*b2055c35SXin Li: A classic way to do entropy coding where a smaller number of bits are used 116*b2055c35SXin Li for more frequent codes. 117*b2055c35SXin Li 118*b2055c35SXin Liprefix coding 119*b2055c35SXin Li: A way to entropy code larger integers, which codes a few bits of the integer 120*b2055c35SXin Li using an entropy code and codifies the remaining bits raw. This allows for 121*b2055c35SXin Li the descriptions of the entropy codes to remain relatively small even when 122*b2055c35SXin Li the range of symbols is large. 123*b2055c35SXin Li 124*b2055c35SXin Liscan-line order 125*b2055c35SXin Li: A processing order of pixels (left to right and top to bottom), starting 126*b2055c35SXin Li from the left-hand-top pixel. Once a row is completed, continue from the 127*b2055c35SXin Li left-hand column of the next row. 128*b2055c35SXin Li 129*b2055c35SXin Li3 RIFF Header 130*b2055c35SXin Li------------- 131*b2055c35SXin Li 132*b2055c35SXin LiThe beginning of the header has the RIFF container. This consists of the 133*b2055c35SXin Lifollowing 21 bytes: 134*b2055c35SXin Li 135*b2055c35SXin Li 1. String 'RIFF'. 136*b2055c35SXin Li 2. A little-endian, 32-bit value of the chunk length, which is the whole size 137*b2055c35SXin Li of the chunk controlled by the RIFF header. Normally, this equals 138*b2055c35SXin Li the payload size (file size minus 8 bytes: 4 bytes for the 'RIFF' 139*b2055c35SXin Li identifier and 4 bytes for storing the value itself). 140*b2055c35SXin Li 3. String 'WEBP' (RIFF container name). 141*b2055c35SXin Li 4. String 'VP8L' (FourCC for lossless-encoded image data). 142*b2055c35SXin Li 5. A little-endian, 32-bit value of the number of bytes in the 143*b2055c35SXin Li lossless stream. 144*b2055c35SXin Li 6. 1-byte signature 0x2f. 145*b2055c35SXin Li 146*b2055c35SXin LiThe first 28 bits of the bitstream specify the width and height of the image. 147*b2055c35SXin LiWidth and height are decoded as 14-bit integers as follows: 148*b2055c35SXin Li 149*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 150*b2055c35SXin Liint image_width = ReadBits(14) + 1; 151*b2055c35SXin Liint image_height = ReadBits(14) + 1; 152*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 153*b2055c35SXin Li 154*b2055c35SXin LiThe 14-bit precision for image width and height limits the maximum size of a 155*b2055c35SXin LiWebP lossless image to 16384✕16384 pixels. 156*b2055c35SXin Li 157*b2055c35SXin LiThe alpha_is_used bit is a hint only, and should not impact decoding. It should 158*b2055c35SXin Libe set to 0 when all alpha values are 255 in the picture, and 1 otherwise. 159*b2055c35SXin Li 160*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 161*b2055c35SXin Liint alpha_is_used = ReadBits(1); 162*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 163*b2055c35SXin Li 164*b2055c35SXin LiThe version_number is a 3 bit code that must be set to 0. Any other value should 165*b2055c35SXin Libe treated as an error. 166*b2055c35SXin Li 167*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 168*b2055c35SXin Liint version_number = ReadBits(3); 169*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 170*b2055c35SXin Li 171*b2055c35SXin Li4 Transforms 172*b2055c35SXin Li------------ 173*b2055c35SXin Li 174*b2055c35SXin LiThe transforms are reversible manipulations of the image data that can reduce 175*b2055c35SXin Lithe remaining symbolic entropy by modeling spatial and color correlations. They 176*b2055c35SXin Lican make the final compression more dense. 177*b2055c35SXin Li 178*b2055c35SXin LiAn image can go through four types of transforms. A 1 bit indicates the 179*b2055c35SXin Lipresence of a transform. Each transform is allowed to be used only once. The 180*b2055c35SXin Litransforms are used only for the main-level ARGB image; the subresolution images 181*b2055c35SXin Li(color transform image, entropy image, and predictor image) have no transforms, 182*b2055c35SXin Linot even the 0 bit indicating the end of transforms. 183*b2055c35SXin Li 184*b2055c35SXin LiTypically, an encoder would use these transforms to reduce the Shannon entropy 185*b2055c35SXin Liin the residual image. Also, the transform data can be decided based on entropy 186*b2055c35SXin Liminimization. 187*b2055c35SXin Li 188*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 189*b2055c35SXin Liwhile (ReadBits(1)) { // Transform present. 190*b2055c35SXin Li // Decode transform type. 191*b2055c35SXin Li enum TransformType transform_type = ReadBits(2); 192*b2055c35SXin Li // Decode transform data. 193*b2055c35SXin Li ... 194*b2055c35SXin Li} 195*b2055c35SXin Li 196*b2055c35SXin Li// Decode actual image data (Section 5). 197*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 198*b2055c35SXin Li 199*b2055c35SXin LiIf a transform is present, then the next two bits specify the transform type. 200*b2055c35SXin LiThere are four types of transforms. 201*b2055c35SXin Li 202*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 203*b2055c35SXin Lienum TransformType { 204*b2055c35SXin Li PREDICTOR_TRANSFORM = 0, 205*b2055c35SXin Li COLOR_TRANSFORM = 1, 206*b2055c35SXin Li SUBTRACT_GREEN_TRANSFORM = 2, 207*b2055c35SXin Li COLOR_INDEXING_TRANSFORM = 3, 208*b2055c35SXin Li}; 209*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 210*b2055c35SXin Li 211*b2055c35SXin LiThe transform type is followed by the transform data. Transform data contains 212*b2055c35SXin Lithe information required to apply the inverse transform and depends on the 213*b2055c35SXin Litransform type. The inverse transforms are applied in the reverse order that 214*b2055c35SXin Lithey are read from the bitstream, that is, last one first. 215*b2055c35SXin Li 216*b2055c35SXin LiNext, we describe the transform data for different types. 217*b2055c35SXin Li 218*b2055c35SXin Li### 4.1 Predictor Transform 219*b2055c35SXin Li 220*b2055c35SXin LiThe predictor transform can be used to reduce entropy by exploiting the fact 221*b2055c35SXin Lithat neighboring pixels are often correlated. In the predictor transform, the 222*b2055c35SXin Licurrent pixel value is predicted from the pixels already decoded (in scan-line 223*b2055c35SXin Liorder) and only the residual value (actual - predicted) is encoded. The green 224*b2055c35SXin Licomponent of a pixel defines which of the 14 predictors is used within a 225*b2055c35SXin Liparticular block of the ARGB image. The _prediction mode_ determines the type of 226*b2055c35SXin Liprediction to use. We divide the image into squares, and all the pixels in a 227*b2055c35SXin Lisquare use the same prediction mode. 228*b2055c35SXin Li 229*b2055c35SXin LiThe first 3 bits of prediction data define the block width and height in number 230*b2055c35SXin Liof bits. 231*b2055c35SXin Li 232*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 233*b2055c35SXin Liint size_bits = ReadBits(3) + 2; 234*b2055c35SXin Liint block_width = (1 << size_bits); 235*b2055c35SXin Liint block_height = (1 << size_bits); 236*b2055c35SXin Li#define DIV_ROUND_UP(num, den) (((num) + (den) - 1) / (den)) 237*b2055c35SXin Liint transform_width = DIV_ROUND_UP(image_width, 1 << size_bits); 238*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 239*b2055c35SXin Li 240*b2055c35SXin LiThe transform data contains the prediction mode for each block of the image. It 241*b2055c35SXin Liis a subresolution image where the green component of a pixel defines which of 242*b2055c35SXin Lithe 14 predictors is used for all the `block_width * block_height` pixels within 243*b2055c35SXin Lia particular block of the ARGB image. This subresolution image is encoded using 244*b2055c35SXin Lithe same techniques described in [Chapter 5](#image-data). 245*b2055c35SXin Li 246*b2055c35SXin LiThe number of block columns, `transform_width`, is used in two-dimensional 247*b2055c35SXin Liindexing. For a pixel (x, y), one can compute the respective filter block 248*b2055c35SXin Liaddress by: 249*b2055c35SXin Li 250*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 251*b2055c35SXin Liint block_index = (y >> size_bits) * transform_width + 252*b2055c35SXin Li (x >> size_bits); 253*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 254*b2055c35SXin Li 255*b2055c35SXin LiThere are 14 different prediction modes. In each prediction mode, the current 256*b2055c35SXin Lipixel value is predicted from one or more neighboring pixels whose values are 257*b2055c35SXin Lialready known. 258*b2055c35SXin Li 259*b2055c35SXin LiWe chose the neighboring pixels (TL, T, TR, and L) of the current pixel (P) as 260*b2055c35SXin Lifollows: 261*b2055c35SXin Li 262*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 263*b2055c35SXin LiO O O O O O O O O O O 264*b2055c35SXin LiO O O O O O O O O O O 265*b2055c35SXin LiO O O O TL T TR O O O O 266*b2055c35SXin LiO O O O L P X X X X X 267*b2055c35SXin LiX X X X X X X X X X X 268*b2055c35SXin LiX X X X X X X X X X X 269*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 270*b2055c35SXin Li 271*b2055c35SXin Liwhere TL means top-left, T means top, TR means top-right, and L means left. At 272*b2055c35SXin Lithe time of predicting a value for P, all O, TL, T, TR and L pixels have already 273*b2055c35SXin Libeen processed, and the P pixel and all X pixels are unknown. 274*b2055c35SXin Li 275*b2055c35SXin LiGiven the preceding neighboring pixels, the different prediction modes are 276*b2055c35SXin Lidefined as follows. 277*b2055c35SXin Li 278*b2055c35SXin Li| Mode | Predicted value of each channel of the current pixel | 279*b2055c35SXin Li| ------ | ------------------------------------------------------- | 280*b2055c35SXin Li| 0 | 0xff000000 (represents solid black color in ARGB) | 281*b2055c35SXin Li| 1 | L | 282*b2055c35SXin Li| 2 | T | 283*b2055c35SXin Li| 3 | TR | 284*b2055c35SXin Li| 4 | TL | 285*b2055c35SXin Li| 5 | Average2(Average2(L, TR), T) | 286*b2055c35SXin Li| 6 | Average2(L, TL) | 287*b2055c35SXin Li| 7 | Average2(L, T) | 288*b2055c35SXin Li| 8 | Average2(TL, T) | 289*b2055c35SXin Li| 9 | Average2(T, TR) | 290*b2055c35SXin Li| 10 | Average2(Average2(L, TL), Average2(T, TR)) | 291*b2055c35SXin Li| 11 | Select(L, T, TL) | 292*b2055c35SXin Li| 12 | ClampAddSubtractFull(L, T, TL) | 293*b2055c35SXin Li| 13 | ClampAddSubtractHalf(Average2(L, T), TL) | 294*b2055c35SXin Li 295*b2055c35SXin Li 296*b2055c35SXin Li`Average2` is defined as follows for each ARGB component: 297*b2055c35SXin Li 298*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 299*b2055c35SXin Liuint8 Average2(uint8 a, uint8 b) { 300*b2055c35SXin Li return (a + b) / 2; 301*b2055c35SXin Li} 302*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 303*b2055c35SXin Li 304*b2055c35SXin LiThe Select predictor is defined as follows: 305*b2055c35SXin Li 306*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 307*b2055c35SXin Liuint32 Select(uint32 L, uint32 T, uint32 TL) { 308*b2055c35SXin Li // L = left pixel, T = top pixel, TL = top-left pixel. 309*b2055c35SXin Li 310*b2055c35SXin Li // ARGB component estimates for prediction. 311*b2055c35SXin Li int pAlpha = ALPHA(L) + ALPHA(T) - ALPHA(TL); 312*b2055c35SXin Li int pRed = RED(L) + RED(T) - RED(TL); 313*b2055c35SXin Li int pGreen = GREEN(L) + GREEN(T) - GREEN(TL); 314*b2055c35SXin Li int pBlue = BLUE(L) + BLUE(T) - BLUE(TL); 315*b2055c35SXin Li 316*b2055c35SXin Li // Manhattan distances to estimates for left and top pixels. 317*b2055c35SXin Li int pL = abs(pAlpha - ALPHA(L)) + abs(pRed - RED(L)) + 318*b2055c35SXin Li abs(pGreen - GREEN(L)) + abs(pBlue - BLUE(L)); 319*b2055c35SXin Li int pT = abs(pAlpha - ALPHA(T)) + abs(pRed - RED(T)) + 320*b2055c35SXin Li abs(pGreen - GREEN(T)) + abs(pBlue - BLUE(T)); 321*b2055c35SXin Li 322*b2055c35SXin Li // Return either left or top, the one closer to the prediction. 323*b2055c35SXin Li if (pL < pT) { 324*b2055c35SXin Li return L; 325*b2055c35SXin Li } else { 326*b2055c35SXin Li return T; 327*b2055c35SXin Li } 328*b2055c35SXin Li} 329*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 330*b2055c35SXin Li 331*b2055c35SXin LiThe functions `ClampAddSubtractFull` and `ClampAddSubtractHalf` are performed 332*b2055c35SXin Lifor each ARGB component as follows: 333*b2055c35SXin Li 334*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 335*b2055c35SXin Li// Clamp the input value between 0 and 255. 336*b2055c35SXin Liint Clamp(int a) { 337*b2055c35SXin Li return (a < 0) ? 0 : (a > 255) ? 255 : a; 338*b2055c35SXin Li} 339*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 340*b2055c35SXin Li 341*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 342*b2055c35SXin Liint ClampAddSubtractFull(int a, int b, int c) { 343*b2055c35SXin Li return Clamp(a + b - c); 344*b2055c35SXin Li} 345*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 346*b2055c35SXin Li 347*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 348*b2055c35SXin Liint ClampAddSubtractHalf(int a, int b) { 349*b2055c35SXin Li return Clamp(a + (a - b) / 2); 350*b2055c35SXin Li} 351*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 352*b2055c35SXin Li 353*b2055c35SXin LiThere are special handling rules for some border pixels. If there is a 354*b2055c35SXin Liprediction transform, regardless of the mode \[0..13\] for these pixels, the 355*b2055c35SXin Lipredicted value for the left-topmost pixel of the image is 0xff000000, all 356*b2055c35SXin Lipixels on the top row are L-pixel, and all pixels on the leftmost column are 357*b2055c35SXin LiT-pixel. 358*b2055c35SXin Li 359*b2055c35SXin LiAddressing the TR-pixel for pixels on the rightmost column is 360*b2055c35SXin Liexceptional. The pixels on the rightmost column are predicted by using the modes 361*b2055c35SXin Li\[0..13\], just like pixels not on the border, but the leftmost pixel on the 362*b2055c35SXin Lisame row as the current pixel is instead used as the TR-pixel. 363*b2055c35SXin Li 364*b2055c35SXin LiThe final pixel value is obtained by adding each channel of the predicted value 365*b2055c35SXin Lito the encoded residual value. 366*b2055c35SXin Li 367*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 368*b2055c35SXin Livoid PredictorTransformOutput(uint32 residual, uint32 pred, 369*b2055c35SXin Li uint8* alpha, uint8* red, 370*b2055c35SXin Li uint8* green, uint8* blue) { 371*b2055c35SXin Li *alpha = ALPHA(residual) + ALPHA(pred); 372*b2055c35SXin Li *red = RED(residual) + RED(pred); 373*b2055c35SXin Li *green = GREEN(residual) + GREEN(pred); 374*b2055c35SXin Li *blue = BLUE(residual) + BLUE(pred); 375*b2055c35SXin Li} 376*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 377*b2055c35SXin Li 378*b2055c35SXin Li### 4.2 Color Transform 379*b2055c35SXin Li 380*b2055c35SXin LiThe goal of the color transform is to decorrelate the R, G, and B values of each 381*b2055c35SXin Lipixel. The color transform keeps the green (G) value as it is, transforms the 382*b2055c35SXin Lired (R) value based on the green value, and transforms the blue (B) value based 383*b2055c35SXin Lion the green value and then on the red value. 384*b2055c35SXin Li 385*b2055c35SXin LiAs is the case for the predictor transform, first the image is divided into 386*b2055c35SXin Liblocks, and the same transform mode is used for all the pixels in a block. For 387*b2055c35SXin Lieach block, there are three types of color transform elements. 388*b2055c35SXin Li 389*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 390*b2055c35SXin Litypedef struct { 391*b2055c35SXin Li uint8 green_to_red; 392*b2055c35SXin Li uint8 green_to_blue; 393*b2055c35SXin Li uint8 red_to_blue; 394*b2055c35SXin Li} ColorTransformElement; 395*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 396*b2055c35SXin Li 397*b2055c35SXin LiThe actual color transform is done by defining a color transform delta. The 398*b2055c35SXin Licolor transform delta depends on the `ColorTransformElement`, which is the same 399*b2055c35SXin Lifor all the pixels in a particular block. The delta is subtracted during the 400*b2055c35SXin Licolor transform. The inverse color transform then is just adding those deltas. 401*b2055c35SXin Li 402*b2055c35SXin LiThe color transform function is defined as follows: 403*b2055c35SXin Li 404*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 405*b2055c35SXin Livoid ColorTransform(uint8 red, uint8 blue, uint8 green, 406*b2055c35SXin Li ColorTransformElement *trans, 407*b2055c35SXin Li uint8 *new_red, uint8 *new_blue) { 408*b2055c35SXin Li // Transformed values of red and blue components 409*b2055c35SXin Li int tmp_red = red; 410*b2055c35SXin Li int tmp_blue = blue; 411*b2055c35SXin Li 412*b2055c35SXin Li // Applying the transform is just subtracting the transform deltas 413*b2055c35SXin Li tmp_red -= ColorTransformDelta(trans->green_to_red, green); 414*b2055c35SXin Li tmp_blue -= ColorTransformDelta(trans->green_to_blue, green); 415*b2055c35SXin Li tmp_blue -= ColorTransformDelta(trans->red_to_blue, red); 416*b2055c35SXin Li 417*b2055c35SXin Li *new_red = tmp_red & 0xff; 418*b2055c35SXin Li *new_blue = tmp_blue & 0xff; 419*b2055c35SXin Li} 420*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 421*b2055c35SXin Li 422*b2055c35SXin Li`ColorTransformDelta` is computed using a signed 8-bit integer representing a 423*b2055c35SXin Li3.5-fixed-point number and a signed 8-bit RGB color channel (c) \[-128..127\] 424*b2055c35SXin Liand is defined as follows: 425*b2055c35SXin Li 426*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 427*b2055c35SXin Liint8 ColorTransformDelta(int8 t, int8 c) { 428*b2055c35SXin Li return (t * c) >> 5; 429*b2055c35SXin Li} 430*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 431*b2055c35SXin Li 432*b2055c35SXin LiA conversion from the 8-bit unsigned representation (uint8) to the 8-bit signed 433*b2055c35SXin Lione (int8) is required before calling `ColorTransformDelta()`. The signed value 434*b2055c35SXin Lishould be interpreted as an 8-bit two's complement number (that is: uint8 range 435*b2055c35SXin Li\[128..255\] is mapped to the \[-128..-1\] range of its converted int8 value). 436*b2055c35SXin Li 437*b2055c35SXin LiThe multiplication is to be done using more precision (with at least 16-bit 438*b2055c35SXin Liprecision). The sign extension property of the shift operation does not matter 439*b2055c35SXin Lihere; only the lowest 8 bits are used from the result, and there the sign 440*b2055c35SXin Liextension shifting and unsigned shifting are consistent with each other. 441*b2055c35SXin Li 442*b2055c35SXin LiNow, we describe the contents of color transform data so that decoding can apply 443*b2055c35SXin Lithe inverse color transform and recover the original red and blue values. The 444*b2055c35SXin Lifirst 3 bits of the color transform data contain the width and height of the 445*b2055c35SXin Liimage block in number of bits, just like the predictor transform: 446*b2055c35SXin Li 447*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 448*b2055c35SXin Liint size_bits = ReadBits(3) + 2; 449*b2055c35SXin Liint block_width = 1 << size_bits; 450*b2055c35SXin Liint block_height = 1 << size_bits; 451*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 452*b2055c35SXin Li 453*b2055c35SXin LiThe remaining part of the color transform data contains `ColorTransformElement` 454*b2055c35SXin Liinstances, corresponding to each block of the image. Each 455*b2055c35SXin Li`ColorTransformElement` `'cte'` is treated as a pixel in a subresolution image 456*b2055c35SXin Liwhose alpha component is `255`, red component is `cte.red_to_blue`, green 457*b2055c35SXin Licomponent is `cte.green_to_blue`, and blue component is `cte.green_to_red`. 458*b2055c35SXin Li 459*b2055c35SXin LiDuring decoding, `ColorTransformElement` instances of the blocks are decoded and 460*b2055c35SXin Lithe inverse color transform is applied on the ARGB values of the pixels. As 461*b2055c35SXin Limentioned earlier, that inverse color transform is just adding 462*b2055c35SXin Li`ColorTransformElement` values to the red and blue channels. The alpha and green 463*b2055c35SXin Lichannels are left as is. 464*b2055c35SXin Li 465*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 466*b2055c35SXin Livoid InverseTransform(uint8 red, uint8 green, uint8 blue, 467*b2055c35SXin Li ColorTransformElement *trans, 468*b2055c35SXin Li uint8 *new_red, uint8 *new_blue) { 469*b2055c35SXin Li // Transformed values of red and blue components 470*b2055c35SXin Li int tmp_red = red; 471*b2055c35SXin Li int tmp_blue = blue; 472*b2055c35SXin Li 473*b2055c35SXin Li // Applying the inverse transform is just adding the 474*b2055c35SXin Li // color transform deltas 475*b2055c35SXin Li tmp_red += ColorTransformDelta(trans->green_to_red, green); 476*b2055c35SXin Li tmp_blue += ColorTransformDelta(trans->green_to_blue, green); 477*b2055c35SXin Li tmp_blue += 478*b2055c35SXin Li ColorTransformDelta(trans->red_to_blue, tmp_red & 0xff); 479*b2055c35SXin Li 480*b2055c35SXin Li *new_red = tmp_red & 0xff; 481*b2055c35SXin Li *new_blue = tmp_blue & 0xff; 482*b2055c35SXin Li} 483*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 484*b2055c35SXin Li 485*b2055c35SXin Li### 4.3 Subtract Green Transform 486*b2055c35SXin Li 487*b2055c35SXin LiThe subtract green transform subtracts green values from red and blue values of 488*b2055c35SXin Lieach pixel. When this transform is present, the decoder needs to add the green 489*b2055c35SXin Livalue to both the red and blue values. There is no data associated with this 490*b2055c35SXin Litransform. The decoder applies the inverse transform as follows: 491*b2055c35SXin Li 492*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 493*b2055c35SXin Livoid AddGreenToBlueAndRed(uint8 green, uint8 *red, uint8 *blue) { 494*b2055c35SXin Li *red = (*red + green) & 0xff; 495*b2055c35SXin Li *blue = (*blue + green) & 0xff; 496*b2055c35SXin Li} 497*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 498*b2055c35SXin Li 499*b2055c35SXin LiThis transform is redundant, as it can be modeled using the color transform, but 500*b2055c35SXin Lisince there is no additional data here, the subtract green transform can be 501*b2055c35SXin Licoded using fewer bits than a full-blown color transform. 502*b2055c35SXin Li 503*b2055c35SXin Li### 4.4 Color Indexing Transform 504*b2055c35SXin Li 505*b2055c35SXin LiIf there are not many unique pixel values, it may be more efficient to create a 506*b2055c35SXin Licolor index array and replace the pixel values by the array's indices. The color 507*b2055c35SXin Liindexing transform achieves this. (In the context of WebP lossless, we 508*b2055c35SXin Lispecifically do not call this a palette transform because a similar but more 509*b2055c35SXin Lidynamic concept exists in WebP lossless encoding: color cache.) 510*b2055c35SXin Li 511*b2055c35SXin LiThe color indexing transform checks for the number of unique ARGB values in the 512*b2055c35SXin Liimage. If that number is below a threshold (256), it creates an array of those 513*b2055c35SXin LiARGB values, which is then used to replace the pixel values with the 514*b2055c35SXin Licorresponding index: the green channel of the pixels are replaced with the 515*b2055c35SXin Liindex, all alpha values are set to 255, and all red and blue values to 0. 516*b2055c35SXin Li 517*b2055c35SXin LiThe transform data contains the color table size and the entries in the color 518*b2055c35SXin Litable. The decoder reads the color indexing transform data as follows: 519*b2055c35SXin Li 520*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 521*b2055c35SXin Li// 8-bit value for the color table size 522*b2055c35SXin Liint color_table_size = ReadBits(8) + 1; 523*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 524*b2055c35SXin Li 525*b2055c35SXin LiThe color table is stored using the image storage format itself. The color table 526*b2055c35SXin Lican be obtained by reading an image, without the RIFF header, image size, and 527*b2055c35SXin Litransforms, assuming the height of 1 pixel and the width of `color_table_size`. 528*b2055c35SXin LiThe color table is always subtraction-coded to reduce image entropy. The deltas 529*b2055c35SXin Liof palette colors contain typically much less entropy than the colors 530*b2055c35SXin Lithemselves, leading to significant savings for smaller images. In decoding, 531*b2055c35SXin Lievery final color in the color table can be obtained by adding the previous 532*b2055c35SXin Licolor component values by each ARGB component separately and storing the least 533*b2055c35SXin Lisignificant 8 bits of the result. 534*b2055c35SXin Li 535*b2055c35SXin LiThe inverse transform for the image is simply replacing the pixel values (which 536*b2055c35SXin Liare indices to the color table) with the actual color table values. The indexing 537*b2055c35SXin Liis done based on the green component of the ARGB color. 538*b2055c35SXin Li 539*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 540*b2055c35SXin Li// Inverse transform 541*b2055c35SXin Liargb = color_table[GREEN(argb)]; 542*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 543*b2055c35SXin Li 544*b2055c35SXin LiIf the index is equal to or larger than `color_table_size`, the argb color value 545*b2055c35SXin Lishould be set to 0x00000000 (transparent black). 546*b2055c35SXin Li 547*b2055c35SXin LiWhen the color table is small (equal to or less than 16 colors), several pixels 548*b2055c35SXin Liare bundled into a single pixel. The pixel bundling packs several (2, 4, or 8) 549*b2055c35SXin Lipixels into a single pixel, reducing the image width respectively. Pixel 550*b2055c35SXin Libundling allows for a more efficient joint distribution entropy coding of 551*b2055c35SXin Lineighboring pixels and gives some arithmetic coding-like benefits to the 552*b2055c35SXin Lientropy code, but it can only be used when there are 16 or fewer unique values. 553*b2055c35SXin Li 554*b2055c35SXin Li`color_table_size` specifies how many pixels are combined: 555*b2055c35SXin Li 556*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 557*b2055c35SXin Liint width_bits; 558*b2055c35SXin Liif (color_table_size <= 2) { 559*b2055c35SXin Li width_bits = 3; 560*b2055c35SXin Li} else if (color_table_size <= 4) { 561*b2055c35SXin Li width_bits = 2; 562*b2055c35SXin Li} else if (color_table_size <= 16) { 563*b2055c35SXin Li width_bits = 1; 564*b2055c35SXin Li} else { 565*b2055c35SXin Li width_bits = 0; 566*b2055c35SXin Li} 567*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 568*b2055c35SXin Li 569*b2055c35SXin Li`width_bits` has a value of 0, 1, 2, or 3. A value of 0 indicates no pixel 570*b2055c35SXin Libundling is to be done for the image. A value of 1 indicates that two pixels are 571*b2055c35SXin Licombined, and each pixel has a range of \[0..15\]. A value of 2 indicates that 572*b2055c35SXin Lifour pixels are combined, and each pixel has a range of \[0..3\]. A value of 3 573*b2055c35SXin Liindicates that eight pixels are combined and each pixel has a range of \[0..1\], 574*b2055c35SXin Lithat is, a binary value. 575*b2055c35SXin Li 576*b2055c35SXin LiThe values are packed into the green component as follows: 577*b2055c35SXin Li 578*b2055c35SXin Li * `width_bits` = 1: For every x value, where x ≡ 0 (mod 2), a green 579*b2055c35SXin Li value at x is positioned into the 4 least significant bits of the 580*b2055c35SXin Li green value at x / 2, and a green value at x + 1 is positioned into the 581*b2055c35SXin Li 4 most significant bits of the green value at x / 2. 582*b2055c35SXin Li * `width_bits` = 2: For every x value, where x ≡ 0 (mod 4), a green 583*b2055c35SXin Li value at x is positioned into the 2 least-significant bits of the 584*b2055c35SXin Li green value at x / 4, and green values at x + 1 to x + 3 are positioned in 585*b2055c35SXin Li order to the more significant bits of the green value at x / 4. 586*b2055c35SXin Li * `width_bits` = 3: For every x value, where x ≡ 0 (mod 8), a green 587*b2055c35SXin Li value at x is positioned into the least significant bit of the green 588*b2055c35SXin Li value at x / 8, and green values at x + 1 to x + 7 are positioned in order 589*b2055c35SXin Li to the more significant bits of the green value at x / 8. 590*b2055c35SXin Li 591*b2055c35SXin LiAfter reading this transform, `image_width` is subsampled by `width_bits`. This 592*b2055c35SXin Liaffects the size of subsequent transforms. The new size can be calculated using 593*b2055c35SXin Li`DIV_ROUND_UP`, as defined [earlier](#predictor-transform). 594*b2055c35SXin Li 595*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 596*b2055c35SXin Liimage_width = DIV_ROUND_UP(image_width, 1 << width_bits); 597*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 598*b2055c35SXin Li 599*b2055c35SXin Li5 Image Data 600*b2055c35SXin Li------------ 601*b2055c35SXin Li 602*b2055c35SXin LiImage data is an array of pixel values in scan-line order. 603*b2055c35SXin Li 604*b2055c35SXin Li### 5.1 Roles of Image Data 605*b2055c35SXin Li 606*b2055c35SXin LiWe use image data in five different roles: 607*b2055c35SXin Li 608*b2055c35SXin Li 1. ARGB image: Stores the actual pixels of the image. 609*b2055c35SXin Li 1. Entropy image: Stores the meta prefix codes (see 610*b2055c35SXin Li ["Decoding of Meta Prefix Codes"](#decoding-of-meta-prefix-codes)). 611*b2055c35SXin Li 1. Predictor image: Stores the metadata for the predictor transform (see 612*b2055c35SXin Li ["Predictor Transform"](#predictor-transform)). 613*b2055c35SXin Li 1. Color transform image: Created by `ColorTransformElement` values 614*b2055c35SXin Li (defined in ["Color Transform"](#color-transform)) for different blocks of 615*b2055c35SXin Li the image. 616*b2055c35SXin Li 1. Color indexing image: An array of size `color_table_size` (up to 256 ARGB 617*b2055c35SXin Li values) storing the metadata for the color indexing transform (see 618*b2055c35SXin Li ["Color Indexing Transform"](#color-indexing-transform)). 619*b2055c35SXin Li 620*b2055c35SXin Li### 5.2 Encoding of Image Data 621*b2055c35SXin Li 622*b2055c35SXin LiThe encoding of image data is independent of its role. 623*b2055c35SXin Li 624*b2055c35SXin LiThe image is first divided into a set of fixed-size blocks (typically 16x16 625*b2055c35SXin Liblocks). Each of these blocks are modeled using their own entropy codes. Also, 626*b2055c35SXin Liseveral blocks may share the same entropy codes. 627*b2055c35SXin Li 628*b2055c35SXin Li**Rationale:** Storing an entropy code incurs a cost. This cost can be minimized 629*b2055c35SXin Liif statistically similar blocks share an entropy code, thereby storing that code 630*b2055c35SXin Lionly once. For example, an encoder can find similar blocks by clustering them 631*b2055c35SXin Liusing their statistical properties or by repeatedly joining a pair of randomly 632*b2055c35SXin Liselected clusters when it reduces the overall amount of bits needed to encode 633*b2055c35SXin Lithe image. 634*b2055c35SXin Li 635*b2055c35SXin LiEach pixel is encoded using one of the three possible methods: 636*b2055c35SXin Li 637*b2055c35SXin Li 1. Prefix-coded literals: Each channel (green, red, blue, and alpha) is 638*b2055c35SXin Li entropy-coded independently. 639*b2055c35SXin Li 2. LZ77 backward reference: A sequence of pixels are copied from elsewhere in 640*b2055c35SXin Li the image. 641*b2055c35SXin Li 3. Color cache code: Using a short multiplicative hash code (color cache 642*b2055c35SXin Li index) of a recently seen color. 643*b2055c35SXin Li 644*b2055c35SXin LiThe following subsections describe each of these in detail. 645*b2055c35SXin Li 646*b2055c35SXin Li#### 5.2.1 Prefix-Coded Literals 647*b2055c35SXin Li 648*b2055c35SXin LiThe pixel is stored as prefix-coded values of green, red, blue, and alpha (in 649*b2055c35SXin Lithat order). See [Section 6.2.3](#decoding-entropy-coded-image-data) for 650*b2055c35SXin Lidetails. 651*b2055c35SXin Li 652*b2055c35SXin Li#### 5.2.2 LZ77 Backward Reference 653*b2055c35SXin Li 654*b2055c35SXin LiBackward references are tuples of _length_ and _distance code_: 655*b2055c35SXin Li 656*b2055c35SXin Li * Length indicates how many pixels in scan-line order are to be copied. 657*b2055c35SXin Li * Distance code is a number indicating the position of a previously seen 658*b2055c35SXin Li pixel, from which the pixels are to be copied. The exact mapping is 659*b2055c35SXin Li described [below](#distance-mapping). 660*b2055c35SXin Li 661*b2055c35SXin LiThe length and distance values are stored using **LZ77 prefix coding**. 662*b2055c35SXin Li 663*b2055c35SXin LiLZ77 prefix coding divides large integer values into two parts: the _prefix 664*b2055c35SXin Licode_ and the _extra bits_. The prefix code is stored using an entropy code, 665*b2055c35SXin Liwhile the extra bits are stored as they are (without an entropy code). 666*b2055c35SXin Li 667*b2055c35SXin Li**Rationale**: This approach reduces the storage requirement for the entropy 668*b2055c35SXin Licode. Also, large values are usually rare, so extra bits would be used for very 669*b2055c35SXin Lifew values in the image. Thus, this approach results in better compression 670*b2055c35SXin Lioverall. 671*b2055c35SXin Li 672*b2055c35SXin LiThe following table denotes the prefix codes and extra bits used for storing 673*b2055c35SXin Lidifferent ranges of values. 674*b2055c35SXin Li 675*b2055c35SXin LiNote: The maximum backward reference length is limited to 4096. Hence, only the 676*b2055c35SXin Lifirst 24 prefix codes (with the respective extra bits) are meaningful for length 677*b2055c35SXin Livalues. For distance values, however, all the 40 prefix codes are valid. 678*b2055c35SXin Li 679*b2055c35SXin Li| Value range | Prefix code | Extra bits | 680*b2055c35SXin Li| --------------- | ----------- | ---------- | 681*b2055c35SXin Li| 1 | 0 | 0 | 682*b2055c35SXin Li| 2 | 1 | 0 | 683*b2055c35SXin Li| 3 | 2 | 0 | 684*b2055c35SXin Li| 4 | 3 | 0 | 685*b2055c35SXin Li| 5..6 | 4 | 1 | 686*b2055c35SXin Li| 7..8 | 5 | 1 | 687*b2055c35SXin Li| 9..12 | 6 | 2 | 688*b2055c35SXin Li| 13..16 | 7 | 2 | 689*b2055c35SXin Li| ... | ... | ... | 690*b2055c35SXin Li| 3072..4096 | 23 | 10 | 691*b2055c35SXin Li| ... | ... | ... | 692*b2055c35SXin Li| 524289..786432 | 38 | 18 | 693*b2055c35SXin Li| 786433..1048576 | 39 | 18 | 694*b2055c35SXin Li 695*b2055c35SXin LiThe pseudocode to obtain a (length or distance) value from the prefix code is as 696*b2055c35SXin Lifollows: 697*b2055c35SXin Li 698*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 699*b2055c35SXin Liif (prefix_code < 4) { 700*b2055c35SXin Li return prefix_code + 1; 701*b2055c35SXin Li} 702*b2055c35SXin Liint extra_bits = (prefix_code - 2) >> 1; 703*b2055c35SXin Liint offset = (2 + (prefix_code & 1)) << extra_bits; 704*b2055c35SXin Lireturn offset + ReadBits(extra_bits) + 1; 705*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 706*b2055c35SXin Li 707*b2055c35SXin Li##### Distance Mapping 708*b2055c35SXin Li 709*b2055c35SXin LiAs noted previously, a distance code is a number indicating the position of a 710*b2055c35SXin Lipreviously seen pixel, from which the pixels are to be copied. This subsection 711*b2055c35SXin Lidefines the mapping between a distance code and the position of a previous 712*b2055c35SXin Lipixel. 713*b2055c35SXin Li 714*b2055c35SXin LiDistance codes larger than 120 denote the pixel distance in scan-line order, 715*b2055c35SXin Lioffset by 120. 716*b2055c35SXin Li 717*b2055c35SXin LiThe smallest distance codes \[1..120\] are special and are reserved for a close 718*b2055c35SXin Lineighborhood of the current pixel. This neighborhood consists of 120 pixels: 719*b2055c35SXin Li 720*b2055c35SXin Li * Pixels that are 1 to 7 rows above the current pixel and are up to 8 columns 721*b2055c35SXin Li to the left or up to 7 columns to the right of the current pixel. \[Total 722*b2055c35SXin Li such pixels = `7 * (8 + 1 + 7) = 112`\]. 723*b2055c35SXin Li * Pixels that are in the same row as the current pixel and are up to 8 724*b2055c35SXin Li columns to the left of the current pixel. \[`8` such pixels\]. 725*b2055c35SXin Li 726*b2055c35SXin LiThe mapping between distance code `distance_code` and the neighboring pixel 727*b2055c35SXin Lioffset `(xi, yi)` is as follows: 728*b2055c35SXin Li 729*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 730*b2055c35SXin Li(0, 1), (1, 0), (1, 1), (-1, 1), (0, 2), (2, 0), (1, 2), 731*b2055c35SXin Li(-1, 2), (2, 1), (-2, 1), (2, 2), (-2, 2), (0, 3), (3, 0), 732*b2055c35SXin Li(1, 3), (-1, 3), (3, 1), (-3, 1), (2, 3), (-2, 3), (3, 2), 733*b2055c35SXin Li(-3, 2), (0, 4), (4, 0), (1, 4), (-1, 4), (4, 1), (-4, 1), 734*b2055c35SXin Li(3, 3), (-3, 3), (2, 4), (-2, 4), (4, 2), (-4, 2), (0, 5), 735*b2055c35SXin Li(3, 4), (-3, 4), (4, 3), (-4, 3), (5, 0), (1, 5), (-1, 5), 736*b2055c35SXin Li(5, 1), (-5, 1), (2, 5), (-2, 5), (5, 2), (-5, 2), (4, 4), 737*b2055c35SXin Li(-4, 4), (3, 5), (-3, 5), (5, 3), (-5, 3), (0, 6), (6, 0), 738*b2055c35SXin Li(1, 6), (-1, 6), (6, 1), (-6, 1), (2, 6), (-2, 6), (6, 2), 739*b2055c35SXin Li(-6, 2), (4, 5), (-4, 5), (5, 4), (-5, 4), (3, 6), (-3, 6), 740*b2055c35SXin Li(6, 3), (-6, 3), (0, 7), (7, 0), (1, 7), (-1, 7), (5, 5), 741*b2055c35SXin Li(-5, 5), (7, 1), (-7, 1), (4, 6), (-4, 6), (6, 4), (-6, 4), 742*b2055c35SXin Li(2, 7), (-2, 7), (7, 2), (-7, 2), (3, 7), (-3, 7), (7, 3), 743*b2055c35SXin Li(-7, 3), (5, 6), (-5, 6), (6, 5), (-6, 5), (8, 0), (4, 7), 744*b2055c35SXin Li(-4, 7), (7, 4), (-7, 4), (8, 1), (8, 2), (6, 6), (-6, 6), 745*b2055c35SXin Li(8, 3), (5, 7), (-5, 7), (7, 5), (-7, 5), (8, 4), (6, 7), 746*b2055c35SXin Li(-6, 7), (7, 6), (-7, 6), (8, 5), (7, 7), (-7, 7), (8, 6), 747*b2055c35SXin Li(8, 7) 748*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 749*b2055c35SXin Li 750*b2055c35SXin LiFor example, the distance code `1` indicates an offset of `(0, 1)` for the 751*b2055c35SXin Lineighboring pixel, that is, the pixel above the current pixel (0 pixel 752*b2055c35SXin Lidifference in the X direction and 1 pixel difference in the Y direction). 753*b2055c35SXin LiSimilarly, the distance code `3` indicates the top-left pixel. 754*b2055c35SXin Li 755*b2055c35SXin LiThe decoder can convert a distance code `distance_code` to a scan-line order 756*b2055c35SXin Lidistance `dist` as follows: 757*b2055c35SXin Li 758*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 759*b2055c35SXin Li(xi, yi) = distance_map[distance_code - 1] 760*b2055c35SXin Lidist = xi + yi * image_width 761*b2055c35SXin Liif (dist < 1) { 762*b2055c35SXin Li dist = 1 763*b2055c35SXin Li} 764*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 765*b2055c35SXin Li 766*b2055c35SXin Liwhere `distance_map` is the mapping noted above, and `image_width` is the width 767*b2055c35SXin Liof the image in pixels. 768*b2055c35SXin Li 769*b2055c35SXin Li#### 5.2.3 Color Cache Coding 770*b2055c35SXin Li{:#color-cache-code} 771*b2055c35SXin Li 772*b2055c35SXin LiColor cache stores a set of colors that have been recently used in the image. 773*b2055c35SXin Li 774*b2055c35SXin Li**Rationale:** This way, the recently used colors can sometimes be referred to 775*b2055c35SXin Limore efficiently than emitting them using the other two methods (described in 776*b2055c35SXin LiSections [5.2.1](#prefix-coded-literals) and [5.2.2](#lz77-backward-reference)). 777*b2055c35SXin Li 778*b2055c35SXin LiColor cache codes are stored as follows. First, there is a 1-bit value that 779*b2055c35SXin Liindicates if the color cache is used. If this bit is 0, no color cache codes 780*b2055c35SXin Liexist, and they are not transmitted in the prefix code that decodes the green 781*b2055c35SXin Lisymbols and the length prefix codes. However, if this bit is 1, the color cache 782*b2055c35SXin Lisize is read next: 783*b2055c35SXin Li 784*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 785*b2055c35SXin Liint color_cache_code_bits = ReadBits(4); 786*b2055c35SXin Liint color_cache_size = 1 << color_cache_code_bits; 787*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 788*b2055c35SXin Li 789*b2055c35SXin Li`color_cache_code_bits` defines the size of the color cache (`1 << 790*b2055c35SXin Licolor_cache_code_bits`). The range of allowed values for 791*b2055c35SXin Li`color_cache_code_bits` is \[1..11\]. Compliant decoders must indicate a 792*b2055c35SXin Licorrupted bitstream for other values. 793*b2055c35SXin Li 794*b2055c35SXin LiA color cache is an array of size `color_cache_size`. Each entry stores one ARGB 795*b2055c35SXin Licolor. Colors are looked up by indexing them by `(0x1e35a7bd * color) >> (32 - 796*b2055c35SXin Licolor_cache_code_bits)`. Only one lookup is done in a color cache; there is no 797*b2055c35SXin Liconflict resolution. 798*b2055c35SXin Li 799*b2055c35SXin LiIn the beginning of decoding or encoding of an image, all entries in all color 800*b2055c35SXin Licache values are set to zero. The color cache code is converted to this color at 801*b2055c35SXin Lidecoding time. The state of the color cache is maintained by inserting every 802*b2055c35SXin Lipixel, be it produced by backward referencing or as literals, into the cache in 803*b2055c35SXin Lithe order they appear in the stream. 804*b2055c35SXin Li 805*b2055c35SXin Li6 Entropy Code 806*b2055c35SXin Li-------------- 807*b2055c35SXin Li 808*b2055c35SXin Li### 6.1 Overview 809*b2055c35SXin Li 810*b2055c35SXin LiMost of the data is coded using a [canonical prefix code][canonical_huff]. 811*b2055c35SXin LiHence, the codes are transmitted by sending the _prefix code lengths_, as 812*b2055c35SXin Liopposed to the actual _prefix codes_. 813*b2055c35SXin Li 814*b2055c35SXin LiIn particular, the format uses **spatially variant prefix coding**. In other 815*b2055c35SXin Liwords, different blocks of the image can potentially use different entropy 816*b2055c35SXin Licodes. 817*b2055c35SXin Li 818*b2055c35SXin Li**Rationale**: Different areas of the image may have different characteristics. 819*b2055c35SXin LiSo, allowing them to use different entropy codes provides more flexibility and 820*b2055c35SXin Lipotentially better compression. 821*b2055c35SXin Li 822*b2055c35SXin Li### 6.2 Details 823*b2055c35SXin Li 824*b2055c35SXin LiThe encoded image data consists of several parts: 825*b2055c35SXin Li 826*b2055c35SXin Li 1. Decoding and building the prefix codes. 827*b2055c35SXin Li 1. Meta prefix codes. 828*b2055c35SXin Li 1. Entropy-coded image data. 829*b2055c35SXin Li 830*b2055c35SXin LiFor any given pixel (x, y), there is a set of five prefix codes associated with 831*b2055c35SXin Liit. These codes are (in bitstream order): 832*b2055c35SXin Li 833*b2055c35SXin Li * **Prefix code #1**: Used for green channel, backward-reference length, and 834*b2055c35SXin Li color cache. 835*b2055c35SXin Li * **Prefix code #2, #3, and #4**: Used for red, blue, and alpha channels, 836*b2055c35SXin Li respectively. 837*b2055c35SXin Li * **Prefix code #5**: Used for backward-reference distance. 838*b2055c35SXin Li 839*b2055c35SXin LiFrom here on, we refer to this set as a **prefix code group**. 840*b2055c35SXin Li 841*b2055c35SXin Li#### 6.2.1 Decoding and Building the Prefix Codes 842*b2055c35SXin Li 843*b2055c35SXin LiThis section describes how to read the prefix code lengths from the bitstream. 844*b2055c35SXin Li 845*b2055c35SXin LiThe prefix code lengths can be coded in two ways. The method used is specified 846*b2055c35SXin Liby a 1-bit value. 847*b2055c35SXin Li 848*b2055c35SXin Li * If this bit is 1, it is a _simple code length code_. 849*b2055c35SXin Li * If this bit is 0, it is a _normal code length code_. 850*b2055c35SXin Li 851*b2055c35SXin LiIn both cases, there can be unused code lengths that are still part of the 852*b2055c35SXin Listream. This may be inefficient, but it is allowed by the format. 853*b2055c35SXin LiThe described tree must be a complete binary tree. A single leaf node is 854*b2055c35SXin Liconsidered a complete binary tree and can be encoded using either the simple 855*b2055c35SXin Licode length code or the normal code length code. When coding a single leaf 856*b2055c35SXin Linode using the _normal code length code_, all but one code length are zeros, 857*b2055c35SXin Liand the single leaf node value is marked with the length of 1 -- even when no 858*b2055c35SXin Libits are consumed when that single leaf node tree is used. 859*b2055c35SXin Li 860*b2055c35SXin Li##### Simple Code Length Code 861*b2055c35SXin Li 862*b2055c35SXin LiThis variant is used in the special case when only 1 or 2 prefix symbols are in 863*b2055c35SXin Lithe range \[0..255\] with code length `1`. All other prefix code lengths are 864*b2055c35SXin Liimplicitly zeros. 865*b2055c35SXin Li 866*b2055c35SXin LiThe first bit indicates the number of symbols: 867*b2055c35SXin Li 868*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 869*b2055c35SXin Liint num_symbols = ReadBits(1) + 1; 870*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 871*b2055c35SXin Li 872*b2055c35SXin LiThe following are the symbol values. 873*b2055c35SXin Li 874*b2055c35SXin LiThis first symbol is coded using 1 or 8 bits, depending on the value of 875*b2055c35SXin Li`is_first_8bits`. The range is \[0..1\] or \[0..255\], respectively. The second 876*b2055c35SXin Lisymbol, if present, is always assumed to be in the range \[0..255\] and coded 877*b2055c35SXin Liusing 8 bits. 878*b2055c35SXin Li 879*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 880*b2055c35SXin Liint is_first_8bits = ReadBits(1); 881*b2055c35SXin Lisymbol0 = ReadBits(1 + 7 * is_first_8bits); 882*b2055c35SXin Licode_lengths[symbol0] = 1; 883*b2055c35SXin Liif (num_symbols == 2) { 884*b2055c35SXin Li symbol1 = ReadBits(8); 885*b2055c35SXin Li code_lengths[symbol1] = 1; 886*b2055c35SXin Li} 887*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 888*b2055c35SXin Li 889*b2055c35SXin LiThe two symbols should be different. Duplicate symbols are allowed, but 890*b2055c35SXin Liinefficient. 891*b2055c35SXin Li 892*b2055c35SXin Li**Note:** Another special case is when _all_ prefix code lengths are _zeros_ (an 893*b2055c35SXin Liempty prefix code). For example, a prefix code for distance can be empty if 894*b2055c35SXin Lithere are no backward references. Similarly, prefix codes for alpha, red, and 895*b2055c35SXin Liblue can be empty if all pixels within the same meta prefix code are produced 896*b2055c35SXin Liusing the color cache. However, this case doesn't need special handling, as 897*b2055c35SXin Liempty prefix codes can be coded as those containing a single symbol `0`. 898*b2055c35SXin Li 899*b2055c35SXin Li##### Normal Code Length Code 900*b2055c35SXin Li 901*b2055c35SXin LiThe code lengths of the prefix code fit in 8 bits and are read as follows. 902*b2055c35SXin LiFirst, `num_code_lengths` specifies the number of code lengths. 903*b2055c35SXin Li 904*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 905*b2055c35SXin Liint num_code_lengths = 4 + ReadBits(4); 906*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 907*b2055c35SXin Li 908*b2055c35SXin LiThe code lengths are themselves encoded using prefix codes; lower-level code 909*b2055c35SXin Lilengths, `code_length_code_lengths`, first have to be read. The rest of those 910*b2055c35SXin Li`code_length_code_lengths` (according to the order in `kCodeLengthCodeOrder`) 911*b2055c35SXin Liare zeros. 912*b2055c35SXin Li 913*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 914*b2055c35SXin Liint kCodeLengthCodes = 19; 915*b2055c35SXin Liint kCodeLengthCodeOrder[kCodeLengthCodes] = { 916*b2055c35SXin Li 17, 18, 0, 1, 2, 3, 4, 5, 16, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 917*b2055c35SXin Li}; 918*b2055c35SXin Liint code_length_code_lengths[kCodeLengthCodes] = { 0 }; // All zeros 919*b2055c35SXin Lifor (i = 0; i < num_code_lengths; ++i) { 920*b2055c35SXin Li code_length_code_lengths[kCodeLengthCodeOrder[i]] = ReadBits(3); 921*b2055c35SXin Li} 922*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 923*b2055c35SXin Li 924*b2055c35SXin LiNext, if `ReadBits(1) == 0`, the maximum number of different read symbols 925*b2055c35SXin Li(`max_symbol`) for each symbol type (A, R, G, B, and distance) is set to its 926*b2055c35SXin Lialphabet size: 927*b2055c35SXin Li 928*b2055c35SXin Li * G channel: 256 + 24 + `color_cache_size` 929*b2055c35SXin Li * Other literals (A, R, and B): 256 930*b2055c35SXin Li * Distance code: 40 931*b2055c35SXin Li 932*b2055c35SXin LiOtherwise, it is defined as: 933*b2055c35SXin Li 934*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 935*b2055c35SXin Liint length_nbits = 2 + 2 * ReadBits(3); 936*b2055c35SXin Liint max_symbol = 2 + ReadBits(length_nbits); 937*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 938*b2055c35SXin Li 939*b2055c35SXin LiIf `max_symbol` is larger than the size of the alphabet for the symbol type, the 940*b2055c35SXin Libitstream is invalid. 941*b2055c35SXin Li 942*b2055c35SXin LiA prefix table is then built from `code_length_code_lengths` and used to read up 943*b2055c35SXin Lito `max_symbol` code lengths. 944*b2055c35SXin Li 945*b2055c35SXin Li * Code \[0..15\] indicates literal code lengths. 946*b2055c35SXin Li * Value 0 means no symbols have been coded. 947*b2055c35SXin Li * Values \[1..15\] indicate the bit length of the respective code. 948*b2055c35SXin Li * Code 16 repeats the previous nonzero value \[3..6\] times, that is, 949*b2055c35SXin Li `3 + ReadBits(2)` times. If code 16 is used before a nonzero 950*b2055c35SXin Li value has been emitted, a value of 8 is repeated. 951*b2055c35SXin Li * Code 17 emits a streak of zeros of length \[3..10\], that is, `3 + 952*b2055c35SXin Li ReadBits(3)` times. 953*b2055c35SXin Li * Code 18 emits a streak of zeros of length \[11..138\], that is, 954*b2055c35SXin Li `11 + ReadBits(7)` times. 955*b2055c35SXin Li 956*b2055c35SXin LiOnce code lengths are read, a prefix code for each symbol type (A, R, G, B, and 957*b2055c35SXin Lidistance) is formed using their respective alphabet sizes. 958*b2055c35SXin Li 959*b2055c35SXin LiThe Normal Code Length Code must code a full decision tree, that is, the sum of 960*b2055c35SXin Li`2 ^ (-length)` for all non-zero codes must be exactly one. There is however 961*b2055c35SXin Lione exception to this rule, the single leaf node tree, where the leaf node 962*b2055c35SXin Livalue is marked with value 1 and other values are 0s. 963*b2055c35SXin Li 964*b2055c35SXin Li#### 6.2.2 Decoding of Meta Prefix Codes 965*b2055c35SXin Li 966*b2055c35SXin LiAs noted earlier, the format allows the use of different prefix codes for 967*b2055c35SXin Lidifferent blocks of the image. _Meta prefix codes_ are indexes identifying which 968*b2055c35SXin Liprefix codes to use in different parts of the image. 969*b2055c35SXin Li 970*b2055c35SXin LiMeta prefix codes may be used _only_ when the image is being used in the 971*b2055c35SXin Li[role](#roles-of-image-data) of an _ARGB image_. 972*b2055c35SXin Li 973*b2055c35SXin LiThere are two possibilities for the meta prefix codes, indicated by a 1-bit 974*b2055c35SXin Livalue: 975*b2055c35SXin Li 976*b2055c35SXin Li * If this bit is zero, there is only one meta prefix code used everywhere in 977*b2055c35SXin Li the image. No more data is stored. 978*b2055c35SXin Li * If this bit is one, the image uses multiple meta prefix codes. These meta 979*b2055c35SXin Li prefix codes are stored as an _entropy image_ (described below). 980*b2055c35SXin Li 981*b2055c35SXin LiThe red and green components of a pixel define a 16-bit meta prefix code used in 982*b2055c35SXin Lia particular block of the ARGB image. 983*b2055c35SXin Li 984*b2055c35SXin Li##### Entropy Image 985*b2055c35SXin Li 986*b2055c35SXin LiThe entropy image defines which prefix codes are used in different parts of the 987*b2055c35SXin Liimage. 988*b2055c35SXin Li 989*b2055c35SXin LiThe first 3 bits contain the `prefix_bits` value. The dimensions of the entropy 990*b2055c35SXin Liimage are derived from `prefix_bits`: 991*b2055c35SXin Li 992*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 993*b2055c35SXin Liint prefix_bits = ReadBits(3) + 2; 994*b2055c35SXin Liint prefix_image_width = 995*b2055c35SXin Li DIV_ROUND_UP(image_width, 1 << prefix_bits); 996*b2055c35SXin Liint prefix_image_height = 997*b2055c35SXin Li DIV_ROUND_UP(image_height, 1 << prefix_bits); 998*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 999*b2055c35SXin Li 1000*b2055c35SXin Liwhere `DIV_ROUND_UP` is as defined [earlier](#predictor-transform). 1001*b2055c35SXin Li 1002*b2055c35SXin LiThe next bits contain an entropy image of width `prefix_image_width` and height 1003*b2055c35SXin Li`prefix_image_height`. 1004*b2055c35SXin Li 1005*b2055c35SXin Li##### Interpretation of Meta Prefix Codes 1006*b2055c35SXin Li 1007*b2055c35SXin LiThe number of prefix code groups in the ARGB image can be obtained by finding 1008*b2055c35SXin Lithe _largest meta prefix code_ from the entropy image: 1009*b2055c35SXin Li 1010*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1011*b2055c35SXin Liint num_prefix_groups = max(entropy image) + 1; 1012*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1013*b2055c35SXin Liwhere `max(entropy image)` indicates the largest prefix code stored in the 1014*b2055c35SXin Lientropy image. 1015*b2055c35SXin Li 1016*b2055c35SXin LiAs each prefix code group contains five prefix codes, the total number of prefix 1017*b2055c35SXin Licodes is: 1018*b2055c35SXin Li 1019*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1020*b2055c35SXin Liint num_prefix_codes = 5 * num_prefix_groups; 1021*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1022*b2055c35SXin Li 1023*b2055c35SXin LiGiven a pixel (x, y) in the ARGB image, we can obtain the corresponding prefix 1024*b2055c35SXin Licodes to be used as follows: 1025*b2055c35SXin Li 1026*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1027*b2055c35SXin Liint position = 1028*b2055c35SXin Li (y >> prefix_bits) * prefix_image_width + (x >> prefix_bits); 1029*b2055c35SXin Liint meta_prefix_code = (entropy_image[position] >> 8) & 0xffff; 1030*b2055c35SXin LiPrefixCodeGroup prefix_group = prefix_code_groups[meta_prefix_code]; 1031*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1032*b2055c35SXin Li 1033*b2055c35SXin Liwhere we have assumed the existence of `PrefixCodeGroup` structure, which 1034*b2055c35SXin Lirepresents a set of five prefix codes. Also, `prefix_code_groups` is an array of 1035*b2055c35SXin Li`PrefixCodeGroup` (of size `num_prefix_groups`). 1036*b2055c35SXin Li 1037*b2055c35SXin LiThe decoder then uses prefix code group `prefix_group` to decode the pixel 1038*b2055c35SXin Li(x, y), as explained in ["Decoding Entropy-Coded Image 1039*b2055c35SXin LiData"](#decoding-entropy-coded-image-data). 1040*b2055c35SXin Li 1041*b2055c35SXin Li#### 6.2.3 Decoding Entropy-Coded Image Data 1042*b2055c35SXin Li 1043*b2055c35SXin LiFor the current position (x, y) in the image, the decoder first identifies the 1044*b2055c35SXin Licorresponding prefix code group (as explained in the last section). Given the 1045*b2055c35SXin Liprefix code group, the pixel is read and decoded as follows. 1046*b2055c35SXin Li 1047*b2055c35SXin LiNext, read the symbol S from the bitstream using prefix code #1. Note that S is 1048*b2055c35SXin Liany integer in the range `0` to 1049*b2055c35SXin Li`(256 + 24 + ` [`color_cache_size`](#color-cache-code)` - 1)`. 1050*b2055c35SXin Li 1051*b2055c35SXin LiThe interpretation of S depends on its value: 1052*b2055c35SXin Li 1053*b2055c35SXin Li 1. If S < 256 1054*b2055c35SXin Li 1. Use S as the green component. 1055*b2055c35SXin Li 1. Read red from the bitstream using prefix code #2. 1056*b2055c35SXin Li 1. Read blue from the bitstream using prefix code #3. 1057*b2055c35SXin Li 1. Read alpha from the bitstream using prefix code #4. 1058*b2055c35SXin Li 1. If S >= 256 & S < 256 + 24 1059*b2055c35SXin Li 1. Use S - 256 as a length prefix code. 1060*b2055c35SXin Li 1. Read extra bits for the length from the bitstream. 1061*b2055c35SXin Li 1. Determine backward-reference length L from length prefix code and the 1062*b2055c35SXin Li extra bits read. 1063*b2055c35SXin Li 1. Read the distance prefix code from the bitstream using prefix code #5. 1064*b2055c35SXin Li 1. Read extra bits for the distance from the bitstream. 1065*b2055c35SXin Li 1. Determine backward-reference distance D from the distance prefix code 1066*b2055c35SXin Li and the extra bits read. 1067*b2055c35SXin Li 1. Copy L pixels (in scan-line order) from the sequence of pixels starting 1068*b2055c35SXin Li at the current position minus D pixels. 1069*b2055c35SXin Li 1. If S >= 256 + 24 1070*b2055c35SXin Li 1. Use S - (256 + 24) as the index into the color cache. 1071*b2055c35SXin Li 1. Get ARGB color from the color cache at that index. 1072*b2055c35SXin Li 1073*b2055c35SXin Li7 Overall Structure of the Format 1074*b2055c35SXin Li--------------------------------- 1075*b2055c35SXin Li 1076*b2055c35SXin LiBelow is a view into the format in Augmented Backus-Naur Form (ABNF) 1077*b2055c35SXin Li[RFC 5234][] [RFC 7405][]. It does not cover all details. The end-of-image (EOI) 1078*b2055c35SXin Liis only implicitly coded into the number of pixels (image_width * image_height). 1079*b2055c35SXin Li 1080*b2055c35SXin LiNote that `*element` means `element` can be repeated 0 or more times. `5element` 1081*b2055c35SXin Limeans `element` is repeated exactly 5 times. `%b` represents a binary value. 1082*b2055c35SXin Li 1083*b2055c35SXin Li#### 7.1 Basic Structure 1084*b2055c35SXin Li 1085*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1086*b2055c35SXin Liformat = RIFF-header image-header image-stream 1087*b2055c35SXin LiRIFF-header = %s"RIFF" 4OCTET %s"WEBPVP8L" 4OCTET 1088*b2055c35SXin Liimage-header = %x2F image-size alpha-is-used version 1089*b2055c35SXin Liimage-size = 14BIT 14BIT ; width - 1, height - 1 1090*b2055c35SXin Lialpha-is-used = 1BIT 1091*b2055c35SXin Liversion = 3BIT ; 0 1092*b2055c35SXin Liimage-stream = optional-transform spatially-coded-image 1093*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1094*b2055c35SXin Li 1095*b2055c35SXin Li#### 7.2 Structure of Transforms 1096*b2055c35SXin Li 1097*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1098*b2055c35SXin Lioptional-transform = (%b1 transform optional-transform) / %b0 1099*b2055c35SXin Litransform = predictor-tx / color-tx / subtract-green-tx 1100*b2055c35SXin Litransform =/ color-indexing-tx 1101*b2055c35SXin Li 1102*b2055c35SXin Lipredictor-tx = %b00 predictor-image 1103*b2055c35SXin Lipredictor-image = 3BIT ; sub-pixel code 1104*b2055c35SXin Li entropy-coded-image 1105*b2055c35SXin Li 1106*b2055c35SXin Licolor-tx = %b01 color-image 1107*b2055c35SXin Licolor-image = 3BIT ; sub-pixel code 1108*b2055c35SXin Li entropy-coded-image 1109*b2055c35SXin Li 1110*b2055c35SXin Lisubtract-green-tx = %b10 1111*b2055c35SXin Li 1112*b2055c35SXin Licolor-indexing-tx = %b11 color-indexing-image 1113*b2055c35SXin Licolor-indexing-image = 8BIT ; color count 1114*b2055c35SXin Li entropy-coded-image 1115*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1116*b2055c35SXin Li 1117*b2055c35SXin Li#### 7.3 Structure of the Image Data 1118*b2055c35SXin Li 1119*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1120*b2055c35SXin Lispatially-coded-image = color-cache-info meta-prefix data 1121*b2055c35SXin Lientropy-coded-image = color-cache-info data 1122*b2055c35SXin Li 1123*b2055c35SXin Licolor-cache-info = %b0 1124*b2055c35SXin Licolor-cache-info =/ (%b1 4BIT) ; 1 followed by color cache size 1125*b2055c35SXin Li 1126*b2055c35SXin Limeta-prefix = %b0 / (%b1 entropy-image) 1127*b2055c35SXin Li 1128*b2055c35SXin Lidata = prefix-codes lz77-coded-image 1129*b2055c35SXin Lientropy-image = 3BIT ; subsample value 1130*b2055c35SXin Li entropy-coded-image 1131*b2055c35SXin Li 1132*b2055c35SXin Liprefix-codes = prefix-code-group *prefix-codes 1133*b2055c35SXin Liprefix-code-group = 1134*b2055c35SXin Li 5prefix-code ; See "Interpretation of Meta Prefix Codes" to 1135*b2055c35SXin Li ; understand what each of these five prefix 1136*b2055c35SXin Li ; codes are for. 1137*b2055c35SXin Li 1138*b2055c35SXin Liprefix-code = simple-prefix-code / normal-prefix-code 1139*b2055c35SXin Lisimple-prefix-code = ; see "Simple Code Length Code" for details 1140*b2055c35SXin Linormal-prefix-code = ; see "Normal Code Length Code" for details 1141*b2055c35SXin Li 1142*b2055c35SXin Lilz77-coded-image = 1143*b2055c35SXin Li *((argb-pixel / lz77-copy / color-cache-code) lz77-coded-image) 1144*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1145*b2055c35SXin Li 1146*b2055c35SXin LiThe following is a possible example sequence: 1147*b2055c35SXin Li 1148*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1149*b2055c35SXin LiRIFF-header image-size %b1 subtract-green-tx 1150*b2055c35SXin Li%b1 predictor-tx %b0 color-cache-info 1151*b2055c35SXin Li%b0 prefix-codes lz77-coded-image 1152*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1153*b2055c35SXin Li 1154*b2055c35SXin Li[RFC 5234]: https://www.rfc-editor.org/rfc/rfc5234 1155*b2055c35SXin Li[RFC 7405]: https://www.rfc-editor.org/rfc/rfc7405 1156*b2055c35SXin Li[canonical_huff]: https://en.wikipedia.org/wiki/Canonical_Huffman_code 1157