xref: /aosp_15_r20/external/webp/doc/webp-lossless-bitstream-spec.txt (revision b2055c353e87c8814eb2b6b1b11112a1562253bd)
1*b2055c35SXin Li<!--
2*b2055c35SXin Li
3*b2055c35SXin LiAlthough you may be viewing an alternate representation, this document is
4*b2055c35SXin Lisourced in Markdown, a light-duty markup scheme, and is optimized for the
5*b2055c35SXin Li[kramdown](https://kramdown.gettalong.org/) transformer.
6*b2055c35SXin Li
7*b2055c35SXin LiSee the accompanying specs_generation.md. External link targets are referenced
8*b2055c35SXin Liat the end of this file.
9*b2055c35SXin Li
10*b2055c35SXin Li-->
11*b2055c35SXin Li
12*b2055c35SXin LiSpecification for WebP Lossless Bitstream
13*b2055c35SXin Li=========================================
14*b2055c35SXin Li
15*b2055c35SXin Li_Jyrki Alakuijala, Ph.D., Google, Inc., 2023-03-09_
16*b2055c35SXin Li
17*b2055c35SXin LiAbstract
18*b2055c35SXin Li--------
19*b2055c35SXin Li
20*b2055c35SXin LiWebP lossless is an image format for lossless compression of ARGB images. The
21*b2055c35SXin Lilossless format stores and restores the pixel values exactly, including the
22*b2055c35SXin Licolor values for fully transparent pixels. A universal algorithm for sequential
23*b2055c35SXin Lidata compression (LZ77), prefix coding, and a color cache are used for
24*b2055c35SXin Licompression of the bulk data. Decoding speeds faster than PNG have been
25*b2055c35SXin Lidemonstrated, as well as 25% denser compression than can be achieved using
26*b2055c35SXin Litoday's PNG format.
27*b2055c35SXin Li
28*b2055c35SXin Li* TOC placeholder
29*b2055c35SXin Li{:toc}
30*b2055c35SXin Li
31*b2055c35SXin Li1 Introduction
32*b2055c35SXin Li--------------
33*b2055c35SXin Li
34*b2055c35SXin LiThis document describes the compressed data representation of a WebP lossless
35*b2055c35SXin Liimage. It is intended as a detailed reference for the WebP lossless encoder and
36*b2055c35SXin Lidecoder implementation.
37*b2055c35SXin Li
38*b2055c35SXin LiIn this document, we extensively use C programming language syntax to describe
39*b2055c35SXin Lithe bitstream and assume the existence of a function for reading bits,
40*b2055c35SXin Li`ReadBits(n)`. The bytes are read in the natural order of the stream containing
41*b2055c35SXin Lithem, and bits of each byte are read in least-significant-bit-first order. When
42*b2055c35SXin Limultiple bits are read at the same time, the integer is constructed from the
43*b2055c35SXin Lioriginal data in the original order. The most significant bits of the returned
44*b2055c35SXin Liinteger are also the most significant bits of the original data. Thus, the
45*b2055c35SXin Listatement
46*b2055c35SXin Li
47*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48*b2055c35SXin Lib = ReadBits(2);
49*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50*b2055c35SXin Li
51*b2055c35SXin Liis equivalent with the two statements below:
52*b2055c35SXin Li
53*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54*b2055c35SXin Lib = ReadBits(1);
55*b2055c35SXin Lib |= ReadBits(1) << 1;
56*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
57*b2055c35SXin Li
58*b2055c35SXin LiWe assume that each color component, that is, alpha, red, blue, and green, is
59*b2055c35SXin Lirepresented using an 8-bit byte. We define the corresponding type as uint8. A
60*b2055c35SXin Liwhole ARGB pixel is represented by a type called uint32, which is an unsigned
61*b2055c35SXin Liinteger consisting of 32 bits. In the code showing the behavior of the
62*b2055c35SXin Litransforms, these values are codified in the following bits: alpha in bits
63*b2055c35SXin Li31..24, red in bits 23..16, green in bits 15..8, and blue in bits 7..0; however,
64*b2055c35SXin Liimplementations of the format are free to use another representation internally.
65*b2055c35SXin Li
66*b2055c35SXin LiBroadly, a WebP lossless image contains header data, transform information, and
67*b2055c35SXin Liactual image data. Headers contain the width and height of the image. A WebP
68*b2055c35SXin Lilossless image can go through four different types of transforms before being
69*b2055c35SXin Lientropy encoded. The transform information in the bitstream contains the data
70*b2055c35SXin Lirequired to apply the respective inverse transforms.
71*b2055c35SXin Li
72*b2055c35SXin Li2 Nomenclature
73*b2055c35SXin Li--------------
74*b2055c35SXin Li
75*b2055c35SXin LiARGB
76*b2055c35SXin Li:   A pixel value consisting of alpha, red, green, and blue values.
77*b2055c35SXin Li
78*b2055c35SXin LiARGB image
79*b2055c35SXin Li:   A two-dimensional array containing ARGB pixels.
80*b2055c35SXin Li
81*b2055c35SXin Licolor cache
82*b2055c35SXin Li:   A small hash-addressed array to store recently used colors to be able to
83*b2055c35SXin Li    recall them with shorter codes.
84*b2055c35SXin Li
85*b2055c35SXin Licolor indexing image
86*b2055c35SXin Li:   A one-dimensional image of colors that can be indexed using a small integer
87*b2055c35SXin Li    (up to 256 within WebP lossless).
88*b2055c35SXin Li
89*b2055c35SXin Licolor transform image
90*b2055c35SXin Li:   A two-dimensional subresolution image containing data about correlations of
91*b2055c35SXin Li    color components.
92*b2055c35SXin Li
93*b2055c35SXin Lidistance mapping
94*b2055c35SXin Li:   Changes LZ77 distances to have the smallest values for pixels in
95*b2055c35SXin Li    two-dimensional proximity.
96*b2055c35SXin Li
97*b2055c35SXin Lientropy image
98*b2055c35SXin Li:   A two-dimensional subresolution image indicating which entropy coding should
99*b2055c35SXin Li    be used in a respective square in the image, that is, each pixel is a meta
100*b2055c35SXin Li    prefix code.
101*b2055c35SXin Li
102*b2055c35SXin LiLZ77
103*b2055c35SXin Li:   A dictionary-based sliding window compression algorithm that either emits
104*b2055c35SXin Li    symbols or describes them as sequences of past symbols.
105*b2055c35SXin Li
106*b2055c35SXin Limeta prefix code
107*b2055c35SXin Li:   A small integer (up to 16 bits) that indexes an element in the meta prefix
108*b2055c35SXin Li    table.
109*b2055c35SXin Li
110*b2055c35SXin Lipredictor image
111*b2055c35SXin Li:   A two-dimensional subresolution image indicating which spatial predictor is
112*b2055c35SXin Li    used for a particular square in the image.
113*b2055c35SXin Li
114*b2055c35SXin Liprefix code
115*b2055c35SXin Li:   A classic way to do entropy coding where a smaller number of bits are used
116*b2055c35SXin Li    for more frequent codes.
117*b2055c35SXin Li
118*b2055c35SXin Liprefix coding
119*b2055c35SXin Li:   A way to entropy code larger integers, which codes a few bits of the integer
120*b2055c35SXin Li    using an entropy code and codifies the remaining bits raw. This allows for
121*b2055c35SXin Li    the descriptions of the entropy codes to remain relatively small even when
122*b2055c35SXin Li    the range of symbols is large.
123*b2055c35SXin Li
124*b2055c35SXin Liscan-line order
125*b2055c35SXin Li:   A processing order of pixels (left to right and top to bottom), starting
126*b2055c35SXin Li    from the left-hand-top pixel. Once a row is completed, continue from the
127*b2055c35SXin Li    left-hand column of the next row.
128*b2055c35SXin Li
129*b2055c35SXin Li3 RIFF Header
130*b2055c35SXin Li-------------
131*b2055c35SXin Li
132*b2055c35SXin LiThe beginning of the header has the RIFF container. This consists of the
133*b2055c35SXin Lifollowing 21 bytes:
134*b2055c35SXin Li
135*b2055c35SXin Li   1. String 'RIFF'.
136*b2055c35SXin Li   2. A little-endian, 32-bit value of the chunk length, which is the whole size
137*b2055c35SXin Li      of the chunk controlled by the RIFF header. Normally, this equals
138*b2055c35SXin Li      the payload size (file size minus 8 bytes: 4 bytes for the 'RIFF'
139*b2055c35SXin Li      identifier and 4 bytes for storing the value itself).
140*b2055c35SXin Li   3. String 'WEBP' (RIFF container name).
141*b2055c35SXin Li   4. String 'VP8L' (FourCC for lossless-encoded image data).
142*b2055c35SXin Li   5. A little-endian, 32-bit value of the number of bytes in the
143*b2055c35SXin Li      lossless stream.
144*b2055c35SXin Li   6. 1-byte signature 0x2f.
145*b2055c35SXin Li
146*b2055c35SXin LiThe first 28 bits of the bitstream specify the width and height of the image.
147*b2055c35SXin LiWidth and height are decoded as 14-bit integers as follows:
148*b2055c35SXin Li
149*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
150*b2055c35SXin Liint image_width = ReadBits(14) + 1;
151*b2055c35SXin Liint image_height = ReadBits(14) + 1;
152*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
153*b2055c35SXin Li
154*b2055c35SXin LiThe 14-bit precision for image width and height limits the maximum size of a
155*b2055c35SXin LiWebP lossless image to 16384✕16384 pixels.
156*b2055c35SXin Li
157*b2055c35SXin LiThe alpha_is_used bit is a hint only, and should not impact decoding. It should
158*b2055c35SXin Libe set to 0 when all alpha values are 255 in the picture, and 1 otherwise.
159*b2055c35SXin Li
160*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
161*b2055c35SXin Liint alpha_is_used = ReadBits(1);
162*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
163*b2055c35SXin Li
164*b2055c35SXin LiThe version_number is a 3 bit code that must be set to 0. Any other value should
165*b2055c35SXin Libe treated as an error.
166*b2055c35SXin Li
167*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
168*b2055c35SXin Liint version_number = ReadBits(3);
169*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
170*b2055c35SXin Li
171*b2055c35SXin Li4 Transforms
172*b2055c35SXin Li------------
173*b2055c35SXin Li
174*b2055c35SXin LiThe transforms are reversible manipulations of the image data that can reduce
175*b2055c35SXin Lithe remaining symbolic entropy by modeling spatial and color correlations. They
176*b2055c35SXin Lican make the final compression more dense.
177*b2055c35SXin Li
178*b2055c35SXin LiAn image can go through four types of transforms. A 1 bit indicates the
179*b2055c35SXin Lipresence of a transform. Each transform is allowed to be used only once. The
180*b2055c35SXin Litransforms are used only for the main-level ARGB image; the subresolution images
181*b2055c35SXin Li(color transform image, entropy image, and predictor image) have no transforms,
182*b2055c35SXin Linot even the 0 bit indicating the end of transforms.
183*b2055c35SXin Li
184*b2055c35SXin LiTypically, an encoder would use these transforms to reduce the Shannon entropy
185*b2055c35SXin Liin the residual image. Also, the transform data can be decided based on entropy
186*b2055c35SXin Liminimization.
187*b2055c35SXin Li
188*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189*b2055c35SXin Liwhile (ReadBits(1)) {  // Transform present.
190*b2055c35SXin Li  // Decode transform type.
191*b2055c35SXin Li  enum TransformType transform_type = ReadBits(2);
192*b2055c35SXin Li  // Decode transform data.
193*b2055c35SXin Li  ...
194*b2055c35SXin Li}
195*b2055c35SXin Li
196*b2055c35SXin Li// Decode actual image data (Section 5).
197*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
198*b2055c35SXin Li
199*b2055c35SXin LiIf a transform is present, then the next two bits specify the transform type.
200*b2055c35SXin LiThere are four types of transforms.
201*b2055c35SXin Li
202*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
203*b2055c35SXin Lienum TransformType {
204*b2055c35SXin Li  PREDICTOR_TRANSFORM             = 0,
205*b2055c35SXin Li  COLOR_TRANSFORM                 = 1,
206*b2055c35SXin Li  SUBTRACT_GREEN_TRANSFORM        = 2,
207*b2055c35SXin Li  COLOR_INDEXING_TRANSFORM        = 3,
208*b2055c35SXin Li};
209*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
210*b2055c35SXin Li
211*b2055c35SXin LiThe transform type is followed by the transform data. Transform data contains
212*b2055c35SXin Lithe information required to apply the inverse transform and depends on the
213*b2055c35SXin Litransform type. The inverse transforms are applied in the reverse order that
214*b2055c35SXin Lithey are read from the bitstream, that is, last one first.
215*b2055c35SXin Li
216*b2055c35SXin LiNext, we describe the transform data for different types.
217*b2055c35SXin Li
218*b2055c35SXin Li### 4.1 Predictor Transform
219*b2055c35SXin Li
220*b2055c35SXin LiThe predictor transform can be used to reduce entropy by exploiting the fact
221*b2055c35SXin Lithat neighboring pixels are often correlated. In the predictor transform, the
222*b2055c35SXin Licurrent pixel value is predicted from the pixels already decoded (in scan-line
223*b2055c35SXin Liorder) and only the residual value (actual - predicted) is encoded. The green
224*b2055c35SXin Licomponent of a pixel defines which of the 14 predictors is used within a
225*b2055c35SXin Liparticular block of the ARGB image. The _prediction mode_ determines the type of
226*b2055c35SXin Liprediction to use. We divide the image into squares, and all the pixels in a
227*b2055c35SXin Lisquare use the same prediction mode.
228*b2055c35SXin Li
229*b2055c35SXin LiThe first 3 bits of prediction data define the block width and height in number
230*b2055c35SXin Liof bits.
231*b2055c35SXin Li
232*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
233*b2055c35SXin Liint size_bits = ReadBits(3) + 2;
234*b2055c35SXin Liint block_width = (1 << size_bits);
235*b2055c35SXin Liint block_height = (1 << size_bits);
236*b2055c35SXin Li#define DIV_ROUND_UP(num, den) (((num) + (den) - 1) / (den))
237*b2055c35SXin Liint transform_width = DIV_ROUND_UP(image_width, 1 << size_bits);
238*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
239*b2055c35SXin Li
240*b2055c35SXin LiThe transform data contains the prediction mode for each block of the image. It
241*b2055c35SXin Liis a subresolution image where the green component of a pixel defines which of
242*b2055c35SXin Lithe 14 predictors is used for all the `block_width * block_height` pixels within
243*b2055c35SXin Lia particular block of the ARGB image. This subresolution image is encoded using
244*b2055c35SXin Lithe same techniques described in [Chapter 5](#image-data).
245*b2055c35SXin Li
246*b2055c35SXin LiThe number of block columns, `transform_width`, is used in two-dimensional
247*b2055c35SXin Liindexing. For a pixel (x, y), one can compute the respective filter block
248*b2055c35SXin Liaddress by:
249*b2055c35SXin Li
250*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
251*b2055c35SXin Liint block_index = (y >> size_bits) * transform_width +
252*b2055c35SXin Li                  (x >> size_bits);
253*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
254*b2055c35SXin Li
255*b2055c35SXin LiThere are 14 different prediction modes. In each prediction mode, the current
256*b2055c35SXin Lipixel value is predicted from one or more neighboring pixels whose values are
257*b2055c35SXin Lialready known.
258*b2055c35SXin Li
259*b2055c35SXin LiWe chose the neighboring pixels (TL, T, TR, and L) of the current pixel (P) as
260*b2055c35SXin Lifollows:
261*b2055c35SXin Li
262*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
263*b2055c35SXin LiO    O    O    O    O    O    O    O    O    O    O
264*b2055c35SXin LiO    O    O    O    O    O    O    O    O    O    O
265*b2055c35SXin LiO    O    O    O    TL   T    TR   O    O    O    O
266*b2055c35SXin LiO    O    O    O    L    P    X    X    X    X    X
267*b2055c35SXin LiX    X    X    X    X    X    X    X    X    X    X
268*b2055c35SXin LiX    X    X    X    X    X    X    X    X    X    X
269*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
270*b2055c35SXin Li
271*b2055c35SXin Liwhere TL means top-left, T means top, TR means top-right, and L means left. At
272*b2055c35SXin Lithe time of predicting a value for P, all O, TL, T, TR and L pixels have already
273*b2055c35SXin Libeen processed, and the P pixel and all X pixels are unknown.
274*b2055c35SXin Li
275*b2055c35SXin LiGiven the preceding neighboring pixels, the different prediction modes are
276*b2055c35SXin Lidefined as follows.
277*b2055c35SXin Li
278*b2055c35SXin Li| Mode   | Predicted value of each channel of the current pixel    |
279*b2055c35SXin Li| ------ | ------------------------------------------------------- |
280*b2055c35SXin Li|  0     | 0xff000000 (represents solid black color in ARGB)       |
281*b2055c35SXin Li|  1     | L                                                       |
282*b2055c35SXin Li|  2     | T                                                       |
283*b2055c35SXin Li|  3     | TR                                                      |
284*b2055c35SXin Li|  4     | TL                                                      |
285*b2055c35SXin Li|  5     | Average2(Average2(L, TR), T)                            |
286*b2055c35SXin Li|  6     | Average2(L, TL)                                         |
287*b2055c35SXin Li|  7     | Average2(L, T)                                          |
288*b2055c35SXin Li|  8     | Average2(TL, T)                                         |
289*b2055c35SXin Li|  9     | Average2(T, TR)                                         |
290*b2055c35SXin Li| 10     | Average2(Average2(L, TL), Average2(T, TR))              |
291*b2055c35SXin Li| 11     | Select(L, T, TL)                                        |
292*b2055c35SXin Li| 12     | ClampAddSubtractFull(L, T, TL)                          |
293*b2055c35SXin Li| 13     | ClampAddSubtractHalf(Average2(L, T), TL)                |
294*b2055c35SXin Li
295*b2055c35SXin Li
296*b2055c35SXin Li`Average2` is defined as follows for each ARGB component:
297*b2055c35SXin Li
298*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
299*b2055c35SXin Liuint8 Average2(uint8 a, uint8 b) {
300*b2055c35SXin Li  return (a + b) / 2;
301*b2055c35SXin Li}
302*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
303*b2055c35SXin Li
304*b2055c35SXin LiThe Select predictor is defined as follows:
305*b2055c35SXin Li
306*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
307*b2055c35SXin Liuint32 Select(uint32 L, uint32 T, uint32 TL) {
308*b2055c35SXin Li  // L = left pixel, T = top pixel, TL = top-left pixel.
309*b2055c35SXin Li
310*b2055c35SXin Li  // ARGB component estimates for prediction.
311*b2055c35SXin Li  int pAlpha = ALPHA(L) + ALPHA(T) - ALPHA(TL);
312*b2055c35SXin Li  int pRed = RED(L) + RED(T) - RED(TL);
313*b2055c35SXin Li  int pGreen = GREEN(L) + GREEN(T) - GREEN(TL);
314*b2055c35SXin Li  int pBlue = BLUE(L) + BLUE(T) - BLUE(TL);
315*b2055c35SXin Li
316*b2055c35SXin Li  // Manhattan distances to estimates for left and top pixels.
317*b2055c35SXin Li  int pL = abs(pAlpha - ALPHA(L)) + abs(pRed - RED(L)) +
318*b2055c35SXin Li           abs(pGreen - GREEN(L)) + abs(pBlue - BLUE(L));
319*b2055c35SXin Li  int pT = abs(pAlpha - ALPHA(T)) + abs(pRed - RED(T)) +
320*b2055c35SXin Li           abs(pGreen - GREEN(T)) + abs(pBlue - BLUE(T));
321*b2055c35SXin Li
322*b2055c35SXin Li  // Return either left or top, the one closer to the prediction.
323*b2055c35SXin Li  if (pL < pT) {
324*b2055c35SXin Li    return L;
325*b2055c35SXin Li  } else {
326*b2055c35SXin Li    return T;
327*b2055c35SXin Li  }
328*b2055c35SXin Li}
329*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
330*b2055c35SXin Li
331*b2055c35SXin LiThe functions `ClampAddSubtractFull` and `ClampAddSubtractHalf` are performed
332*b2055c35SXin Lifor each ARGB component as follows:
333*b2055c35SXin Li
334*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
335*b2055c35SXin Li// Clamp the input value between 0 and 255.
336*b2055c35SXin Liint Clamp(int a) {
337*b2055c35SXin Li  return (a < 0) ? 0 : (a > 255) ? 255 : a;
338*b2055c35SXin Li}
339*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
340*b2055c35SXin Li
341*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
342*b2055c35SXin Liint ClampAddSubtractFull(int a, int b, int c) {
343*b2055c35SXin Li  return Clamp(a + b - c);
344*b2055c35SXin Li}
345*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
346*b2055c35SXin Li
347*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
348*b2055c35SXin Liint ClampAddSubtractHalf(int a, int b) {
349*b2055c35SXin Li  return Clamp(a + (a - b) / 2);
350*b2055c35SXin Li}
351*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
352*b2055c35SXin Li
353*b2055c35SXin LiThere are special handling rules for some border pixels. If there is a
354*b2055c35SXin Liprediction transform, regardless of the mode \[0..13\] for these pixels, the
355*b2055c35SXin Lipredicted value for the left-topmost pixel of the image is 0xff000000, all
356*b2055c35SXin Lipixels on the top row are L-pixel, and all pixels on the leftmost column are
357*b2055c35SXin LiT-pixel.
358*b2055c35SXin Li
359*b2055c35SXin LiAddressing the TR-pixel for pixels on the rightmost column is
360*b2055c35SXin Liexceptional. The pixels on the rightmost column are predicted by using the modes
361*b2055c35SXin Li\[0..13\], just like pixels not on the border, but the leftmost pixel on the
362*b2055c35SXin Lisame row as the current pixel is instead used as the TR-pixel.
363*b2055c35SXin Li
364*b2055c35SXin LiThe final pixel value is obtained by adding each channel of the predicted value
365*b2055c35SXin Lito the encoded residual value.
366*b2055c35SXin Li
367*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
368*b2055c35SXin Livoid PredictorTransformOutput(uint32 residual, uint32 pred,
369*b2055c35SXin Li                              uint8* alpha, uint8* red,
370*b2055c35SXin Li                              uint8* green, uint8* blue) {
371*b2055c35SXin Li  *alpha = ALPHA(residual) + ALPHA(pred);
372*b2055c35SXin Li  *red = RED(residual) + RED(pred);
373*b2055c35SXin Li  *green = GREEN(residual) + GREEN(pred);
374*b2055c35SXin Li  *blue = BLUE(residual) + BLUE(pred);
375*b2055c35SXin Li}
376*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
377*b2055c35SXin Li
378*b2055c35SXin Li### 4.2 Color Transform
379*b2055c35SXin Li
380*b2055c35SXin LiThe goal of the color transform is to decorrelate the R, G, and B values of each
381*b2055c35SXin Lipixel. The color transform keeps the green (G) value as it is, transforms the
382*b2055c35SXin Lired (R) value based on the green value, and transforms the blue (B) value based
383*b2055c35SXin Lion the green value and then on the red value.
384*b2055c35SXin Li
385*b2055c35SXin LiAs is the case for the predictor transform, first the image is divided into
386*b2055c35SXin Liblocks, and the same transform mode is used for all the pixels in a block. For
387*b2055c35SXin Lieach block, there are three types of color transform elements.
388*b2055c35SXin Li
389*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
390*b2055c35SXin Litypedef struct {
391*b2055c35SXin Li  uint8 green_to_red;
392*b2055c35SXin Li  uint8 green_to_blue;
393*b2055c35SXin Li  uint8 red_to_blue;
394*b2055c35SXin Li} ColorTransformElement;
395*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
396*b2055c35SXin Li
397*b2055c35SXin LiThe actual color transform is done by defining a color transform delta. The
398*b2055c35SXin Licolor transform delta depends on the `ColorTransformElement`, which is the same
399*b2055c35SXin Lifor all the pixels in a particular block. The delta is subtracted during the
400*b2055c35SXin Licolor transform. The inverse color transform then is just adding those deltas.
401*b2055c35SXin Li
402*b2055c35SXin LiThe color transform function is defined as follows:
403*b2055c35SXin Li
404*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
405*b2055c35SXin Livoid ColorTransform(uint8 red, uint8 blue, uint8 green,
406*b2055c35SXin Li                    ColorTransformElement *trans,
407*b2055c35SXin Li                    uint8 *new_red, uint8 *new_blue) {
408*b2055c35SXin Li  // Transformed values of red and blue components
409*b2055c35SXin Li  int tmp_red = red;
410*b2055c35SXin Li  int tmp_blue = blue;
411*b2055c35SXin Li
412*b2055c35SXin Li  // Applying the transform is just subtracting the transform deltas
413*b2055c35SXin Li  tmp_red  -= ColorTransformDelta(trans->green_to_red,  green);
414*b2055c35SXin Li  tmp_blue -= ColorTransformDelta(trans->green_to_blue, green);
415*b2055c35SXin Li  tmp_blue -= ColorTransformDelta(trans->red_to_blue, red);
416*b2055c35SXin Li
417*b2055c35SXin Li  *new_red = tmp_red & 0xff;
418*b2055c35SXin Li  *new_blue = tmp_blue & 0xff;
419*b2055c35SXin Li}
420*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
421*b2055c35SXin Li
422*b2055c35SXin Li`ColorTransformDelta` is computed using a signed 8-bit integer representing a
423*b2055c35SXin Li3.5-fixed-point number and a signed 8-bit RGB color channel (c) \[-128..127\]
424*b2055c35SXin Liand is defined as follows:
425*b2055c35SXin Li
426*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
427*b2055c35SXin Liint8 ColorTransformDelta(int8 t, int8 c) {
428*b2055c35SXin Li  return (t * c) >> 5;
429*b2055c35SXin Li}
430*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
431*b2055c35SXin Li
432*b2055c35SXin LiA conversion from the 8-bit unsigned representation (uint8) to the 8-bit signed
433*b2055c35SXin Lione (int8) is required before calling `ColorTransformDelta()`. The signed value
434*b2055c35SXin Lishould be interpreted as an 8-bit two's complement number (that is: uint8 range
435*b2055c35SXin Li\[128..255\] is mapped to the \[-128..-1\] range of its converted int8 value).
436*b2055c35SXin Li
437*b2055c35SXin LiThe multiplication is to be done using more precision (with at least 16-bit
438*b2055c35SXin Liprecision). The sign extension property of the shift operation does not matter
439*b2055c35SXin Lihere; only the lowest 8 bits are used from the result, and there the sign
440*b2055c35SXin Liextension shifting and unsigned shifting are consistent with each other.
441*b2055c35SXin Li
442*b2055c35SXin LiNow, we describe the contents of color transform data so that decoding can apply
443*b2055c35SXin Lithe inverse color transform and recover the original red and blue values. The
444*b2055c35SXin Lifirst 3 bits of the color transform data contain the width and height of the
445*b2055c35SXin Liimage block in number of bits, just like the predictor transform:
446*b2055c35SXin Li
447*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
448*b2055c35SXin Liint size_bits = ReadBits(3) + 2;
449*b2055c35SXin Liint block_width = 1 << size_bits;
450*b2055c35SXin Liint block_height = 1 << size_bits;
451*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
452*b2055c35SXin Li
453*b2055c35SXin LiThe remaining part of the color transform data contains `ColorTransformElement`
454*b2055c35SXin Liinstances, corresponding to each block of the image. Each
455*b2055c35SXin Li`ColorTransformElement` `'cte'` is treated as a pixel in a subresolution image
456*b2055c35SXin Liwhose alpha component is `255`, red component is `cte.red_to_blue`, green
457*b2055c35SXin Licomponent is `cte.green_to_blue`, and blue component is `cte.green_to_red`.
458*b2055c35SXin Li
459*b2055c35SXin LiDuring decoding, `ColorTransformElement` instances of the blocks are decoded and
460*b2055c35SXin Lithe inverse color transform is applied on the ARGB values of the pixels. As
461*b2055c35SXin Limentioned earlier, that inverse color transform is just adding
462*b2055c35SXin Li`ColorTransformElement` values to the red and blue channels. The alpha and green
463*b2055c35SXin Lichannels are left as is.
464*b2055c35SXin Li
465*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
466*b2055c35SXin Livoid InverseTransform(uint8 red, uint8 green, uint8 blue,
467*b2055c35SXin Li                      ColorTransformElement *trans,
468*b2055c35SXin Li                      uint8 *new_red, uint8 *new_blue) {
469*b2055c35SXin Li  // Transformed values of red and blue components
470*b2055c35SXin Li  int tmp_red = red;
471*b2055c35SXin Li  int tmp_blue = blue;
472*b2055c35SXin Li
473*b2055c35SXin Li  // Applying the inverse transform is just adding the
474*b2055c35SXin Li  // color transform deltas
475*b2055c35SXin Li  tmp_red  += ColorTransformDelta(trans->green_to_red, green);
476*b2055c35SXin Li  tmp_blue += ColorTransformDelta(trans->green_to_blue, green);
477*b2055c35SXin Li  tmp_blue +=
478*b2055c35SXin Li      ColorTransformDelta(trans->red_to_blue, tmp_red & 0xff);
479*b2055c35SXin Li
480*b2055c35SXin Li  *new_red = tmp_red & 0xff;
481*b2055c35SXin Li  *new_blue = tmp_blue & 0xff;
482*b2055c35SXin Li}
483*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
484*b2055c35SXin Li
485*b2055c35SXin Li### 4.3 Subtract Green Transform
486*b2055c35SXin Li
487*b2055c35SXin LiThe subtract green transform subtracts green values from red and blue values of
488*b2055c35SXin Lieach pixel. When this transform is present, the decoder needs to add the green
489*b2055c35SXin Livalue to both the red and blue values. There is no data associated with this
490*b2055c35SXin Litransform. The decoder applies the inverse transform as follows:
491*b2055c35SXin Li
492*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
493*b2055c35SXin Livoid AddGreenToBlueAndRed(uint8 green, uint8 *red, uint8 *blue) {
494*b2055c35SXin Li  *red  = (*red  + green) & 0xff;
495*b2055c35SXin Li  *blue = (*blue + green) & 0xff;
496*b2055c35SXin Li}
497*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
498*b2055c35SXin Li
499*b2055c35SXin LiThis transform is redundant, as it can be modeled using the color transform, but
500*b2055c35SXin Lisince there is no additional data here, the subtract green transform can be
501*b2055c35SXin Licoded using fewer bits than a full-blown color transform.
502*b2055c35SXin Li
503*b2055c35SXin Li### 4.4 Color Indexing Transform
504*b2055c35SXin Li
505*b2055c35SXin LiIf there are not many unique pixel values, it may be more efficient to create a
506*b2055c35SXin Licolor index array and replace the pixel values by the array's indices. The color
507*b2055c35SXin Liindexing transform achieves this. (In the context of WebP lossless, we
508*b2055c35SXin Lispecifically do not call this a palette transform because a similar but more
509*b2055c35SXin Lidynamic concept exists in WebP lossless encoding: color cache.)
510*b2055c35SXin Li
511*b2055c35SXin LiThe color indexing transform checks for the number of unique ARGB values in the
512*b2055c35SXin Liimage. If that number is below a threshold (256), it creates an array of those
513*b2055c35SXin LiARGB values, which is then used to replace the pixel values with the
514*b2055c35SXin Licorresponding index: the green channel of the pixels are replaced with the
515*b2055c35SXin Liindex, all alpha values are set to 255, and all red and blue values to 0.
516*b2055c35SXin Li
517*b2055c35SXin LiThe transform data contains the color table size and the entries in the color
518*b2055c35SXin Litable. The decoder reads the color indexing transform data as follows:
519*b2055c35SXin Li
520*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
521*b2055c35SXin Li// 8-bit value for the color table size
522*b2055c35SXin Liint color_table_size = ReadBits(8) + 1;
523*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
524*b2055c35SXin Li
525*b2055c35SXin LiThe color table is stored using the image storage format itself. The color table
526*b2055c35SXin Lican be obtained by reading an image, without the RIFF header, image size, and
527*b2055c35SXin Litransforms, assuming the height of 1 pixel and the width of `color_table_size`.
528*b2055c35SXin LiThe color table is always subtraction-coded to reduce image entropy. The deltas
529*b2055c35SXin Liof palette colors contain typically much less entropy than the colors
530*b2055c35SXin Lithemselves, leading to significant savings for smaller images. In decoding,
531*b2055c35SXin Lievery final color in the color table can be obtained by adding the previous
532*b2055c35SXin Licolor component values by each ARGB component separately and storing the least
533*b2055c35SXin Lisignificant 8 bits of the result.
534*b2055c35SXin Li
535*b2055c35SXin LiThe inverse transform for the image is simply replacing the pixel values (which
536*b2055c35SXin Liare indices to the color table) with the actual color table values. The indexing
537*b2055c35SXin Liis done based on the green component of the ARGB color.
538*b2055c35SXin Li
539*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
540*b2055c35SXin Li// Inverse transform
541*b2055c35SXin Liargb = color_table[GREEN(argb)];
542*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
543*b2055c35SXin Li
544*b2055c35SXin LiIf the index is equal to or larger than `color_table_size`, the argb color value
545*b2055c35SXin Lishould be set to 0x00000000 (transparent black).
546*b2055c35SXin Li
547*b2055c35SXin LiWhen the color table is small (equal to or less than 16 colors), several pixels
548*b2055c35SXin Liare bundled into a single pixel. The pixel bundling packs several (2, 4, or 8)
549*b2055c35SXin Lipixels into a single pixel, reducing the image width respectively. Pixel
550*b2055c35SXin Libundling allows for a more efficient joint distribution entropy coding of
551*b2055c35SXin Lineighboring pixels and gives some arithmetic coding-like benefits to the
552*b2055c35SXin Lientropy code, but it can only be used when there are 16 or fewer unique values.
553*b2055c35SXin Li
554*b2055c35SXin Li`color_table_size` specifies how many pixels are combined:
555*b2055c35SXin Li
556*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
557*b2055c35SXin Liint width_bits;
558*b2055c35SXin Liif (color_table_size <= 2) {
559*b2055c35SXin Li  width_bits = 3;
560*b2055c35SXin Li} else if (color_table_size <= 4) {
561*b2055c35SXin Li  width_bits = 2;
562*b2055c35SXin Li} else if (color_table_size <= 16) {
563*b2055c35SXin Li  width_bits = 1;
564*b2055c35SXin Li} else {
565*b2055c35SXin Li  width_bits = 0;
566*b2055c35SXin Li}
567*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
568*b2055c35SXin Li
569*b2055c35SXin Li`width_bits` has a value of 0, 1, 2, or 3. A value of 0 indicates no pixel
570*b2055c35SXin Libundling is to be done for the image. A value of 1 indicates that two pixels are
571*b2055c35SXin Licombined, and each pixel has a range of \[0..15\]. A value of 2 indicates that
572*b2055c35SXin Lifour pixels are combined, and each pixel has a range of \[0..3\]. A value of 3
573*b2055c35SXin Liindicates that eight pixels are combined and each pixel has a range of \[0..1\],
574*b2055c35SXin Lithat is, a binary value.
575*b2055c35SXin Li
576*b2055c35SXin LiThe values are packed into the green component as follows:
577*b2055c35SXin Li
578*b2055c35SXin Li  * `width_bits` = 1: For every x value, where x0 (mod 2), a green
579*b2055c35SXin Li    value at x is positioned into the 4 least significant bits of the
580*b2055c35SXin Li    green value at x / 2, and a green value at x + 1 is positioned into the
581*b2055c35SXin Li    4 most significant bits of the green value at x / 2.
582*b2055c35SXin Li  * `width_bits` = 2: For every x value, where x0 (mod 4), a green
583*b2055c35SXin Li    value at x is positioned into the 2 least-significant bits of the
584*b2055c35SXin Li    green value at x / 4, and green values at x + 1 to x + 3 are positioned in
585*b2055c35SXin Li    order to the more significant bits of the green value at x / 4.
586*b2055c35SXin Li  * `width_bits` = 3: For every x value, where x0 (mod 8), a green
587*b2055c35SXin Li    value at x is positioned into the least significant bit of the green
588*b2055c35SXin Li    value at x / 8, and green values at x + 1 to x + 7 are positioned in order
589*b2055c35SXin Li    to the more significant bits of the green value at x / 8.
590*b2055c35SXin Li
591*b2055c35SXin LiAfter reading this transform, `image_width` is subsampled by `width_bits`. This
592*b2055c35SXin Liaffects the size of subsequent transforms. The new size can be calculated using
593*b2055c35SXin Li`DIV_ROUND_UP`, as defined [earlier](#predictor-transform).
594*b2055c35SXin Li
595*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
596*b2055c35SXin Liimage_width = DIV_ROUND_UP(image_width, 1 << width_bits);
597*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
598*b2055c35SXin Li
599*b2055c35SXin Li5 Image Data
600*b2055c35SXin Li------------
601*b2055c35SXin Li
602*b2055c35SXin LiImage data is an array of pixel values in scan-line order.
603*b2055c35SXin Li
604*b2055c35SXin Li### 5.1 Roles of Image Data
605*b2055c35SXin Li
606*b2055c35SXin LiWe use image data in five different roles:
607*b2055c35SXin Li
608*b2055c35SXin Li  1. ARGB image: Stores the actual pixels of the image.
609*b2055c35SXin Li  1. Entropy image: Stores the meta prefix codes (see
610*b2055c35SXin Li     ["Decoding of Meta Prefix Codes"](#decoding-of-meta-prefix-codes)).
611*b2055c35SXin Li  1. Predictor image: Stores the metadata for the predictor transform (see
612*b2055c35SXin Li     ["Predictor Transform"](#predictor-transform)).
613*b2055c35SXin Li  1. Color transform image: Created by `ColorTransformElement` values
614*b2055c35SXin Li     (defined in ["Color Transform"](#color-transform)) for different blocks of
615*b2055c35SXin Li     the image.
616*b2055c35SXin Li  1. Color indexing image: An array of size `color_table_size` (up to 256 ARGB
617*b2055c35SXin Li     values) storing the metadata for the color indexing transform (see
618*b2055c35SXin Li     ["Color Indexing Transform"](#color-indexing-transform)).
619*b2055c35SXin Li
620*b2055c35SXin Li### 5.2 Encoding of Image Data
621*b2055c35SXin Li
622*b2055c35SXin LiThe encoding of image data is independent of its role.
623*b2055c35SXin Li
624*b2055c35SXin LiThe image is first divided into a set of fixed-size blocks (typically 16x16
625*b2055c35SXin Liblocks). Each of these blocks are modeled using their own entropy codes. Also,
626*b2055c35SXin Liseveral blocks may share the same entropy codes.
627*b2055c35SXin Li
628*b2055c35SXin Li**Rationale:** Storing an entropy code incurs a cost. This cost can be minimized
629*b2055c35SXin Liif statistically similar blocks share an entropy code, thereby storing that code
630*b2055c35SXin Lionly once. For example, an encoder can find similar blocks by clustering them
631*b2055c35SXin Liusing their statistical properties or by repeatedly joining a pair of randomly
632*b2055c35SXin Liselected clusters when it reduces the overall amount of bits needed to encode
633*b2055c35SXin Lithe image.
634*b2055c35SXin Li
635*b2055c35SXin LiEach pixel is encoded using one of the three possible methods:
636*b2055c35SXin Li
637*b2055c35SXin Li  1. Prefix-coded literals: Each channel (green, red, blue, and alpha) is
638*b2055c35SXin Li     entropy-coded independently.
639*b2055c35SXin Li  2. LZ77 backward reference: A sequence of pixels are copied from elsewhere in
640*b2055c35SXin Li     the image.
641*b2055c35SXin Li  3. Color cache code: Using a short multiplicative hash code (color cache
642*b2055c35SXin Li     index) of a recently seen color.
643*b2055c35SXin Li
644*b2055c35SXin LiThe following subsections describe each of these in detail.
645*b2055c35SXin Li
646*b2055c35SXin Li#### 5.2.1 Prefix-Coded Literals
647*b2055c35SXin Li
648*b2055c35SXin LiThe pixel is stored as prefix-coded values of green, red, blue, and alpha (in
649*b2055c35SXin Lithat order). See [Section 6.2.3](#decoding-entropy-coded-image-data) for
650*b2055c35SXin Lidetails.
651*b2055c35SXin Li
652*b2055c35SXin Li#### 5.2.2 LZ77 Backward Reference
653*b2055c35SXin Li
654*b2055c35SXin LiBackward references are tuples of _length_ and _distance code_:
655*b2055c35SXin Li
656*b2055c35SXin Li  * Length indicates how many pixels in scan-line order are to be copied.
657*b2055c35SXin Li  * Distance code is a number indicating the position of a previously seen
658*b2055c35SXin Li    pixel, from which the pixels are to be copied. The exact mapping is
659*b2055c35SXin Li    described [below](#distance-mapping).
660*b2055c35SXin Li
661*b2055c35SXin LiThe length and distance values are stored using **LZ77 prefix coding**.
662*b2055c35SXin Li
663*b2055c35SXin LiLZ77 prefix coding divides large integer values into two parts: the _prefix
664*b2055c35SXin Licode_ and the _extra bits_. The prefix code is stored using an entropy code,
665*b2055c35SXin Liwhile the extra bits are stored as they are (without an entropy code).
666*b2055c35SXin Li
667*b2055c35SXin Li**Rationale**: This approach reduces the storage requirement for the entropy
668*b2055c35SXin Licode. Also, large values are usually rare, so extra bits would be used for very
669*b2055c35SXin Lifew values in the image. Thus, this approach results in better compression
670*b2055c35SXin Lioverall.
671*b2055c35SXin Li
672*b2055c35SXin LiThe following table denotes the prefix codes and extra bits used for storing
673*b2055c35SXin Lidifferent ranges of values.
674*b2055c35SXin Li
675*b2055c35SXin LiNote: The maximum backward reference length is limited to 4096. Hence, only the
676*b2055c35SXin Lifirst 24 prefix codes (with the respective extra bits) are meaningful for length
677*b2055c35SXin Livalues. For distance values, however, all the 40 prefix codes are valid.
678*b2055c35SXin Li
679*b2055c35SXin Li| Value range     | Prefix code | Extra bits |
680*b2055c35SXin Li| --------------- | ----------- | ---------- |
681*b2055c35SXin Li| 1               | 0           | 0          |
682*b2055c35SXin Li| 2               | 1           | 0          |
683*b2055c35SXin Li| 3               | 2           | 0          |
684*b2055c35SXin Li| 4               | 3           | 0          |
685*b2055c35SXin Li| 5..6            | 4           | 1          |
686*b2055c35SXin Li| 7..8            | 5           | 1          |
687*b2055c35SXin Li| 9..12           | 6           | 2          |
688*b2055c35SXin Li| 13..16          | 7           | 2          |
689*b2055c35SXin Li| ...             | ...         | ...        |
690*b2055c35SXin Li| 3072..4096      | 23          | 10         |
691*b2055c35SXin Li| ...             | ...         | ...        |
692*b2055c35SXin Li| 524289..786432  | 38          | 18         |
693*b2055c35SXin Li| 786433..1048576 | 39          | 18         |
694*b2055c35SXin Li
695*b2055c35SXin LiThe pseudocode to obtain a (length or distance) value from the prefix code is as
696*b2055c35SXin Lifollows:
697*b2055c35SXin Li
698*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
699*b2055c35SXin Liif (prefix_code < 4) {
700*b2055c35SXin Li  return prefix_code + 1;
701*b2055c35SXin Li}
702*b2055c35SXin Liint extra_bits = (prefix_code - 2) >> 1;
703*b2055c35SXin Liint offset = (2 + (prefix_code & 1)) << extra_bits;
704*b2055c35SXin Lireturn offset + ReadBits(extra_bits) + 1;
705*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
706*b2055c35SXin Li
707*b2055c35SXin Li##### Distance Mapping
708*b2055c35SXin Li
709*b2055c35SXin LiAs noted previously, a distance code is a number indicating the position of a
710*b2055c35SXin Lipreviously seen pixel, from which the pixels are to be copied. This subsection
711*b2055c35SXin Lidefines the mapping between a distance code and the position of a previous
712*b2055c35SXin Lipixel.
713*b2055c35SXin Li
714*b2055c35SXin LiDistance codes larger than 120 denote the pixel distance in scan-line order,
715*b2055c35SXin Lioffset by 120.
716*b2055c35SXin Li
717*b2055c35SXin LiThe smallest distance codes \[1..120\] are special and are reserved for a close
718*b2055c35SXin Lineighborhood of the current pixel. This neighborhood consists of 120 pixels:
719*b2055c35SXin Li
720*b2055c35SXin Li  * Pixels that are 1 to 7 rows above the current pixel and are up to 8 columns
721*b2055c35SXin Li    to the left or up to 7 columns to the right of the current pixel. \[Total
722*b2055c35SXin Li    such pixels = `7 * (8 + 1 + 7) = 112`\].
723*b2055c35SXin Li  * Pixels that are in the same row as the current pixel and are up to 8
724*b2055c35SXin Li    columns to the left of the current pixel. \[`8` such pixels\].
725*b2055c35SXin Li
726*b2055c35SXin LiThe mapping between distance code `distance_code` and the neighboring pixel
727*b2055c35SXin Lioffset `(xi, yi)` is as follows:
728*b2055c35SXin Li
729*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
730*b2055c35SXin Li(0, 1),  (1, 0),  (1, 1),  (-1, 1), (0, 2),  (2, 0),  (1, 2),
731*b2055c35SXin Li(-1, 2), (2, 1),  (-2, 1), (2, 2),  (-2, 2), (0, 3),  (3, 0),
732*b2055c35SXin Li(1, 3),  (-1, 3), (3, 1),  (-3, 1), (2, 3),  (-2, 3), (3, 2),
733*b2055c35SXin Li(-3, 2), (0, 4),  (4, 0),  (1, 4),  (-1, 4), (4, 1),  (-4, 1),
734*b2055c35SXin Li(3, 3),  (-3, 3), (2, 4),  (-2, 4), (4, 2),  (-4, 2), (0, 5),
735*b2055c35SXin Li(3, 4),  (-3, 4), (4, 3),  (-4, 3), (5, 0),  (1, 5),  (-1, 5),
736*b2055c35SXin Li(5, 1),  (-5, 1), (2, 5),  (-2, 5), (5, 2),  (-5, 2), (4, 4),
737*b2055c35SXin Li(-4, 4), (3, 5),  (-3, 5), (5, 3),  (-5, 3), (0, 6),  (6, 0),
738*b2055c35SXin Li(1, 6),  (-1, 6), (6, 1),  (-6, 1), (2, 6),  (-2, 6), (6, 2),
739*b2055c35SXin Li(-6, 2), (4, 5),  (-4, 5), (5, 4),  (-5, 4), (3, 6),  (-3, 6),
740*b2055c35SXin Li(6, 3),  (-6, 3), (0, 7),  (7, 0),  (1, 7),  (-1, 7), (5, 5),
741*b2055c35SXin Li(-5, 5), (7, 1),  (-7, 1), (4, 6),  (-4, 6), (6, 4),  (-6, 4),
742*b2055c35SXin Li(2, 7),  (-2, 7), (7, 2),  (-7, 2), (3, 7),  (-3, 7), (7, 3),
743*b2055c35SXin Li(-7, 3), (5, 6),  (-5, 6), (6, 5),  (-6, 5), (8, 0),  (4, 7),
744*b2055c35SXin Li(-4, 7), (7, 4),  (-7, 4), (8, 1),  (8, 2),  (6, 6),  (-6, 6),
745*b2055c35SXin Li(8, 3),  (5, 7),  (-5, 7), (7, 5),  (-7, 5), (8, 4),  (6, 7),
746*b2055c35SXin Li(-6, 7), (7, 6),  (-7, 6), (8, 5),  (7, 7),  (-7, 7), (8, 6),
747*b2055c35SXin Li(8, 7)
748*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
749*b2055c35SXin Li
750*b2055c35SXin LiFor example, the distance code `1` indicates an offset of `(0, 1)` for the
751*b2055c35SXin Lineighboring pixel, that is, the pixel above the current pixel (0 pixel
752*b2055c35SXin Lidifference in the X direction and 1 pixel difference in the Y direction).
753*b2055c35SXin LiSimilarly, the distance code `3` indicates the top-left pixel.
754*b2055c35SXin Li
755*b2055c35SXin LiThe decoder can convert a distance code `distance_code` to a scan-line order
756*b2055c35SXin Lidistance `dist` as follows:
757*b2055c35SXin Li
758*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
759*b2055c35SXin Li(xi, yi) = distance_map[distance_code - 1]
760*b2055c35SXin Lidist = xi + yi * image_width
761*b2055c35SXin Liif (dist < 1) {
762*b2055c35SXin Li  dist = 1
763*b2055c35SXin Li}
764*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
765*b2055c35SXin Li
766*b2055c35SXin Liwhere `distance_map` is the mapping noted above, and `image_width` is the width
767*b2055c35SXin Liof the image in pixels.
768*b2055c35SXin Li
769*b2055c35SXin Li#### 5.2.3 Color Cache Coding
770*b2055c35SXin Li{:#color-cache-code}
771*b2055c35SXin Li
772*b2055c35SXin LiColor cache stores a set of colors that have been recently used in the image.
773*b2055c35SXin Li
774*b2055c35SXin Li**Rationale:** This way, the recently used colors can sometimes be referred to
775*b2055c35SXin Limore efficiently than emitting them using the other two methods (described in
776*b2055c35SXin LiSections [5.2.1](#prefix-coded-literals) and [5.2.2](#lz77-backward-reference)).
777*b2055c35SXin Li
778*b2055c35SXin LiColor cache codes are stored as follows. First, there is a 1-bit value that
779*b2055c35SXin Liindicates if the color cache is used. If this bit is 0, no color cache codes
780*b2055c35SXin Liexist, and they are not transmitted in the prefix code that decodes the green
781*b2055c35SXin Lisymbols and the length prefix codes. However, if this bit is 1, the color cache
782*b2055c35SXin Lisize is read next:
783*b2055c35SXin Li
784*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
785*b2055c35SXin Liint color_cache_code_bits = ReadBits(4);
786*b2055c35SXin Liint color_cache_size = 1 << color_cache_code_bits;
787*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
788*b2055c35SXin Li
789*b2055c35SXin Li`color_cache_code_bits` defines the size of the color cache (`1 <<
790*b2055c35SXin Licolor_cache_code_bits`). The range of allowed values for
791*b2055c35SXin Li`color_cache_code_bits` is \[1..11\]. Compliant decoders must indicate a
792*b2055c35SXin Licorrupted bitstream for other values.
793*b2055c35SXin Li
794*b2055c35SXin LiA color cache is an array of size `color_cache_size`. Each entry stores one ARGB
795*b2055c35SXin Licolor. Colors are looked up by indexing them by `(0x1e35a7bd * color) >> (32 -
796*b2055c35SXin Licolor_cache_code_bits)`. Only one lookup is done in a color cache; there is no
797*b2055c35SXin Liconflict resolution.
798*b2055c35SXin Li
799*b2055c35SXin LiIn the beginning of decoding or encoding of an image, all entries in all color
800*b2055c35SXin Licache values are set to zero. The color cache code is converted to this color at
801*b2055c35SXin Lidecoding time. The state of the color cache is maintained by inserting every
802*b2055c35SXin Lipixel, be it produced by backward referencing or as literals, into the cache in
803*b2055c35SXin Lithe order they appear in the stream.
804*b2055c35SXin Li
805*b2055c35SXin Li6 Entropy Code
806*b2055c35SXin Li--------------
807*b2055c35SXin Li
808*b2055c35SXin Li### 6.1 Overview
809*b2055c35SXin Li
810*b2055c35SXin LiMost of the data is coded using a [canonical prefix code][canonical_huff].
811*b2055c35SXin LiHence, the codes are transmitted by sending the _prefix code lengths_, as
812*b2055c35SXin Liopposed to the actual _prefix codes_.
813*b2055c35SXin Li
814*b2055c35SXin LiIn particular, the format uses **spatially variant prefix coding**. In other
815*b2055c35SXin Liwords, different blocks of the image can potentially use different entropy
816*b2055c35SXin Licodes.
817*b2055c35SXin Li
818*b2055c35SXin Li**Rationale**: Different areas of the image may have different characteristics.
819*b2055c35SXin LiSo, allowing them to use different entropy codes provides more flexibility and
820*b2055c35SXin Lipotentially better compression.
821*b2055c35SXin Li
822*b2055c35SXin Li### 6.2 Details
823*b2055c35SXin Li
824*b2055c35SXin LiThe encoded image data consists of several parts:
825*b2055c35SXin Li
826*b2055c35SXin Li  1. Decoding and building the prefix codes.
827*b2055c35SXin Li  1. Meta prefix codes.
828*b2055c35SXin Li  1. Entropy-coded image data.
829*b2055c35SXin Li
830*b2055c35SXin LiFor any given pixel (x, y), there is a set of five prefix codes associated with
831*b2055c35SXin Liit. These codes are (in bitstream order):
832*b2055c35SXin Li
833*b2055c35SXin Li  * **Prefix code #1**: Used for green channel, backward-reference length, and
834*b2055c35SXin Li    color cache.
835*b2055c35SXin Li  * **Prefix code #2, #3, and #4**: Used for red, blue, and alpha channels,
836*b2055c35SXin Li    respectively.
837*b2055c35SXin Li  * **Prefix code #5**: Used for backward-reference distance.
838*b2055c35SXin Li
839*b2055c35SXin LiFrom here on, we refer to this set as a **prefix code group**.
840*b2055c35SXin Li
841*b2055c35SXin Li#### 6.2.1 Decoding and Building the Prefix Codes
842*b2055c35SXin Li
843*b2055c35SXin LiThis section describes how to read the prefix code lengths from the bitstream.
844*b2055c35SXin Li
845*b2055c35SXin LiThe prefix code lengths can be coded in two ways. The method used is specified
846*b2055c35SXin Liby a 1-bit value.
847*b2055c35SXin Li
848*b2055c35SXin Li  * If this bit is 1, it is a _simple code length code_.
849*b2055c35SXin Li  * If this bit is 0, it is a _normal code length code_.
850*b2055c35SXin Li
851*b2055c35SXin LiIn both cases, there can be unused code lengths that are still part of the
852*b2055c35SXin Listream. This may be inefficient, but it is allowed by the format.
853*b2055c35SXin LiThe described tree must be a complete binary tree. A single leaf node is
854*b2055c35SXin Liconsidered a complete binary tree and can be encoded using either the simple
855*b2055c35SXin Licode length code or the normal code length code. When coding a single leaf
856*b2055c35SXin Linode using the _normal code length code_, all but one code length are zeros,
857*b2055c35SXin Liand the single leaf node value is marked with the length of 1 -- even when no
858*b2055c35SXin Libits are consumed when that single leaf node tree is used.
859*b2055c35SXin Li
860*b2055c35SXin Li##### Simple Code Length Code
861*b2055c35SXin Li
862*b2055c35SXin LiThis variant is used in the special case when only 1 or 2 prefix symbols are in
863*b2055c35SXin Lithe range \[0..255\] with code length `1`. All other prefix code lengths are
864*b2055c35SXin Liimplicitly zeros.
865*b2055c35SXin Li
866*b2055c35SXin LiThe first bit indicates the number of symbols:
867*b2055c35SXin Li
868*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
869*b2055c35SXin Liint num_symbols = ReadBits(1) + 1;
870*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
871*b2055c35SXin Li
872*b2055c35SXin LiThe following are the symbol values.
873*b2055c35SXin Li
874*b2055c35SXin LiThis first symbol is coded using 1 or 8 bits, depending on the value of
875*b2055c35SXin Li`is_first_8bits`. The range is \[0..1\] or \[0..255\], respectively. The second
876*b2055c35SXin Lisymbol, if present, is always assumed to be in the range \[0..255\] and coded
877*b2055c35SXin Liusing 8 bits.
878*b2055c35SXin Li
879*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
880*b2055c35SXin Liint is_first_8bits = ReadBits(1);
881*b2055c35SXin Lisymbol0 = ReadBits(1 + 7 * is_first_8bits);
882*b2055c35SXin Licode_lengths[symbol0] = 1;
883*b2055c35SXin Liif (num_symbols == 2) {
884*b2055c35SXin Li  symbol1 = ReadBits(8);
885*b2055c35SXin Li  code_lengths[symbol1] = 1;
886*b2055c35SXin Li}
887*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
888*b2055c35SXin Li
889*b2055c35SXin LiThe two symbols should be different. Duplicate symbols are allowed, but
890*b2055c35SXin Liinefficient.
891*b2055c35SXin Li
892*b2055c35SXin Li**Note:** Another special case is when _all_ prefix code lengths are _zeros_ (an
893*b2055c35SXin Liempty prefix code). For example, a prefix code for distance can be empty if
894*b2055c35SXin Lithere are no backward references. Similarly, prefix codes for alpha, red, and
895*b2055c35SXin Liblue can be empty if all pixels within the same meta prefix code are produced
896*b2055c35SXin Liusing the color cache. However, this case doesn't need special handling, as
897*b2055c35SXin Liempty prefix codes can be coded as those containing a single symbol `0`.
898*b2055c35SXin Li
899*b2055c35SXin Li##### Normal Code Length Code
900*b2055c35SXin Li
901*b2055c35SXin LiThe code lengths of the prefix code fit in 8 bits and are read as follows.
902*b2055c35SXin LiFirst, `num_code_lengths` specifies the number of code lengths.
903*b2055c35SXin Li
904*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
905*b2055c35SXin Liint num_code_lengths = 4 + ReadBits(4);
906*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
907*b2055c35SXin Li
908*b2055c35SXin LiThe code lengths are themselves encoded using prefix codes; lower-level code
909*b2055c35SXin Lilengths, `code_length_code_lengths`, first have to be read. The rest of those
910*b2055c35SXin Li`code_length_code_lengths` (according to the order in `kCodeLengthCodeOrder`)
911*b2055c35SXin Liare zeros.
912*b2055c35SXin Li
913*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
914*b2055c35SXin Liint kCodeLengthCodes = 19;
915*b2055c35SXin Liint kCodeLengthCodeOrder[kCodeLengthCodes] = {
916*b2055c35SXin Li  17, 18, 0, 1, 2, 3, 4, 5, 16, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
917*b2055c35SXin Li};
918*b2055c35SXin Liint code_length_code_lengths[kCodeLengthCodes] = { 0 };  // All zeros
919*b2055c35SXin Lifor (i = 0; i < num_code_lengths; ++i) {
920*b2055c35SXin Li  code_length_code_lengths[kCodeLengthCodeOrder[i]] = ReadBits(3);
921*b2055c35SXin Li}
922*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
923*b2055c35SXin Li
924*b2055c35SXin LiNext, if `ReadBits(1) == 0`, the maximum number of different read symbols
925*b2055c35SXin Li(`max_symbol`) for each symbol type (A, R, G, B, and distance) is set to its
926*b2055c35SXin Lialphabet size:
927*b2055c35SXin Li
928*b2055c35SXin Li  * G channel: 256 + 24 + `color_cache_size`
929*b2055c35SXin Li  * Other literals (A, R, and B): 256
930*b2055c35SXin Li  * Distance code: 40
931*b2055c35SXin Li
932*b2055c35SXin LiOtherwise, it is defined as:
933*b2055c35SXin Li
934*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
935*b2055c35SXin Liint length_nbits = 2 + 2 * ReadBits(3);
936*b2055c35SXin Liint max_symbol = 2 + ReadBits(length_nbits);
937*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
938*b2055c35SXin Li
939*b2055c35SXin LiIf `max_symbol` is larger than the size of the alphabet for the symbol type, the
940*b2055c35SXin Libitstream is invalid.
941*b2055c35SXin Li
942*b2055c35SXin LiA prefix table is then built from `code_length_code_lengths` and used to read up
943*b2055c35SXin Lito `max_symbol` code lengths.
944*b2055c35SXin Li
945*b2055c35SXin Li  * Code \[0..15\] indicates literal code lengths.
946*b2055c35SXin Li    * Value 0 means no symbols have been coded.
947*b2055c35SXin Li    * Values \[1..15\] indicate the bit length of the respective code.
948*b2055c35SXin Li  * Code 16 repeats the previous nonzero value \[3..6\] times, that is,
949*b2055c35SXin Li    `3 + ReadBits(2)` times. If code 16 is used before a nonzero
950*b2055c35SXin Li    value has been emitted, a value of 8 is repeated.
951*b2055c35SXin Li  * Code 17 emits a streak of zeros of length \[3..10\], that is, `3 +
952*b2055c35SXin Li    ReadBits(3)` times.
953*b2055c35SXin Li  * Code 18 emits a streak of zeros of length \[11..138\], that is,
954*b2055c35SXin Li    `11 + ReadBits(7)` times.
955*b2055c35SXin Li
956*b2055c35SXin LiOnce code lengths are read, a prefix code for each symbol type (A, R, G, B, and
957*b2055c35SXin Lidistance) is formed using their respective alphabet sizes.
958*b2055c35SXin Li
959*b2055c35SXin LiThe Normal Code Length Code must code a full decision tree, that is, the sum of
960*b2055c35SXin Li`2 ^ (-length)` for all non-zero codes must be exactly one. There is however
961*b2055c35SXin Lione exception to this rule, the single leaf node tree, where the leaf node
962*b2055c35SXin Livalue is marked with value 1 and other values are 0s.
963*b2055c35SXin Li
964*b2055c35SXin Li#### 6.2.2 Decoding of Meta Prefix Codes
965*b2055c35SXin Li
966*b2055c35SXin LiAs noted earlier, the format allows the use of different prefix codes for
967*b2055c35SXin Lidifferent blocks of the image. _Meta prefix codes_ are indexes identifying which
968*b2055c35SXin Liprefix codes to use in different parts of the image.
969*b2055c35SXin Li
970*b2055c35SXin LiMeta prefix codes may be used _only_ when the image is being used in the
971*b2055c35SXin Li[role](#roles-of-image-data) of an _ARGB image_.
972*b2055c35SXin Li
973*b2055c35SXin LiThere are two possibilities for the meta prefix codes, indicated by a 1-bit
974*b2055c35SXin Livalue:
975*b2055c35SXin Li
976*b2055c35SXin Li  * If this bit is zero, there is only one meta prefix code used everywhere in
977*b2055c35SXin Li    the image. No more data is stored.
978*b2055c35SXin Li  * If this bit is one, the image uses multiple meta prefix codes. These meta
979*b2055c35SXin Li    prefix codes are stored as an _entropy image_ (described below).
980*b2055c35SXin Li
981*b2055c35SXin LiThe red and green components of a pixel define a 16-bit meta prefix code used in
982*b2055c35SXin Lia particular block of the ARGB image.
983*b2055c35SXin Li
984*b2055c35SXin Li##### Entropy Image
985*b2055c35SXin Li
986*b2055c35SXin LiThe entropy image defines which prefix codes are used in different parts of the
987*b2055c35SXin Liimage.
988*b2055c35SXin Li
989*b2055c35SXin LiThe first 3 bits contain the `prefix_bits` value. The dimensions of the entropy
990*b2055c35SXin Liimage are derived from `prefix_bits`:
991*b2055c35SXin Li
992*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
993*b2055c35SXin Liint prefix_bits = ReadBits(3) + 2;
994*b2055c35SXin Liint prefix_image_width =
995*b2055c35SXin Li    DIV_ROUND_UP(image_width, 1 << prefix_bits);
996*b2055c35SXin Liint prefix_image_height =
997*b2055c35SXin Li    DIV_ROUND_UP(image_height, 1 << prefix_bits);
998*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
999*b2055c35SXin Li
1000*b2055c35SXin Liwhere `DIV_ROUND_UP` is as defined [earlier](#predictor-transform).
1001*b2055c35SXin Li
1002*b2055c35SXin LiThe next bits contain an entropy image of width `prefix_image_width` and height
1003*b2055c35SXin Li`prefix_image_height`.
1004*b2055c35SXin Li
1005*b2055c35SXin Li##### Interpretation of Meta Prefix Codes
1006*b2055c35SXin Li
1007*b2055c35SXin LiThe number of prefix code groups in the ARGB image can be obtained by finding
1008*b2055c35SXin Lithe _largest meta prefix code_ from the entropy image:
1009*b2055c35SXin Li
1010*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1011*b2055c35SXin Liint num_prefix_groups = max(entropy image) + 1;
1012*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1013*b2055c35SXin Liwhere `max(entropy image)` indicates the largest prefix code stored in the
1014*b2055c35SXin Lientropy image.
1015*b2055c35SXin Li
1016*b2055c35SXin LiAs each prefix code group contains five prefix codes, the total number of prefix
1017*b2055c35SXin Licodes is:
1018*b2055c35SXin Li
1019*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1020*b2055c35SXin Liint num_prefix_codes = 5 * num_prefix_groups;
1021*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1022*b2055c35SXin Li
1023*b2055c35SXin LiGiven a pixel (x, y) in the ARGB image, we can obtain the corresponding prefix
1024*b2055c35SXin Licodes to be used as follows:
1025*b2055c35SXin Li
1026*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1027*b2055c35SXin Liint position =
1028*b2055c35SXin Li    (y >> prefix_bits) * prefix_image_width + (x >> prefix_bits);
1029*b2055c35SXin Liint meta_prefix_code = (entropy_image[position] >> 8) & 0xffff;
1030*b2055c35SXin LiPrefixCodeGroup prefix_group = prefix_code_groups[meta_prefix_code];
1031*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1032*b2055c35SXin Li
1033*b2055c35SXin Liwhere we have assumed the existence of `PrefixCodeGroup` structure, which
1034*b2055c35SXin Lirepresents a set of five prefix codes. Also, `prefix_code_groups` is an array of
1035*b2055c35SXin Li`PrefixCodeGroup` (of size `num_prefix_groups`).
1036*b2055c35SXin Li
1037*b2055c35SXin LiThe decoder then uses prefix code group `prefix_group` to decode the pixel
1038*b2055c35SXin Li(x, y), as explained in ["Decoding Entropy-Coded Image
1039*b2055c35SXin LiData"](#decoding-entropy-coded-image-data).
1040*b2055c35SXin Li
1041*b2055c35SXin Li#### 6.2.3 Decoding Entropy-Coded Image Data
1042*b2055c35SXin Li
1043*b2055c35SXin LiFor the current position (x, y) in the image, the decoder first identifies the
1044*b2055c35SXin Licorresponding prefix code group (as explained in the last section). Given the
1045*b2055c35SXin Liprefix code group, the pixel is read and decoded as follows.
1046*b2055c35SXin Li
1047*b2055c35SXin LiNext, read the symbol S from the bitstream using prefix code #1. Note that S is
1048*b2055c35SXin Liany integer in the range `0` to
1049*b2055c35SXin Li`(256 + 24 + ` [`color_cache_size`](#color-cache-code)` - 1)`.
1050*b2055c35SXin Li
1051*b2055c35SXin LiThe interpretation of S depends on its value:
1052*b2055c35SXin Li
1053*b2055c35SXin Li  1. If S < 256
1054*b2055c35SXin Li     1. Use S as the green component.
1055*b2055c35SXin Li     1. Read red from the bitstream using prefix code #2.
1056*b2055c35SXin Li     1. Read blue from the bitstream using prefix code #3.
1057*b2055c35SXin Li     1. Read alpha from the bitstream using prefix code #4.
1058*b2055c35SXin Li  1. If S >= 256 & S < 256 + 24
1059*b2055c35SXin Li     1. Use S - 256 as a length prefix code.
1060*b2055c35SXin Li     1. Read extra bits for the length from the bitstream.
1061*b2055c35SXin Li     1. Determine backward-reference length L from length prefix code and the
1062*b2055c35SXin Li        extra bits read.
1063*b2055c35SXin Li     1. Read the distance prefix code from the bitstream using prefix code #5.
1064*b2055c35SXin Li     1. Read extra bits for the distance from the bitstream.
1065*b2055c35SXin Li     1. Determine backward-reference distance D from the distance prefix code
1066*b2055c35SXin Li        and the extra bits read.
1067*b2055c35SXin Li     1. Copy L pixels (in scan-line order) from the sequence of pixels starting
1068*b2055c35SXin Li        at the current position minus D pixels.
1069*b2055c35SXin Li  1. If S >= 256 + 24
1070*b2055c35SXin Li     1. Use S - (256 + 24) as the index into the color cache.
1071*b2055c35SXin Li     1. Get ARGB color from the color cache at that index.
1072*b2055c35SXin Li
1073*b2055c35SXin Li7 Overall Structure of the Format
1074*b2055c35SXin Li---------------------------------
1075*b2055c35SXin Li
1076*b2055c35SXin LiBelow is a view into the format in Augmented Backus-Naur Form (ABNF)
1077*b2055c35SXin Li[RFC 5234][] [RFC 7405][]. It does not cover all details. The end-of-image (EOI)
1078*b2055c35SXin Liis only implicitly coded into the number of pixels (image_width * image_height).
1079*b2055c35SXin Li
1080*b2055c35SXin LiNote that `*element` means `element` can be repeated 0 or more times. `5element`
1081*b2055c35SXin Limeans `element` is repeated exactly 5 times. `%b` represents a binary value.
1082*b2055c35SXin Li
1083*b2055c35SXin Li#### 7.1 Basic Structure
1084*b2055c35SXin Li
1085*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1086*b2055c35SXin Liformat        = RIFF-header image-header image-stream
1087*b2055c35SXin LiRIFF-header   = %s"RIFF" 4OCTET %s"WEBPVP8L" 4OCTET
1088*b2055c35SXin Liimage-header  = %x2F image-size alpha-is-used version
1089*b2055c35SXin Liimage-size    = 14BIT 14BIT ; width - 1, height - 1
1090*b2055c35SXin Lialpha-is-used = 1BIT
1091*b2055c35SXin Liversion       = 3BIT ; 0
1092*b2055c35SXin Liimage-stream  = optional-transform spatially-coded-image
1093*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1094*b2055c35SXin Li
1095*b2055c35SXin Li#### 7.2 Structure of Transforms
1096*b2055c35SXin Li
1097*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1098*b2055c35SXin Lioptional-transform   =  (%b1 transform optional-transform) / %b0
1099*b2055c35SXin Litransform            =  predictor-tx / color-tx / subtract-green-tx
1100*b2055c35SXin Litransform            =/ color-indexing-tx
1101*b2055c35SXin Li
1102*b2055c35SXin Lipredictor-tx         =  %b00 predictor-image
1103*b2055c35SXin Lipredictor-image      =  3BIT ; sub-pixel code
1104*b2055c35SXin Li                        entropy-coded-image
1105*b2055c35SXin Li
1106*b2055c35SXin Licolor-tx             =  %b01 color-image
1107*b2055c35SXin Licolor-image          =  3BIT ; sub-pixel code
1108*b2055c35SXin Li                        entropy-coded-image
1109*b2055c35SXin Li
1110*b2055c35SXin Lisubtract-green-tx    =  %b10
1111*b2055c35SXin Li
1112*b2055c35SXin Licolor-indexing-tx    =  %b11 color-indexing-image
1113*b2055c35SXin Licolor-indexing-image =  8BIT ; color count
1114*b2055c35SXin Li                        entropy-coded-image
1115*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1116*b2055c35SXin Li
1117*b2055c35SXin Li#### 7.3 Structure of the Image Data
1118*b2055c35SXin Li
1119*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1120*b2055c35SXin Lispatially-coded-image =  color-cache-info meta-prefix data
1121*b2055c35SXin Lientropy-coded-image   =  color-cache-info data
1122*b2055c35SXin Li
1123*b2055c35SXin Licolor-cache-info      =  %b0
1124*b2055c35SXin Licolor-cache-info      =/ (%b1 4BIT) ; 1 followed by color cache size
1125*b2055c35SXin Li
1126*b2055c35SXin Limeta-prefix           =  %b0 / (%b1 entropy-image)
1127*b2055c35SXin Li
1128*b2055c35SXin Lidata                  =  prefix-codes lz77-coded-image
1129*b2055c35SXin Lientropy-image         =  3BIT ; subsample value
1130*b2055c35SXin Li                         entropy-coded-image
1131*b2055c35SXin Li
1132*b2055c35SXin Liprefix-codes          =  prefix-code-group *prefix-codes
1133*b2055c35SXin Liprefix-code-group     =
1134*b2055c35SXin Li    5prefix-code ; See "Interpretation of Meta Prefix Codes" to
1135*b2055c35SXin Li                 ; understand what each of these five prefix
1136*b2055c35SXin Li                 ; codes are for.
1137*b2055c35SXin Li
1138*b2055c35SXin Liprefix-code           =  simple-prefix-code / normal-prefix-code
1139*b2055c35SXin Lisimple-prefix-code    =  ; see "Simple Code Length Code" for details
1140*b2055c35SXin Linormal-prefix-code    =  ; see "Normal Code Length Code" for details
1141*b2055c35SXin Li
1142*b2055c35SXin Lilz77-coded-image      =
1143*b2055c35SXin Li    *((argb-pixel / lz77-copy / color-cache-code) lz77-coded-image)
1144*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1145*b2055c35SXin Li
1146*b2055c35SXin LiThe following is a possible example sequence:
1147*b2055c35SXin Li
1148*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1149*b2055c35SXin LiRIFF-header image-size %b1 subtract-green-tx
1150*b2055c35SXin Li%b1 predictor-tx %b0 color-cache-info
1151*b2055c35SXin Li%b0 prefix-codes lz77-coded-image
1152*b2055c35SXin Li~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1153*b2055c35SXin Li
1154*b2055c35SXin Li[RFC 5234]: https://www.rfc-editor.org/rfc/rfc5234
1155*b2055c35SXin Li[RFC 7405]: https://www.rfc-editor.org/rfc/rfc7405
1156*b2055c35SXin Li[canonical_huff]: https://en.wikipedia.org/wiki/Canonical_Huffman_code
1157