Single-sampled Color Compression
================================

Starting with Ivy Bridge, Intel graphics hardware provides a form of color
compression for single-sampled surfaces.  In its initial form, this provided an
acceleration of render target clear operations that, in the common case, allows
you to avoid almost all of the bandwidth of a full-surface clear operation.  On
Sky Lake, single-sampled color compression was extended to allow for the
compression of color values from actual rendering and not just the initial
clear.  From here on, the older Ivy Bridge form of color compression will be
called "fast-clears" and the term "color compression" will be reserved for the
more powerful Sky Lake form.

The documentation for Ivy Bridge through Broadwell overloads the term MCS to
refer both to the *multisample control surface* used for multisample
compression and to the control surface used for fast-clears. In ISL, the
:c:enumerator:`isl_aux_usage.ISL_AUX_USAGE_MCS` enum always refers to
multisample color compression while the
:c:enumerator:`isl_aux_usage.ISL_AUX_USAGE_CCS_D` and
:c:enumerator:`isl_aux_usage.ISL_AUX_USAGE_CCS_E` enums always refer to
single-sampled color compression. Throughout this chapter and the rest of the
ISL documentation, we will use the term "color control surface", abbreviated
CCS, to denote the control surface used for both fast-clears and color
compression.  While this is still an overloaded term, Ivy Bridge fast-clears
are much closer to Sky Lake color compression than they are to multisample
compression.

CCS data
--------

Fast clears and CCS are possibly the single most poorly documented aspect of
surface layout/setup for Intel graphics hardware (with HiZ coming in a close
second). All the documentation really says is that you can use an MCS buffer on
single-sampled surfaces (we will call it the CCS in this case). It also
provides some documentation on how to program the hardware to perform clear
operations, but that's it.  How big is this buffer?  What does it contain?
Those questions are left as exercises to the reader. Almost everything we know
about the contents of the CCS is gleaned from reverse-engineering of the
hardware.  The best bit of documentation we have ever had comes from the
display section of the Sky Lake PRM Vol 12 section on planes (p. 159):

   The Color Control Surface (CCS) contains the compression status of the
   cache-line pairs. The compression state of the cache-line pair is
   specified by 2 bits in the CCS.  Each CCS cache-line represents an area
   on the main surface of 16x16 sets of 128 byte Y-tiled cache-line-pairs.
   CCS is always Y tiled.

While this is technically for color compression and not fast-clears, it
provides a good bit of insight into how color compression and fast-clears
operate.  Each cache-line pair in the main surface corresponds to 1 or 2 bits
in the CCS.  The primary difference, as far as the current discussion is
concerned, is that fast-clears use only 1 bit per cache-line pair whereas color
compression uses 2 bits.

What is a cache-line pair?  Both the X and Y tiling formats are arranged as an
8x8 grid of cache lines.  (See the :doc:`chapter on tiling <tiling>` for more
details.)  In either case, a cache-line pair is a pair of cache lines whose
starting addresses differ by 512 bytes or 8 cache lines.  This results in the
two cache lines being vertically adjacent when the main surface is X-tiled and
horizontally adjacent when the main surface is Y-tiled.  For an X-tiled surface
this forms an area of 64B x 2 rows and for a Y-tiled surface this forms an area
of 32B x 4 rows.  In either case, it is guaranteed that, regardless of surface
format, each 2x2 subspan coming out of a shader will land entirely within one
cache-line pair.
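The geometry above can be sketched in a few lines of C.  This is only an
illustration and not ISL code: the names ``cache_line_pair_partner`` and
``cache_line_pair_extent`` are hypothetical, and the first helper assumes that
the two halves of a pair are found by toggling address bit 9 within a 4KB
tile, which is one way to read "starting addresses differ by 512 bytes".

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Distance between the two cache lines of a pair: 8 cache lines of 64B. */
   #define PAIR_STRIDE_B 512u

   enum main_tiling { MAIN_TILING_X, MAIN_TILING_Y };

   struct pair_extent {
      uint32_t width_px;
      uint32_t height_rows;
   };

   /* Offset, within a 4KB tile, of the other cache line of the pair.
    * Assumption (not stated in the text): the pair toggles address bit 9.
    */
   static uint32_t
   cache_line_pair_partner(uint32_t tile_offset_B)
   {
      assert(tile_offset_B < 4096);
      return tile_offset_B ^ PAIR_STRIDE_B;
   }

   /* Area of the main surface covered by one cache-line pair:
    * 64B x 2 rows when X-tiled, 32B x 4 rows when Y-tiled.
    */
   static struct pair_extent
   cache_line_pair_extent(enum main_tiling tiling, uint32_t bpp)
   {
      assert(bpp >= 8 && bpp % 8 == 0);
      const uint32_t width_B = (tiling == MAIN_TILING_X) ? 64 : 32;
      const uint32_t rows = (tiling == MAIN_TILING_X) ? 2 : 4;
      struct pair_extent e = { width_B / (bpp / 8), rows };
      return e;
   }

For a 32bpp format this gives 16x2 pixels per pair on an X-tiled surface and
8x4 pixels on a Y-tiled one; both are 128 bytes and both contain whole 2x2
subspans.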

What is the correspondence between bits and cache-line pairs?  The best model I
(Faith) know of is to consider the CCS as having a 1-bit color format for
fast-clears, a 2-bit format for color compression, and a special tiling
format.  The CCS tiling formats operate on a 1- or 2-bit granularity rather
than the byte granularity of most tiling formats.

The following table represents the bit-layouts that yield the CCS tiling format
on different hardware generations.  Bits 0-11 correspond to the regular swizzle
of bytes within a 4KB page whereas the negative bits represent the address of
the particular 1- or 2-bit portion of a byte. (Note: The Haswell data was
gathered on a dual-channel system so bit-6 swizzling was enabled.  It's unclear
how this affects the CCS layout.)

============ ======== =========== =========== ====================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
 Generation   Tiling       11          10               9                 8           7           6           5           4           3           2           1           0          -1          -2          -3
============ ======== =========== =========== ====================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
 Ivy Bridge   X or Y  :math:`u_6` :math:`u_5`      :math:`u_4`       :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`v_1` :math:`v_0` :math:`u_3` :math:`u_2` :math:`u_1` :math:`u_0`
 Haswell        X     :math:`u_6` :math:`u_5` :math:`v_3 \oplus u_1` :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`v_1` :math:`v_0` :math:`u_4` :math:`u_3` :math:`u_2` :math:`u_0`
 Haswell        Y     :math:`u_6` :math:`u_5` :math:`v_2 \oplus u_1` :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`v_1` :math:`v_0` :math:`u_4` :math:`u_3` :math:`u_2` :math:`u_0`
 Broadwell      X     :math:`u_6` :math:`u_5`      :math:`u_4`       :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`u_3` :math:`v_3` :math:`u_2` :math:`u_1` :math:`u_0` :math:`v_2` :math:`v_1` :math:`v_0`
 Broadwell      Y     :math:`u_6` :math:`u_5`      :math:`u_4`       :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`u_3` :math:`u_2` :math:`u_1` :math:`v_1` :math:`v_0` :math:`u_0`
 Sky Lake       Y     :math:`u_6` :math:`u_5`      :math:`u_4`       :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_3` :math:`v_2` :math:`v_1` :math:`u_3` :math:`u_2` :math:`u_1` :math:`v_0` :math:`u_0`
============ ======== =========== =========== ====================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
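As a rough illustration of how to read the table, the Ivy Bridge row can be
turned into a small lookup that builds a bit offset within a 4KB page.  This
sketch is not taken from ISL.  It assumes that :math:`u` and :math:`v` are the
horizontal and vertical coordinates of a 1-bit CCS element within the tile,
and that bit -1 is the most significant of the three sub-byte address bits.
The names ``ccs_bit_src``, ``ivb_ccs_layout`` and ``ivb_ccs_bit_offset`` are
hypothetical.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Which coordinate bit feeds one address bit: 'c' is 'u' or 'v' and 'n'
    * is the bit index within that coordinate.
    */
   struct ccs_bit_src {
      char c;
      uint8_t n;
   };

   /* Ivy Bridge row of the table, listed from address bit -3 (index 0) up
    * to address bit 11 (index 14).
    */
   static const struct ccs_bit_src ivb_ccs_layout[15] = {
      {'u', 0}, {'u', 1}, {'u', 2}, {'u', 3},   /* bits -3 .. 0 */
      {'v', 0}, {'v', 1}, {'v', 3}, {'v', 2},   /* bits  1 .. 4 */
      {'v', 4}, {'v', 5}, {'v', 6}, {'v', 7},   /* bits  5 .. 8 */
      {'u', 4}, {'u', 5}, {'u', 6},             /* bits  9 .. 11 */
   };

   /* Bit offset of element (u, v) within the 4KB page.  The byte address is
    * the result shifted right by 3; the low 3 bits select the bit within
    * that byte.
    */
   static uint32_t
   ivb_ccs_bit_offset(uint32_t u, uint32_t v)
   {
      assert(u < 128 && v < 256);
      uint32_t offset = 0;
      for (unsigned i = 0; i < 15; i++) {
         const struct ccs_bit_src s = ivb_ccs_layout[i];
         const uint32_t coord = (s.c == 'u') ? u : v;
         offset |= ((coord >> s.n) & 1u) << i;
      }
      return offset;
   }

The 7 bits of :math:`u` and 8 bits of :math:`v` together address 32768 bits,
which is exactly one 4KB page.  The other rows could be encoded the same way,
with the Haswell XOR terms needing a slightly richer table entry.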

CCS surface layout
------------------

Starting with Broadwell, fast-clears and color compression can be used on
mipmapped and array surfaces.  When considered from a higher level, the CCS is
laid out like any other surface.  The Broadwell and Sky Lake PRMs describe
this as follows:

Broadwell PRM Vol 7, "MCS Buffer for Render Target(s)" (p. 676):

   Mip-mapped and arrayed surfaces are supported with MCS buffer layout with
   these alignments in the RT space: Horizontal Alignment = 256 and Vertical
   Alignment = 128.

Broadwell PRM Vol 2d, "RENDER_SURFACE_STATE" (p. 279):

   For non-multisampled render target's auxiliary surface, MCS, QPitch must be
   computed with Horizontal Alignment = 256 and Surface Vertical Alignment =
   128. These alignments are only for MCS buffer and not for associated render
   target.

Sky Lake PRM Vol 7, "MCS Buffer for Render Target(s)" (p. 632):

   Mip-mapped and arrayed surfaces are supported with MCS buffer layout with
   these alignments in the RT space: Horizontal Alignment = 128 and Vertical
   Alignment = 64.

Sky Lake PRM Vol 2d, "RENDER_SURFACE_STATE" (p. 435):

   For non-multisampled render target's CCS auxiliary surface, QPitch must be
   computed with Horizontal Alignment = 128 and Surface Vertical Alignment
   = 256. These alignments are only for CCS buffer and not for associated
   render target.

Empirical evidence seems to confirm this.  On Sky Lake, the vertical alignment
is always one cache line.  The horizontal alignment, however, varies by main
surface format: 1 cache line for 32bpp, 2 for 64bpp and 4 cache lines for
128bpp formats.  This nicely corresponds to the alignment of 128x64 pixels in
the primary color surface.  The second PRM citation about Sky Lake CCS above
gives a vertical alignment of 256 rather than 64.  Experimentation suggests
that this additional alignment applies only to QPitch and not to the miplevels
within a slice.
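The arithmetic behind the 128x64 figure can be checked with a short sketch.
It combines the PRM statement that one CCS cache line covers 16x16 cache-line
pairs with the 32B x 4 rows size of a Y-tiled pair.  The function and type
names are hypothetical and this is not how ISL itself computes alignments.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   struct extent2d {
      uint32_t w;
      uint32_t h;
   };

   /* Main-surface area, in pixels, covered by one CCS cache line on Sky
    * Lake: 16x16 cache-line pairs of 32B x 4 rows each (Y-tiled).
    */
   static struct extent2d
   skl_ccs_cache_line_extent_px(uint32_t bpp)
   {
      assert(bpp >= 8 && bpp % 8 == 0);
      const uint32_t pairs_per_side = 16;
      const uint32_t pair_width_B = 32;
      const uint32_t pair_height_rows = 4;
      struct extent2d e = {
         pairs_per_side * pair_width_B / (bpp / 8),
         pairs_per_side * pair_height_rows,
      };
      return e;
   }

   /* Size in bytes of the CCS data for that area: 2 bits per pair. */
   static uint32_t
   skl_ccs_cache_line_size_B(void)
   {
      return 16 * 16 * 2 / 8;
   }

One cache line covers 128x64 pixels at 32bpp, 64x64 at 64bpp and 32x64 at
128bpp.  Multiplying the widths by the horizontal alignments of 1, 2 and 4
cache lines gives 128 pixels in every case, and 256 pairs at 2 bits each fill
exactly one 64-byte cache line.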

On Broadwell, each miplevel in the CCS is aligned to a cache-line pair
boundary: horizontal when the primary surface is X-tiled and vertical when
Y-tiled. For a 32bpp format, this works out to an alignment of 256x128 main
surface pixels regardless of X or Y tiling.  On Sky Lake, the alignment is
a single cache line which works out to an alignment of 128x64 main surface
pixels.

TODO: More than just 32bpp formats on Broadwell!

Once armed with the above alignment information, we can lay out the CCS surface
itself.  The way ISL does CCS layout calculations is by a very careful and
subtle application of its normal surface layout code.

Above, we described the CCS data layout as a mapping of address bits. In
ISL, this is represented by :c:enumerator:`isl_tiling.ISL_TILING_CCS`.  The
logical and physical tile dimensions of this tiling correspond to the above
mapping.

We also have special :c:enum:`isl_format` enums for CCS.  These formats are 1
bit-per-pixel on Ivy Bridge through Broadwell and 2 bits-per-pixel on Sky Lake
and above to correspond to the 1- and 2-bit values represented in the CCS data.
They have a block size (similar to a block compressed format such as BC or
ASTC) which says what area (in surface elements) in the main surface is covered
by a single CCS element (1- or 2-bit).  Because this depends on the main
surface tiling and format, we have several different CCS formats.

Once the appropriate :c:enum:`isl_format` has been selected, computing the
size and layout of a CCS surface is as simple as passing the same surface
creation parameters to :c:func:`isl_surf_init_s` as were used to create the
primary surface, only with :c:enumerator:`isl_tiling.ISL_TILING_CCS` and the
correct CCS format.  This not only results in a correctly sized surface, but
most other ISL helpers for things such as computing offsets into surfaces work
correctly as well.

CCS on Tigerlake and above
--------------------------

Starting with Tigerlake, CCS is no longer done via a surface and, instead, the
term CCS gets overloaded once again (gotta love it!) to now refer to a form of
universal compression which can be applied to almost any surface.  Nothing in
this chapter applies to any hardware with a graphics IP version 12 or above.