Single-sampled Color Compression
================================

Starting with Ivy Bridge, Intel graphics hardware provides a form of color
compression for single-sampled surfaces. In its initial form, this provided an
acceleration of render target clear operations that, in the common case, allows
you to avoid almost all of the bandwidth of a full-surface clear operation. On
Sky Lake, single-sampled color compression was extended to allow for the
compression of color values from actual rendering and not just the initial
clear. From here on, the older Ivy Bridge form of color compression will be
called "fast-clears" and the term "color compression" will be reserved for the
more powerful Sky Lake form.

The documentation for Ivy Bridge through Broadwell overloads the term MCS for
referring both to the *multisample control surface* used for multisample
compression and the control surface used for fast-clears.
In ISL, the
:c:enumerator:`isl_aux_usage.ISL_AUX_USAGE_MCS` enum always refers to
multisample color compression while the
:c:enumerator:`isl_aux_usage.ISL_AUX_USAGE_CCS_D` and
:c:enumerator:`isl_aux_usage.ISL_AUX_USAGE_CCS_E` enums always refer to
single-sampled color compression. Throughout this chapter and the rest of the
ISL documentation, we will use the term "color control surface", abbreviated
CCS, to denote the control surface used for both fast-clears and color
compression. While this is still an overloaded term, Ivy Bridge fast-clears
are much closer to Sky Lake color compression than they are to multisample
compression.

CCS data
--------

Fast clears and CCS are possibly the single most poorly documented aspect of
surface layout/setup for Intel graphics hardware (with HiZ coming in a close
second). All the documentation really says is that you can use an MCS buffer on
single-sampled surfaces (we will call it the CCS in this case). It also
provides some documentation on how to program the hardware to perform clear
operations, but that's it. How big is this buffer? What does it contain?
Those questions are left as exercises to the reader. Almost everything we know
about the contents of the CCS is gleaned from reverse-engineering of the
hardware. The best bit of documentation we have ever had comes from the
display section of the Sky Lake PRM Vol 12 section on planes (p. 159):

    The Color Control Surface (CCS) contains the compression status of the
    cache-line pairs. The compression state of the cache-line pair is
    specified by 2 bits in the CCS. Each CCS cache-line represents an area
    on the main surface of 16x16 sets of 128 byte Y-tiled cache-line-pairs.
    CCS is always Y tiled.

While this is technically for color compression and not fast-clears, it
provides a good bit of insight into how color compression and fast-clears
operate. Each cache-line pair in the main surface corresponds to 1 or 2 bits
in the CCS. The primary difference, as far as the current discussion is
concerned, is that fast-clears use only 1 bit per cache-line pair whereas color
compression uses 2 bits.

What is a cache-line pair? Both the X and Y tiling formats are arranged as an
8x8 grid of cache lines.
(See the :doc:`chapter on tiling <tiling>` for more
details.) In either case, a cache-line pair is a pair of cache lines whose
starting addresses differ by 512 bytes or 8 cache lines. This results in the
two cache lines being vertically adjacent when the main surface is X-tiled and
horizontally adjacent when the main surface is Y-tiled. For an X-tiled surface
this forms an area of 64B x 2rows and for a Y-tiled surface this forms an area
of 32B x 4rows. In either case, it is guaranteed that, regardless of surface
format, each 2x2 subspan coming out of a shader will land entirely within one
cache-line pair.

What is the correspondence between bits and cache-line pairs? The best model I
(Faith) know of is to consider the CCS as having a 1-bit color format for
fast-clears and a 2-bit format for color compression and a special tiling
format. The CCS tiling formats operate on a 1 or 2-bit granularity rather than
the byte granularity of most tiling formats.

The following table represents the bit-layouts that yield the CCS tiling format
on different hardware generations.
Bits 0-11 correspond to the regular swizzle
of bytes within a 4KB page whereas the negative bits represent the address of
the particular 1 or 2-bit portion of a byte. (Note: The Haswell data was
gathered on a dual-channel system so bit-6 swizzling was enabled. It's unclear
how this affects the CCS layout.)

============ ======== =========== =========== ====================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
Generation   Tiling   11          10          9                      8           7           6           5           4           3           2           1           0           -1          -2          -3
============ ======== =========== =========== ====================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
Ivy Bridge   X or Y   :math:`u_6` :math:`u_5` :math:`u_4`            :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`v_1` :math:`v_0` :math:`u_3` :math:`u_2` :math:`u_1` :math:`u_0`
Haswell      X        :math:`u_6` :math:`u_5` :math:`v_3 \oplus u_1` :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`v_1` :math:`v_0` :math:`u_4` :math:`u_3` :math:`u_2` :math:`u_0`
Haswell      Y        :math:`u_6` :math:`u_5` :math:`v_2 \oplus u_1` :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`v_1` :math:`v_0` :math:`u_4` :math:`u_3` :math:`u_2` :math:`u_0`
Broadwell    X        :math:`u_6` :math:`u_5` :math:`u_4`            :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`u_3` :math:`v_3` :math:`u_2` :math:`u_1` :math:`u_0` :math:`v_2` :math:`v_1` :math:`v_0`
Broadwell    Y        :math:`u_6` :math:`u_5` :math:`u_4`            :math:`v_7` :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_2` :math:`v_3` :math:`u_3` :math:`u_2` :math:`u_1` :math:`v_1` :math:`v_0` :math:`u_0`
Sky Lake     Y        :math:`u_6` :math:`u_5` :math:`u_4`            :math:`v_6` :math:`v_5` :math:`v_4` :math:`v_3` :math:`v_2` :math:`v_1` :math:`u_3` :math:`u_2` :math:`u_1` :math:`v_0` :math:`u_0`
============ ======== =========== =========== ====================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========

CCS surface layout
------------------

Starting with Broadwell, fast-clears and color compression can be used on
mipmapped and array surfaces. When considered from a higher level, the CCS is
laid out like any other surface. The Broadwell and Sky Lake PRMs describe
this as follows:

Broadwell PRM Vol 7, "MCS Buffer for Render Target(s)" (p. 676):

    Mip-mapped and arrayed surfaces are supported with MCS buffer layout with
    these alignments in the RT space: Horizontal Alignment = 256 and Vertical
    Alignment = 128.

Broadwell PRM Vol 2d, "RENDER_SURFACE_STATE" (p. 279):

    For non-multisampled render target's auxiliary surface, MCS, QPitch must be
    computed with Horizontal Alignment = 256 and Surface Vertical Alignment =
    128. These alignments are only for MCS buffer and not for associated render
    target.

Sky Lake PRM Vol 7, "MCS Buffer for Render Target(s)" (p. 632):

    Mip-mapped and arrayed surfaces are supported with MCS buffer layout with
    these alignments in the RT space: Horizontal Alignment = 128 and Vertical
    Alignment = 64.

Sky Lake PRM Vol. 2d, "RENDER_SURFACE_STATE" (p. 435):

    For non-multisampled render target's CCS auxiliary surface, QPitch must be
    computed with Horizontal Alignment = 128 and Surface Vertical Alignment
    = 256. These alignments are only for CCS buffer and not for associated
    render target.

Empirical evidence seems to confirm this. On Sky Lake, the vertical alignment
is always one cache line.
The horizontal alignment, however, varies by main
surface format: 1 cache line for 32bpp, 2 for 64bpp and 4 cache lines for
128bpp formats. This nicely corresponds to the alignment of 128x64 pixels in
the primary color surface. The second PRM citation about Sky Lake CCS above
gives a vertical alignment of 256 rather than 64. With a little
experimentation, this additional alignment appears to only apply to QPitch and
not to the miplevels within a slice.

On Broadwell, each miplevel in the CCS is aligned to a cache-line pair
boundary: horizontal when the primary surface is X-tiled and vertical when
Y-tiled. For a 32bpp format, this works out to an alignment of 256x128 main
surface pixels regardless of X or Y tiling. On Sky Lake, the alignment is
a single cache line which works out to an alignment of 128x64 main surface
pixels.

TODO: More than just 32bpp formats on Broadwell!

Once armed with the above alignment information, we can lay out the CCS surface
itself. The way ISL does CCS layout calculations is by a very careful and
subtle application of its normal surface layout code.
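
The Sky Lake alignment figures above can be checked with a little arithmetic.
The following is only a sketch, not ISL code: the struct and function names are
hypothetical, and it assumes a 64-byte CCS cache line, the 16x16 grid of
cache-line pairs from the PRM quote, a Y-tiled cache-line pair of 32B x 4 rows,
and a main surface format of 32, 64 or 128 bits per pixel.

```c
#include <stdint.h>

/* Main-surface area, in pixels, covered by one CCS cache line.
 * Hypothetical type for illustration only; not part of ISL. */
struct ccs_coverage {
    uint32_t width_px;
    uint32_t height_px;
};

/* Sky Lake: one CCS cache line describes 16x16 cache-line pairs, and a
 * Y-tiled cache-line pair is 32 bytes wide by 4 rows tall.
 * Assumes bytes_per_pixel is 4, 8 or 16. */
static struct ccs_coverage
skl_ccs_cache_line_coverage(uint32_t bytes_per_pixel)
{
    const uint32_t pairs_x = 16, pairs_y = 16;
    const uint32_t pair_width_B = 32, pair_height_rows = 4;

    struct ccs_coverage c;
    c.width_px = pairs_x * pair_width_B / bytes_per_pixel;
    c.height_px = pairs_y * pair_height_rows;
    return c;
}

/* Number of CCS cache lines needed to span the 128-pixel horizontal
 * alignment quoted from the Sky Lake PRM. */
static uint32_t
skl_ccs_halign_cache_lines(uint32_t bytes_per_pixel)
{
    return 128 / skl_ccs_cache_line_coverage(bytes_per_pixel).width_px;
}
```

Under these assumptions a 32bpp format gives a coverage of 128x64 pixels per
CCS cache line, and the horizontal alignment works out to 1, 2 and 4 cache
lines for 32bpp, 64bpp and 128bpp formats, matching the empirical observations
above.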

Above, we described the CCS data layout as a mapping of address bits. In
ISL, this is represented by :c:enumerator:`isl_tiling.ISL_TILING_CCS`. The
logical and physical tile dimensions correspond to the above mapping.

We also have special :c:enum:`isl_format` enums for CCS. These formats are 1
bit-per-pixel on Ivy Bridge through Broadwell and 2 bits-per-pixel on Sky Lake
and above to correspond to the 1 and 2-bit values represented in the CCS data.
They have a block size (similar to a block compressed format such as BC or
ASTC) which says what area (in surface elements) in the main surface is covered
by a single CCS element (1 or 2-bit). Because this depends on the main surface
tiling and format, we have several different CCS formats.

Once the appropriate :c:enum:`isl_format` has been selected, computing the
size and layout of a CCS surface is as simple as passing the same surface
creation parameters to :c:func:`isl_surf_init_s` as were used to create the
primary surface, only with :c:enumerator:`isl_tiling.ISL_TILING_CCS` and the
correct CCS format.
This not only results in a correctly sized surface but
most other ISL helpers for things such as computing offsets into surfaces work
correctly as well.

CCS on Tigerlake and above
--------------------------

Starting with Tigerlake, CCS is no longer done via a surface and, instead, the
term CCS gets overloaded once again (gotta love it!) to now refer to a form of
universal compression which can be applied to almost any surface. Nothing in
this chapter applies to any hardware with a graphics IP version 12 or above.