Tiling
======

The naive view of an image in memory is that the pixels are stored one after
another, usually in an X-major order.  An image that is arranged in
this way is called "linear".  Linear images, while easy to reason about, can
have very bad cache locality.  Graphics operations tend to act on pixels that
are close together in 2-D Euclidean space.  If you move one pixel to the right
or left in a linear image, you only move a few bytes to one side or the other
in memory.  However, if you move one pixel up or down you can end up kilobytes
or even megabytes away.

Tiling (sometimes referred to as swizzling) is a method of re-arranging the
pixels of a surface so that pixels which are close in 2-D Euclidean space are
likely to be close in memory.

Basics
------

The basic idea of a tiled image is that the image is first divided into
two-dimensional blocks or tiles.  Each tile takes up a chunk of contiguous
memory and the tiles are arranged like pixels in a linear surface.  This is best
demonstrated with a specific example.  Suppose we have an RGBA8888 X-tiled
surface on Intel graphics.  Then the surface is divided into 128x8 pixel tiles
each of which is 4KB of memory.  Within each tile, the pixels are laid out like
a 128x8 linear image.  The tiles themselves are laid out row-major in memory
like giant pixels.  This means that, as long as you don't leave your 128x8
tile, you can move in both dimensions without leaving the same 4K page in
memory.

.. image:: tiling-basic.svg
   :alt: Example of an X-tiled image
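
The arithmetic behind the X-tiled example can be sketched roughly as follows.
The helper name ``xtiled_offset_rgba8888`` and its ``tiles_per_row`` parameter
are illustrative assumptions rather than anything provided by ISL, and the
sketch ignores any additional address swizzling the hardware may apply:

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical helper, not part of ISL: byte offset of pixel (x, y) in an
    * X-tiled RGBA8888 surface that is tiles_per_row tiles wide.
    */
   static inline uint32_t
   xtiled_offset_rgba8888(uint32_t x, uint32_t y, uint32_t tiles_per_row)
   {
      const uint32_t tile_w_px = 128, tile_h_px = 8;  /* 512B x 8 rows */
      const uint32_t tile_size_B = 4096, cpp = 4;     /* 4 bytes per pixel */

      /* Which tile the pixel falls in; tiles are laid out row-major. */
      uint32_t tile_index = (y / tile_h_px) * tiles_per_row + (x / tile_w_px);

      /* Inside the tile, pixels are laid out like a 128x8 linear image. */
      uint32_t in_tile = (y % tile_h_px) * tile_w_px * cpp +
                         (x % tile_w_px) * cpp;

      return tile_index * tile_size_B + in_tile;
   }

Under these assumptions, moving one pixel down inside a tile moves 512 bytes
in memory, while stepping into the next tile to the right moves 4096 bytes.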

You can, however, do even better than this.  Suppose that same image is,
instead, Y-tiled.  Then the surface is divided into 32x32 pixel tiles each of
which is 4KB of memory.  Within a tile, each 64B cache line corresponds to a 4x4
pixel region of the image (you can think of it as a tile within a tile).  This
means that very small deviations don't even leave the cache line.  This added
bit of pixel shuffling is known to have a substantial performance impact in
most real-world applications.

Intel graphics has several different tiling formats that we'll discuss in
detail in later sections.  The most commonly used as of the writing of this
chapter is Y-tiling.  In all tiling formats the basic principle is the same:
the image is divided into tiles of a particular size and, within those tiles,
the data is re-arranged (or swizzled) based on a particular pattern.  A tile
size will always be specified in bytes by rows and the actual X-dimension of
the tile in elements depends on the size of the element in bytes.
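
As a small illustration of that last point, the width of a tile in elements
follows from its width in bytes and the element size.  The helper below is a
hypothetical sketch (it is not an ISL function) and assumes the element size
is a whole number of bytes:

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical helper: width in elements of a tile that is tile_width_B
    * bytes wide, for a format with format_bpb bits per element.
    */
   static inline uint32_t
   tile_width_el(uint32_t tile_width_B, uint32_t format_bpb)
   {
      return tile_width_B / (format_bpb / 8);
   }

For instance, a 512B-wide tile holds 128 elements per row with a 32-bit
format but 512 elements per row with an 8-bit format.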

Bit-6 Swizzling
^^^^^^^^^^^^^^^

On some older hardware, there is an additional address swizzle that is applied
on top of the tiling format.  This has been removed starting with Broadwell
because, as it says in the Broadwell PRM Vol 5 "Tiling Algorithm" (p. 17):

   Address Swizzling for Tiled-Surfaces is no longer used because the main
   memory controller has a more effective address swizzling algorithm.

Whether or not swizzling is enabled depends on the memory configuration of the
system.  Generally, systems with dual-channel RAM have swizzling enabled and
single-channel do not.  Supposedly, this swizzling allows for better balancing
between the two memory channels and increases performance.  Because it depends
on the memory configuration which may change from one boot to the next, it
requires a run-time check.

The best documentation for bit-6 swizzling can be found in the Haswell PRM Vol.
5 "Memory Views" in the section entitled "Address Swizzling for Tiled-Y
Surfaces".  It exists on older platforms but the docs get progressively worse
the further you go back.

ISL Representation
------------------

The structure of any given tiling format is represented by ISL using the
:c:enum:`isl_tiling` enum and the :c:struct:`isl_tile_info` structure:

.. c:autoenum:: isl_tiling
   :file: src/intel/isl/isl.h
   :members:

.. c:autofunction:: isl_tiling_get_info
   :file: src/intel/isl/isl.c

.. c:autostruct:: isl_tile_info
   :members:

The ``isl_tile_info`` structure has two different sizes for a tile: a logical
size in surface elements and a physical size in bytes.  In order to determine
the proper logical size, the bits-per-block of the underlying format has to be
passed into ``isl_tiling_get_info``.  The proper way to compute the size of an
image in bytes given a width and height in elements is as follows:

.. code-block:: c

   /* Surface extent in tiles, rounded up to whole tiles */
   uint32_t width_tl = DIV_ROUND_UP(width_el * (format_bpb / tile_info.format_bpb),
                                    tile_info.logical_extent_el.w);
   uint32_t height_tl = DIV_ROUND_UP(height_el, tile_info.logical_extent_el.h);
   /* Physical footprint: bytes per row of tiles, then total bytes */
   uint32_t row_pitch = width_tl * tile_info.phys_extent_B.w;
   uint32_t size = height_tl * tile_info.phys_extent_B.h * row_pitch;
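
To experiment with this calculation outside of Mesa, a self-contained sketch
might look like the following.  The ``extent2d`` and ``tile_info_sketch``
types, the ``surface_size_B`` helper, and the ``xtile_32bpb`` constant are
stand-ins invented for illustration; they only mirror the fields used above
and are not the real ISL definitions:

.. code-block:: c

   #include <stdint.h>

   #define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

   /* Stand-in types for illustration only; the real ones live in isl.h. */
   struct extent2d { uint32_t w, h; };
   struct tile_info_sketch {
      uint32_t format_bpb;
      struct extent2d logical_extent_el;
      struct extent2d phys_extent_B;
   };

   /* Size in bytes of a width_el x height_el surface, in whole tiles. */
   static inline uint32_t
   surface_size_B(const struct tile_info_sketch *ti, uint32_t format_bpb,
                  uint32_t width_el, uint32_t height_el)
   {
      uint32_t width_tl =
         DIV_ROUND_UP(width_el * (format_bpb / ti->format_bpb),
                      ti->logical_extent_el.w);
      uint32_t height_tl = DIV_ROUND_UP(height_el, ti->logical_extent_el.h);
      uint32_t row_pitch = width_tl * ti->phys_extent_B.w;
      return height_tl * ti->phys_extent_B.h * row_pitch;
   }

   /* X-tile with a 32-bit format: 128el x 8el logical, 512B x 8 rows. */
   static const struct tile_info_sketch xtile_32bpb =
      { 32, { 128, 8 }, { 512, 8 } };

With these stand-ins, a 1920x1080 surface needs 15 tiles across and 135 tiles
down, for a row pitch of 7680 bytes and a total size of 8294400 bytes.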

It is very important to note that there is no direct conversion between
:c:member:`isl_tile_info.logical_extent_el` and
:c:member:`isl_tile_info.phys_extent_B`.  It is tempting to assume that the
logical and physical heights are the same and simply divide the width of
:c:member:`isl_tile_info.phys_extent_B` by the size of the format (which is
what the PRM does) to get :c:member:`isl_tile_info.logical_extent_el` but
this is not at all correct.  Some tiling formats have logical and physical
heights that differ and so no such calculation will work in general.  The
easiest case study for this is W-tiling.  From the Sky Lake PRM Vol. 2d,
"RENDER_SURFACE_STATE" (p. 427):

   If the surface is a stencil buffer (and thus has Tile Mode set to
   TILEMODE_WMAJOR), the pitch must be set to 2x the value computed based on
   width, as the stencil buffer is stored with two rows interleaved.

What does this mean?  Why are we multiplying the pitch by two?  What does it
mean that "the stencil buffer is stored with two rows interleaved"?  The
explanation for all these questions is that a W-tile (which is only used for
stencil) has a logical size of 64el x 64el but a physical size of 128B
x 32rows.  In memory, a W-tile has the same footprint as a Y-tile (128B
x 32rows) but every pair of rows in the stencil buffer is interleaved into
a single row of bytes yielding a two-dimensional area of 64el x 64el.  You can
consider this as its own tiling format or as a modification of Y-tiling.  The
interpretation in the PRMs varies by hardware generation; on Sandy Bridge they
simply said it was Y-tiled but by Sky Lake there is almost no mention of
Y-tiling in connection with stencil buffers and they are always W-tiled.  This
mismatch between logical and physical tile sizes is also relevant for
hierarchical depth buffers as well as single-channel MCS and CCS buffers.
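
A minimal sketch of how the factor of two falls out of the logical and
physical W-tile sizes is shown below.  The helper name ``wtiled_row_pitch_B``
is hypothetical and the sketch assumes a one-byte stencil element:

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical helper: row pitch in bytes of a W-tiled stencil buffer,
    * using a 64el x 64el logical tile stored as 128B x 32 rows.
    */
   static inline uint32_t
   wtiled_row_pitch_B(uint32_t width_el)
   {
      uint32_t width_tl = (width_el + 63) / 64;  /* whole tiles across */
      return width_tl * 128;                     /* 128B per tile row */
   }

A 256-element-wide stencil buffer therefore has a 512-byte pitch, twice the
256 bytes one would compute from the width alone.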

X-tiling
--------

The simplest tiling format available on Intel graphics (which has been
available since gen4) is X-tiling.  An X-tile is 512B x 8rows and, within the
tile, the data is arranged in an X-major linear fashion.  You can also look at
X-tiling as being an 8x8 cache line grid where the cache lines are arranged
X-major as follows:

======= ======= ======= ======= ======= ======= ======= =======
`0x000` `0x040` `0x080` `0x0c0` `0x100` `0x140` `0x180` `0x1c0`
`0x200` `0x240` `0x280` `0x2c0` `0x300` `0x340` `0x380` `0x3c0`
`0x400` `0x440` `0x480` `0x4c0` `0x500` `0x540` `0x580` `0x5c0`
`0x600` `0x640` `0x680` `0x6c0` `0x700` `0x740` `0x780` `0x7c0`
`0x800` `0x840` `0x880` `0x8c0` `0x900` `0x940` `0x980` `0x9c0`
`0xa00` `0xa40` `0xa80` `0xac0` `0xb00` `0xb40` `0xb80` `0xbc0`
`0xc00` `0xc40` `0xc80` `0xcc0` `0xd00` `0xd40` `0xd80` `0xdc0`
`0xe00` `0xe40` `0xe80` `0xec0` `0xf00` `0xf40` `0xf80` `0xfc0`
======= ======= ======= ======= ======= ======= ======= =======

Each cache line represents a piece of a single row of pixels within the image.
The memory locations of two vertically adjacent pixels within the same X-tile
always differ by 512B or 8 cache lines.

As mentioned above, X-tiling is slower than Y-tiling (though still faster than
linear).  However, until Sky Lake, the display scan-out hardware could only do
X-tiling so we have historically used X-tiling for all window-system buffers
(because X or a Wayland compositor may want to put it in a plane).

Bit-6 Swizzling
^^^^^^^^^^^^^^^

When bit-6 swizzling is enabled, bits 9 and 10 are XORed in with bit 6 of the
tiled address:

.. code-block:: c

   addr[6] ^= addr[9] ^ addr[10];
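
The bit notation above is shorthand rather than valid C.  One way it might be
written as real code is sketched below; the function name ``swizzle_bit6_x``
is an assumption made for illustration and is not taken from ISL:

.. code-block:: c

   #include <stdint.h>

   /* Sketch only: apply the X-tiling bit-6 swizzle to a tiled byte address. */
   static inline uint64_t
   swizzle_bit6_x(uint64_t addr)
   {
      /* XOR of address bits 9 and 10, as a single 0/1 value. */
      uint64_t bit = ((addr >> 9) ^ (addr >> 10)) & 1;

      /* Flip bit 6 when that XOR is set. */
      return addr ^ (bit << 6);
   }

Because only bit 6 changes and it never feeds back into bits 9 or 10,
applying the function twice returns the original address.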

Y-tiling
--------

The Y-tiling format, also available since gen4, is substantially different from
X-tiling and performs much better in practice.  Each Y-tile is an 8x8 grid of
cache lines arranged Y-major as follows:

======= ======= ======= ======= ======= ======= ======= =======
`0x000` `0x200` `0x400` `0x600` `0x800` `0xa00` `0xc00` `0xe00`
`0x040` `0x240` `0x440` `0x640` `0x840` `0xa40` `0xc40` `0xe40`
`0x080` `0x280` `0x480` `0x680` `0x880` `0xa80` `0xc80` `0xe80`
`0x0c0` `0x2c0` `0x4c0` `0x6c0` `0x8c0` `0xac0` `0xcc0` `0xec0`
`0x100` `0x300` `0x500` `0x700` `0x900` `0xb00` `0xd00` `0xf00`
`0x140` `0x340` `0x540` `0x740` `0x940` `0xb40` `0xd40` `0xf40`
`0x180` `0x380` `0x580` `0x780` `0x980` `0xb80` `0xd80` `0xf80`
`0x1c0` `0x3c0` `0x5c0` `0x7c0` `0x9c0` `0xbc0` `0xdc0` `0xfc0`
======= ======= ======= ======= ======= ======= ======= =======

Each 64B cache line within the tile is laid out as 4 rows of 16B each:

====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ======
`0x00` `0x01` `0x02` `0x03` `0x04` `0x05` `0x06` `0x07` `0x08` `0x09` `0x0a` `0x0b` `0x0c` `0x0d` `0x0e` `0x0f`
`0x10` `0x11` `0x12` `0x13` `0x14` `0x15` `0x16` `0x17` `0x18` `0x19` `0x1a` `0x1b` `0x1c` `0x1d` `0x1e` `0x1f`
`0x20` `0x21` `0x22` `0x23` `0x24` `0x25` `0x26` `0x27` `0x28` `0x29` `0x2a` `0x2b` `0x2c` `0x2d` `0x2e` `0x2f`
`0x30` `0x31` `0x32` `0x33` `0x34` `0x35` `0x36` `0x37` `0x38` `0x39` `0x3a` `0x3b` `0x3c` `0x3d` `0x3e` `0x3f`
====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ======

Y-tiling is widely regarded as being substantially faster than X-tiling so it
is generally preferred.  However, prior to Sky Lake, Y-tiling was not available
for scanout so X-tiling was used for any sort of window-system buffers.
Starting with Sky Lake, we can scan out from Y-tiled buffers.
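
Reading the two Y-tile tables together, a tile is effectively eight 16B-wide
columns of 32 rows each, stored one column after another.  A rough sketch of
the resulting address arithmetic follows; the helper name ``ytile_offset_B``
is hypothetical and bit-6 swizzling is ignored:

.. code-block:: c

   #include <stdint.h>

   /* Hypothetical helper: byte offset within a single Y-tile of the byte at
    * column x_B (0..127) and row y (0..31), ignoring bit-6 swizzling.
    */
   static inline uint32_t
   ytile_offset_B(uint32_t x_B, uint32_t y)
   {
      uint32_t column = x_B / 16;  /* 16B-wide column, 8 per tile */

      /* Each column is 32 rows x 16B = 512B of contiguous memory. */
      return column * 512 + y * 16 + (x_B % 16);
   }

Stepping one row down moves only 16 bytes, which is why four vertically
adjacent rows of a column share a single cache line.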

Bit-6 Swizzling
^^^^^^^^^^^^^^^

When bit-6 swizzling is enabled, bit 9 is XORed in with bit 6 of the tiled
address:

.. code-block:: c

   addr[6] ^= addr[9];
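
Written as plain C instead of bit notation, the Y-tiling swizzle might look
like the sketch below.  The function name ``swizzle_bit6_y`` is an
illustrative assumption, not an ISL entry point:

.. code-block:: c

   #include <stdint.h>

   /* Sketch only: apply the Y-tiling bit-6 swizzle to a tiled byte address. */
   static inline uint64_t
   swizzle_bit6_y(uint64_t addr)
   {
      /* Flip bit 6 whenever bit 9 is set. */
      return addr ^ (((addr >> 9) & 1) << 6);
   }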

W-tiling
--------

W-tiling is a new tiling format added on Sandy Bridge for use in stencil
buffers.  W-tiling is similar to Y-tiling in that it's arranged as an 8x8
Y-major grid of cache lines.  The bytes within each cache line are arranged as
follows:

====== ====== ====== ====== ====== ====== ====== ======
`0x00` `0x01` `0x04` `0x05` `0x10` `0x11` `0x14` `0x15`
`0x02` `0x03` `0x06` `0x07` `0x12` `0x13` `0x16` `0x17`
`0x08` `0x09` `0x0c` `0x0d` `0x18` `0x19` `0x1c` `0x1d`
`0x0a` `0x0b` `0x0e` `0x0f` `0x1a` `0x1b` `0x1e` `0x1f`
`0x20` `0x21` `0x24` `0x25` `0x30` `0x31` `0x34` `0x35`
`0x22` `0x23` `0x26` `0x27` `0x32` `0x33` `0x36` `0x37`
`0x28` `0x29` `0x2c` `0x2d` `0x38` `0x39` `0x3c` `0x3d`
`0x2a` `0x2b` `0x2e` `0x2f` `0x3a` `0x3b` `0x3e` `0x3f`
====== ====== ====== ====== ====== ====== ====== ======

While W-tiling has been required for stencil all the way back to Sandy Bridge,
the docs are somewhat confused as to whether stencil buffers are W or Y-tiled.
This seems to stem from the fact that the hardware seems to implement W-tiling
as a sort of modified Y-tiling.  One example of this is the somewhat odd
requirement that W-tiled buffers have their pitch multiplied by 2.  From the
Sky Lake PRM Vol. 2d, "RENDER_SURFACE_STATE" (p. 427):

   If the surface is a stencil buffer (and thus has Tile Mode set to
   TILEMODE_WMAJOR), the pitch must be set to 2x the value computed based on
   width, as the stencil buffer is stored with two rows interleaved.

The last phrase holds the key here: "the stencil buffer is stored with two rows
interleaved".  More accurately, a W-tiled buffer can be viewed as a Y-tiled
buffer with each set of 4 W-tiled lines interleaved to form 2 Y-tiled lines.  In
ISL, we represent a W-tile as a tiling with a logical dimension of 64el x 64el
but a physical size of 128B x 32rows.  This cleanly takes care of the pitch
issue above and seems to nicely model the hardware.

Tile4
-----

The tile4 format, introduced on Xe-HP, is somewhat similar to Y but with more
internal shuffling.  Each tile4 tile is an 8x8 grid of cache lines arranged
as follows:

======= ======= ======= ======= ======= ======= ======= =======
`0x000` `0x040` `0x080` `0x0c0` `0x200` `0x240` `0x280` `0x2c0`
`0x100` `0x140` `0x180` `0x1c0` `0x300` `0x340` `0x380` `0x3c0`
`0x400` `0x440` `0x480` `0x4c0` `0x600` `0x640` `0x680` `0x6c0`
`0x500` `0x540` `0x580` `0x5c0` `0x700` `0x740` `0x780` `0x7c0`
`0x800` `0x840` `0x880` `0x8c0` `0xa00` `0xa40` `0xa80` `0xac0`
`0x900` `0x940` `0x980` `0x9c0` `0xb00` `0xb40` `0xb80` `0xbc0`
`0xc00` `0xc40` `0xc80` `0xcc0` `0xe00` `0xe40` `0xe80` `0xec0`
`0xd00` `0xd40` `0xd80` `0xdc0` `0xf00` `0xf40` `0xf80` `0xfc0`
======= ======= ======= ======= ======= ======= ======= =======

Each 64B cache line within the tile is laid out the same way as for a Y-tile,
as 4 rows of 16B each:

====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ======
`0x00` `0x01` `0x02` `0x03` `0x04` `0x05` `0x06` `0x07` `0x08` `0x09` `0x0a` `0x0b` `0x0c` `0x0d` `0x0e` `0x0f`
`0x10` `0x11` `0x12` `0x13` `0x14` `0x15` `0x16` `0x17` `0x18` `0x19` `0x1a` `0x1b` `0x1c` `0x1d` `0x1e` `0x1f`
`0x20` `0x21` `0x22` `0x23` `0x24` `0x25` `0x26` `0x27` `0x28` `0x29` `0x2a` `0x2b` `0x2c` `0x2d` `0x2e` `0x2f`
`0x30` `0x31` `0x32` `0x33` `0x34` `0x35` `0x36` `0x37` `0x38` `0x39` `0x3a` `0x3b` `0x3c` `0x3d` `0x3e` `0x3f`
====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ====== ======

Tiling as a bit pattern
-----------------------

There is one more important angle on tiling that should be discussed before we
finish.  Every tiling can be described by three things:

 1. A logical width and height in elements
 2. A physical width in bytes and height in rows
 3. A mapping from logical elements to physical bytes within the tile

We have spent a good deal of time on the first two because this is what you
really need for doing surface layout calculations.  However, there are cases in
which the map from logical to physical elements is critical.  One example is
W-tiling where we have code to do W-tiled encoding and decoding in the shader
for doing stencil blits because the hardware does not allow us to render to
W-tiled surfaces.

There are many ways to mathematically describe the mapping from logical
elements to physical bytes.  In the PRMs they give a very complicated set of
formulas involving lots of multiplication, modulus, and sums that show you how
to compute the mapping.  With a little creativity, you can easily reduce those
to a set of bit shifts and ORs.  By far the simplest formulation, however, is
as a mapping from the bits of the texture coordinates to bits in the address.
Suppose that :math:`(u, v)` is the location of a 1-byte element within a tile.
If you represent :math:`u` as :math:`u_n u_{n-1} \cdots u_2 u_1 u_0` where
:math:`u_0` is the LSB and :math:`u_n` is the MSB of :math:`u` and similarly
:math:`v = v_m v_{m-1} \cdots v_2 v_1 v_0`, then the bits of the address within
the tile are given by the table below:

=========================================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
 Tiling                                          11          10          9           8           7           6           5           4           3           2           1           0
=========================================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
:c:enumerator:`isl_tiling.ISL_TILING_X`     :math:`v_2` :math:`v_1` :math:`v_0` :math:`u_8` :math:`u_7` :math:`u_6` :math:`u_5` :math:`u_4` :math:`u_3` :math:`u_2` :math:`u_1` :math:`u_0`
:c:enumerator:`isl_tiling.ISL_TILING_Y0`    :math:`u_6` :math:`u_5` :math:`u_4` :math:`v_4` :math:`v_3` :math:`v_2` :math:`v_1` :math:`v_0` :math:`u_3` :math:`u_2` :math:`u_1` :math:`u_0`
:c:enumerator:`isl_tiling.ISL_TILING_W`     :math:`u_5` :math:`u_4` :math:`u_3` :math:`v_5` :math:`v_4` :math:`v_3` :math:`v_2` :math:`u_2` :math:`v_1` :math:`u_1` :math:`v_0` :math:`u_0`
:c:enumerator:`isl_tiling.ISL_TILING_4`     :math:`v_4` :math:`v_3` :math:`u_6` :math:`v_2` :math:`u_5` :math:`u_4` :math:`v_1` :math:`v_0` :math:`u_3` :math:`u_2` :math:`u_1` :math:`u_0`
=========================================== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== =========== ===========
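
The table translates almost directly into a table-driven routine.  The sketch
below is one possible way to write it and is not how ISL implements tiling;
the ``tiling_sketch`` enum, the ``bit_src`` struct, the ``SRC_U`` and
``SRC_V`` macros, and ``tile_offset_B`` are all names invented for this
example:

.. code-block:: c

   #include <stdint.h>

   /* Illustrative only; these names are not part of ISL. */
   enum tiling_sketch { SKETCH_X, SKETCH_Y0, SKETCH_W, SKETCH_4 };

   /* One entry per address bit: source coordinate and bit index in it. */
   struct bit_src { char coord; uint8_t bit; };

   #define SRC_U(n) { 'u', n }
   #define SRC_V(n) { 'v', n }

   /* Rows of the table, reordered so that index 0 is address bit 0. */
   static const struct bit_src tile_bits[4][12] = {
      [SKETCH_X]  = { SRC_U(0), SRC_U(1), SRC_U(2), SRC_U(3), SRC_U(4), SRC_U(5),
                      SRC_U(6), SRC_U(7), SRC_U(8), SRC_V(0), SRC_V(1), SRC_V(2) },
      [SKETCH_Y0] = { SRC_U(0), SRC_U(1), SRC_U(2), SRC_U(3), SRC_V(0), SRC_V(1),
                      SRC_V(2), SRC_V(3), SRC_V(4), SRC_U(4), SRC_U(5), SRC_U(6) },
      [SKETCH_W]  = { SRC_U(0), SRC_V(0), SRC_U(1), SRC_V(1), SRC_U(2), SRC_V(2),
                      SRC_V(3), SRC_V(4), SRC_V(5), SRC_U(3), SRC_U(4), SRC_U(5) },
      [SKETCH_4]  = { SRC_U(0), SRC_U(1), SRC_U(2), SRC_U(3), SRC_V(0), SRC_V(1),
                      SRC_U(4), SRC_U(5), SRC_V(2), SRC_U(6), SRC_V(3), SRC_V(4) },
   };

   /* Byte offset within a 4KB tile of the 1-byte element at (u, v). */
   static inline uint32_t
   tile_offset_B(enum tiling_sketch tiling, uint32_t u, uint32_t v)
   {
      uint32_t addr = 0;
      for (unsigned i = 0; i < 12; i++) {
         const struct bit_src src = tile_bits[tiling][i];
         uint32_t coord = (src.coord == 'u') ? u : v;
         addr |= ((coord >> src.bit) & 1u) << i;
      }
      return addr;
   }

A loop like this is written for clarity; code that cares about speed would
more likely hard-code the shifts and masks for each tiling.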

Constructing the mapping this way makes a lot of sense when you think about
hardware.  It may seem complex on paper but "simple" things such as addition
are relatively expensive in hardware while interleaving bits in a well-defined
pattern is practically free.  For a format that has more than one byte per
element, you simply chop bits off the bottom of the pattern, hard-code them to
0, and adjust bit indices as needed.  For a 128-bit format, for instance, the
Y-tiled pattern becomes :math:`u_2 u_1 u_0 v_4 v_3 v_2 v_1 v_0`.  The Sky Lake
PRM Vol. 5 in the section "2D Surfaces" contains an expanded version of the
above table (which we will not repeat here) that also includes the bit patterns
for the Ys and Yf tiling formats.