Freedreno
=========

Freedreno is a GLES and GL driver for Adreno 2xx-6xx GPUs.  It implements up to
OpenGL ES 3.2 and desktop OpenGL 4.5.

See the `Freedreno Wiki
<https://gitlab.freedesktop.org/freedreno/freedreno/-/wikis/home>`__ for more
details.

Turnip
------

Turnip is a Vulkan 1.3 driver for Adreno 6xx GPUs.

The current set of specific chip versions supported can be found in
:file:`src/freedreno/common/freedreno_devices.py`.  The current set of features
supported can be found rendered at `Mesa Matrix <https://mesamatrix.net/>`__.
There are no plans to port to a5xx or earlier GPUs.

Hardware architecture
---------------------

Adreno is a mostly tile-mode renderer, but with the option to bypass tiling
("gmem") and render directly to system memory ("sysmem").  It is UMA, using
mostly write combined memory but with the ability to map some buffers as cache
coherent with the CPU.

.. toctree::
   :glob:

   freedreno/hw/*

Hardware acronyms
^^^^^^^^^^^^^^^^^

.. glossary::

  Cluster
    A group of hardware registers, often with multiple copies to allow
    pipelining.  There is an M:N relationship between hardware blocks that do
    work and the clusters of registers for the state that hardware blocks use.

  CP
    Command Processor.  Reads the stream of state changes and draw commands
    generated by the driver.

  PFP
    Prefetch Parser.  Adreno 2xx-4xx CP component.

  ME
    Micro Engine.  Adreno 2xx-4xx CP component after PFP, handles most PM4 commands.

  SQE
    a6xx+ replacement for PFP/ME.  This is the microcontroller that runs the
    microcode (loaded from Linux) which actually processes the command stream
    and writes to the hardware registers.  See `afuc
    <https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/freedreno/afuc/README.rst>`__.

  ROQ
    DMA engine used by the SQE for reading memory, with some prefetch buffering.
    Mostly reads in the command stream, but also serves for
    ``CP_MEMCPY``/``CP_MEM_TO_REG`` and visibility stream reads.

  SP
    Shader Processor.  Unified, scalar shader engine.  One or more, depending on
    GPU and tier.

  TP
    Texture Processor.

  UCHE
    Unified L2 Cache.  32KB on A330, unclear how big now.

  CCU
    Color Cache Unit.

  VSC
    Visibility Stream Compressor.

  PVS
    Primitive Visibility Stream.

  FE
    Front End?  Index buffer and vertex attribute fetch cluster.  Includes PC,
    VFD, VPC.

  VFD
    Vertex Fetch and Decode.

  VPC
    Varying/Position Cache?  Hardware block that stores shaded vertex data for
    primitive assembly.

  HLSQ
    High Level Sequencer.  Manages state for the SPs, batches up PS invocations
    between primitives, is involved in preemption.

  PC_VS
    Cluster where varyings are read from VPC and assembled into primitives to
    feed GRAS.

  VS
    Vertex Shader.  Responsible for generating VS/GS/tess invocations.

  GRAS
    Rasterizer.  Responsible for generating PS invocations from primitives, also
    does LRZ.

  PS
    Pixel Shader.

  RB
    Render Backend.  Performs both early and late Z testing, blending, and
    attachment stores of output of the PS.

  GMEM
    Roughly 128KB-1MB of memory on the GPU (SKU-dependent), used to store
    attachments during tiled rendering.

  LRZ
    Low Resolution Z.  A low resolution area of the depth buffer that can be
    initialized during the binning pass to contain the worst-case (farthest) Z
    values in a block, and then used to early reject fragments during
    rasterization.

Cache hierarchy
^^^^^^^^^^^^^^^

The a6xx GPUs have two main caches: CCU and UCHE.

UCHE (Unified L2 Cache) is the cache behind the vertex fetch, VSC writes,
texture L1, LRZ, and storage image accesses (``ldib``/``stib``).  Misses and
flushes access system memory.

The CCU is the separate cache used by 2D blits and sysmem render target access
(and also for resolves to system memory when in GMEM mode).  Its memory comes
from a carveout of GMEM controlled by ``RB_CCU_CNTL``, with a varying amount
reserved based on whether we're in a render pass using GMEM for attachment
storage, or we're doing sysmem rendering.  Cache entries have the attachment
number and layer mixed into the cache tag in some way, likely so that a
fragment's access is spread through the cache even if the attachments are the
same size and alignments in address space.  This means that the cache must be
flushed and invalidated between memory being used for one attachment and another
(notably depth vs color, but also MRT color).

The Texture Processors (TP) additionally have a small L1 cache (1KB on A330,
unclear how big now) before accessing UCHE.  This cache is used for normal
sampling like ``sam`` and ``isam`` (and the compiler will route read-only
storage image accesses through it as well).  It is not coherent with UCHE (you
may get stale results when you ``sam`` after ``stib``), but it appears to be
flushed per draw or similar, because you don't need a manual invalidate between
draws storing to an image and draws sampling from a texture.

The command processor (CP) does not read from either of these caches, and
instead uses FIFOs in the ROQ to avoid stalls reading from system memory.

Draw states
^^^^^^^^^^^

The SQE is not a fast processor, and tiled rendering means that many draws
won't even be used in many bins.  So, since a5xx, state updates can be batched
up into "draw states" that point to a fragment of CP packets.  At draw time, if
the draw call is going to actually execute (some primitive is visible in the
current tile), the SQE goes through the ``GROUP_ID``\s and for any with an
update since the last time they were executed, it executes the corresponding
fragment.

Starting with a6xx, states can be tagged with whether they should be executed
at draw time for any of sysmem, binning, or tile rendering.  This allows a
single command stream to be generated which can be executed in any of the modes,
unlike pre-a6xx where we had to generate separate command lists for the binning
and rendering phases.

Note that this means that the generated draw state has to always update all of
the state you have chosen to pack into that ``GROUP_ID``, since any of your
previous state changes in a previous draw state command may have been skipped.
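
The deferred execution described above can be illustrated with a small toy
model.  This is only a sketch: ``ToySQE``, ``DrawStateGroup``, the mode flag
values, and the register names are all hypothetical and do not correspond to
real hardware encodings or Mesa code.  It shows why a fragment that only
updates part of a group's state can leave registers unwritten when an earlier
fragment for the same ``GROUP_ID`` was skipped.

.. code-block:: python

  from dataclasses import dataclass, field

  # Hypothetical mode flags; the real hardware encoding is not modeled here.
  SYSMEM, BINNING, GMEM = 1, 2, 4


  @dataclass
  class DrawStateGroup:
      fragment: dict       # register name -> value written by the fragment
      modes: int           # modes in which the fragment should execute
      dirty: bool = True   # updated since it was last executed


  @dataclass
  class ToySQE:
      groups: dict = field(default_factory=dict)  # GROUP_ID -> DrawStateGroup
      regs: dict = field(default_factory=dict)    # simulated register file

      def set_draw_state(self, group_id, fragment, modes):
          # Only records the new fragment; nothing is executed yet.
          self.groups[group_id] = DrawStateGroup(dict(fragment), modes)

      def draw(self, mode, visible):
          if not visible:
              return False  # draw skipped, pending fragments stay pending
          for group in self.groups.values():
              if group.dirty and group.modes & mode:
                  self.regs.update(group.fragment)
                  group.dirty = False
          return True


  sqe = ToySQE()
  sqe.set_draw_state(0, {"REG_A": 1, "REG_B": 1}, GMEM)
  sqe.draw(GMEM, visible=False)      # not visible in this tile: skipped
  sqe.set_draw_state(0, {"REG_A": 2}, GMEM)   # partial update of the group
  sqe.draw(GMEM, visible=True)
  print(sqe.regs)                    # REG_B was never written

In this model the second fragment replaces the first before it ever ran, so
``REG_B`` is lost.  Restating every register of the group in each fragment
avoids that.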

Pipelining (a6xx+)
^^^^^^^^^^^^^^^^^^

Most CP commands write to registers.  In a6xx+, the registers are located in
clusters corresponding to the stage of the pipeline they are used from (see
``enum tu_stage`` for a list).  To pipeline state updates and drawing, registers
generally have two copies ("contexts") in their cluster, so previous draws can
be working on the previous set of register state while the next draw's state is
being set up.  You can find what registers go into which clusters by looking at
:command:`crashdec` output in the ``regs-name: CP_MEMPOOL`` section.

As the SQE processes register writes in the command stream, it sends them into a
per-cluster queue stored in ``CP_MEMPOOL``.  This allows the pipeline stages to
process their stream of register updates and events independently of each other
(so even with just 2 contexts in a stage, earlier stages can proceed on to later
draws before later stages have caught up).

Each cluster has a per-context bit indicating that the context is done/free.
Register writes will stall on the context being done.

During a 3D draw command, the SQE generates several internal events that flow
through the pipeline:

- ``CP_EVENT_START`` clears the done bit for the context when written to the
  cluster.
- ``PC_EVENT_CMD``/``PC_DRAW_CMD``/``HLSQ_EVENT_CMD``/``HLSQ_DRAW_CMD`` kick off
  the actual event/drawing.
- ``CONTEXT_DONE`` event completes after the event/draw is complete and sets the
  done flag.
- ``CP_EVENT_END`` waits for the done flag on the next context, then copies all
  the registers that were dirtied in this context to that one.
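
The event sequence above can be mimicked with a toy model of one cluster.
This is a rough sketch based only on the description in this section:
``ToyCluster`` and its method names are hypothetical, and a stall is modeled
as an exception rather than as actual waiting.

.. code-block:: python

  class ToyCluster:
      """Banked register contexts with one done/free bit per context."""

      def __init__(self, num_contexts=2):
          self.contexts = [dict() for _ in range(num_contexts)]
          self.done = [True] * num_contexts
          self.dirty = set()   # registers written since the last rollover
          self.cur = 0

      def write_reg(self, name, value):
          # A register write has to wait until the target context is free.
          if not self.done[self.cur]:
              raise RuntimeError("stall: context still in use")
          self.contexts[self.cur][name] = value
          self.dirty.add(name)

      def event_start(self):
          # CP_EVENT_START: the current context is now in use by a draw.
          self.done[self.cur] = False

      def context_done(self, ctx):
          # CONTEXT_DONE: the draw using this context has completed.
          self.done[ctx] = True

      def event_end(self):
          # CP_EVENT_END: wait for the next context, then carry over the
          # registers dirtied in the current one.
          nxt = (self.cur + 1) % len(self.contexts)
          if not self.done[nxt]:
              raise RuntimeError("stall: next context still in use")
          for name in self.dirty:
              self.contexts[nxt][name] = self.contexts[self.cur][name]
          self.dirty.clear()
          self.cur = nxt

With two contexts, a second draw can be set up while the first is in flight,
but a third rollover raises until ``context_done()`` has been called for the
oldest context.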

The 2D blit engine has its own ``CP_2D_EVENT_START``, ``CP_2D_EVENT_END``,
``CONTEXT_DONE_2D``, so 2D and 3D register contexts can do separate context
rollover.

Because the clusters proceed independently of each other even across draws, if
you need to synchronize an earlier cluster to the output of a later one, then
you will need to ``CP_WAIT_FOR_IDLE`` after flushing and invalidating any
necessary caches.

Also, note that some registers are not banked at all, and will require a
``CP_WAIT_FOR_IDLE`` for any previous usage of the register to complete.

In a2xx-a4xx, there weren't per-stage clusters, and instead there were two
register banks that were flipped between per draw.

Bindless/Bindful Descriptors (a6xx+)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Starting with a6xx, cat5 (texture) and cat6 (image/SSBO/UBO) instructions are
extended to support bindless descriptors.

In the old bindful model, descriptors are separate for textures, samplers,
UBOs, and IBOs (combined descriptor for images and SSBOs), with separate
registers for the memory containing the array of descriptors, and/or different
``STATE_TYPE`` and ``STATE_BLOCK`` for ``CP_LOAD_STATE``/``_FRAG``/``_GEOM``
to pre-load the descriptors into cache.

- textures - per-shader-stage
   - registers: ``SP_xS_TEX_CONST``/``SP_xS_TEX_COUNT``
   - state-type: ``ST6_CONSTANTS``
   - state-block: ``SB6_xS_TEX``
- samplers - per-shader-stage
   - registers: ``SP_xS_TEX_SAMP``
   - state-type: ``ST6_SHADER``
   - state-block: ``SB6_xS_TEX``
- UBOs - per-shader-stage
   - registers: none
   - state-type: ``ST6_UBO``
   - state-block: ``SB6_xS_SHADER``
- IBOs - global across 3d shader stages, separate for compute shader
   - registers: ``SP_IBO``/``SP_IBO_COUNT`` or ``SP_CS_IBO``/``SP_CS_IBO_COUNT``
   - state-type: ``ST6_SHADER``
   - state-block: ``ST6_IBO`` or ``ST6_CS_IBO`` for compute shaders
   - Note, unlike per-shader-stage descriptors, ``CP_LOAD_STATE6`` is used,
     as opposed to ``CP_LOAD_STATE6_GEOM`` or ``CP_LOAD_STATE6_FRAG``
     depending on shader stage.

.. note::
   For the per-shader-stage registers and state-blocks the ``xS`` notation
   refers to per-shader-stage names, e.g. ``SP_FS_TEX_CONST`` or ``SB6_DS_TEX``.

Textures and IBOs (images) use *basically* the same 64-byte descriptor format
with some exceptions (for example, for IBOs cubemaps are handled as 2d arrays).
SSBOs are just untyped buffers, but otherwise use the same descriptors and
instructions as images.  Samplers use a 16-byte descriptor, and UBOs use an
8-byte descriptor which packs the size in the upper 15 bits of the UBO address.
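
As an illustration of the 8-byte UBO descriptor layout, the sketch below packs
a size into the upper 15 bits of a 64-bit word and the address into the
remaining 49 bits.  The helper names are hypothetical, the unit of the size
field is not stated here and is treated as an opaque integer, and the
little-endian byte order is an assumption.

.. code-block:: python

  import struct

  SIZE_BITS = 15
  ADDR_BITS = 64 - SIZE_BITS          # 49 bits remain for the address
  ADDR_MASK = (1 << ADDR_BITS) - 1
  SIZE_MASK = (1 << SIZE_BITS) - 1


  def pack_ubo_descriptor(addr, size):
      """Return the 8 descriptor bytes (assumed little-endian)."""
      if addr & ~ADDR_MASK:
          raise ValueError("address does not fit in the lower 49 bits")
      if size & ~SIZE_MASK:
          raise ValueError("size does not fit in the upper 15 bits")
      return struct.pack("<Q", (size << ADDR_BITS) | addr)


  def unpack_ubo_descriptor(data):
      """Return (addr, size) from 8 descriptor bytes."""
      (word,) = struct.unpack("<Q", data)
      return word & ADDR_MASK, word >> ADDR_BITS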

In the bindless model, descriptors are split into 5 descriptor sets, which are
global across shader stages (but as with bindful IBO descriptors, separate for
3d stages vs compute stage).  Each HW descriptor set is an array of descriptors
of configurable size (each descriptor set can be configured for a descriptor
pitch of 8 bytes or 64 bytes).  Each descriptor can be of arbitrary format (i.e.
UBOs/IBOs/textures/samplers interleaved); its interpretation by the HW is
determined by the instruction that references the descriptor.  Each descriptor
set can contain at least 2^16 descriptors.

The HW is configured with the base address of the descriptor set via an array
of "BINDLESS_BASE" registers, i.e. ``SP_BINDLESS_BASE[n]``/``HLSQ_BINDLESS_BASE[n]``
for 3d shader stages, or ``SP_CS_BINDLESS_BASE[n]``/``HLSQ_CS_BINDLESS_BASE[n]``
for compute shaders, with the descriptor pitch encoded in the low bits.
Which of the descriptor sets is referenced is encoded via three bits in the
instruction.  The address of the descriptor is calculated as::

   descriptor_addr = (BINDLESS_BASE[n] & ~0x3) +
                     (idx * 4 * (2 << (BINDLESS_BASE[n] & 0x3)))
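
A worked example of this address calculation, as a small sketch.  It assumes
the low two bits of the register select the pitch, as the prose describes.
``descriptor_addr`` is a hypothetical helper, and the base addresses are made
up.  Note that the shift amount has to be masked before shifting, since ``<<``
binds tighter than ``&``.

.. code-block:: python

  def descriptor_addr(bindless_base, idx):
      """Byte address of descriptor ``idx`` in one descriptor set."""
      pitch_code = bindless_base & 0x3      # low bits encode the pitch
      pitch = 4 * (2 << pitch_code)         # 0 -> 8 bytes, 3 -> 64 bytes
      return (bindless_base & ~0x3) + idx * pitch


  # 8-byte pitch: descriptor 3 sits 24 bytes past the base.
  print(hex(descriptor_addr(0x1000 | 0x0, 3)))   # 0x1018
  # 64-byte pitch: descriptor 3 sits 192 bytes past the base.
  print(hex(descriptor_addr(0x2000 | 0x3, 3)))   # 0x20c0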

.. note::
   Turnip reserves one descriptor set for internal use and exposes the other
   four for the application via the Vulkan API.

Software Architecture
---------------------

Freedreno and Turnip use a shared core for shader compiler, image layout, and
register and command stream definitions.  They implement separate state
management and command stream generation.

.. toctree::
   :glob:

   freedreno/*

GPU devcoredump
^^^^^^^^^^^^^^^

A kernel message from DRM of "gpu fault" can mean any sort of error reported by
the GPU (including its internal hang detection).  If a fault in GPU address
space happened, you should expect to find a message from the iommu, with the
faulting address and a hardware unit involved:

.. code-block:: text

  *** gpu fault: ttbr0=000000001c941000 iova=000000010066a000 dir=READ type=TRANSLATION source=TP|VFD (0,0,0,1)
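
When scanning long kernel logs, it can help to pull these fields out
programmatically.  The following is a minimal sketch that matches only the
shape of the example line above; the exact message format may differ between
kernel versions, and ``parse_gpu_fault`` is a hypothetical helper, not part of
Mesa or the kernel.

.. code-block:: python

  import re

  FAULT_RE = re.compile(
      r"gpu fault: ttbr0=(?P<ttbr0>[0-9a-f]+) iova=(?P<iova>[0-9a-f]+) "
      r"dir=(?P<dir>\w+) type=(?P<type>\w+) source=(?P<source>\S+)"
  )


  def parse_gpu_fault(line):
      """Return the fault fields as a dict, or None if the line differs."""
      m = FAULT_RE.search(line)
      if m is None:
          return None
      return {
          "ttbr0": int(m["ttbr0"], 16),
          "iova": int(m["iova"], 16),          # faulting GPU address
          "dir": m["dir"],
          "type": m["type"],
          "sources": m["source"].split("|"),   # hardware units involved
      }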

On a GPU fault or hang, a GPU core dump is taken by the DRM driver and saved to
``/sys/devices/virtual/devcoredump/**/data``.  You can ``cp`` that file to a
:file:`crash.devcore` to save it; otherwise the kernel will expire it
eventually.  Echo 1 to the file to free the core early, as another core won't be
taken until then.
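
The save-then-free steps could be scripted roughly as follows.  This is a
sketch under assumptions: ``save_devcoredump`` is a hypothetical helper, it
assumes the ``data`` node sits exactly one directory below the devcoredump
root (the way a default shell would expand the path above), and it usually
needs to run as root.

.. code-block:: python

  import glob
  import os
  import shutil

  DEVCOREDUMP_ROOT = "/sys/devices/virtual/devcoredump"


  def save_devcoredump(dest="crash.devcore", root=DEVCOREDUMP_ROOT, free=True):
      """Copy the first pending core dump to ``dest``.

      Returns the source path, or None if no dump is pending.
      """
      # Assumption: dump nodes look like <root>/<device>/data.
      candidates = sorted(glob.glob(os.path.join(root, "*", "data")))
      if not candidates:
          return None
      src = candidates[0]
      with open(src, "rb") as fin, open(dest, "wb") as fout:
          shutil.copyfileobj(fin, fout)
      if free:
          # Writing 1 frees the core so that another one can be taken.
          with open(src, "w") as f:
              f.write("1")
      return src

The ``root`` parameter exists only so that the function can be pointed at a
scratch directory for testing.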

Once you have your core file, you can use :command:`crashdec -f crash.devcore`
to decode it.  The output will have ``ESTIMATED CRASH LOCATION`` where we
estimate the CP to have stopped.  Note that it is expected that this will be
some distance past whatever state triggered the fault, given GPU pipelining, and
will often be at some ``CP_REG_TO_MEM`` (which waits on previous WFIs) or
``CP_WAIT_FOR_ME`` (which waits for all register writes to land) or similar
event.  You can try running the workload with ``TU_DEBUG=flushall`` or
``FD_MESA_DEBUG=flush`` to try to close in on the failing commands.

You can also find what commands were queued up to each cluster in the
``regs-name: CP_MEMPOOL`` section.

If ``ESTIMATED CRASH LOCATION`` doesn't exist, you can look for ``CP_SQE_STAT``
instead, though this is a last resort and likely won't be helpful.

.. code-block:: text

  indexed-registers:
    - regs-name: CP_SQE_STAT
      dwords: 51
  	 PC: 00d7                                <-------------
  	PKT: CP_LOAD_STATE6_FRAG
  	$01: 70348003		$11: 00000000
  	$02: 20000000		$12: 00000022

The ``PC`` value is an instruction address in the current firmware.
You would need to disassemble the firmware
(:file:`/lib/firmware/qcom/aXXX_sqe.fw`) via:

.. code-block:: sh

  afuc-disasm -v a650_sqe.fw > a650_sqe.fw.disasm

Now search for the ``PC`` value in the disassembly, e.g.:

.. code-block:: text

  l018:	00d1: 08dd0001  add $addr, $06, 0x0001
       	00d2: 981ff806  mov $data, $data
       	00d3: 8a080001  mov $08, 0x0001 << 16
       	00d4: 3108ffff  or $08, $08, 0xffff
       	00d5: 9be8f805  and $data, $data, $08
       	00d6: 9806e806  mov $addr, $06
       	00d7: 9803f806  mov $data, $03           <------------- HERE
       	00d8: d8000000  waitin
       	00d9: 981f0806  mov $01, $data
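
The lookup can be automated with a few lines of Python.  This sketch assumes
the disassembly has the layout shown above, with each instruction address
followed by a colon.  ``find_pc`` is a hypothetical helper and not part of the
afuc tools.

.. code-block:: python

  import re


  def find_pc(disasm_lines, pc, context=2):
      """Return the line containing address ``pc`` plus surrounding lines."""
      pattern = re.compile(r"(?:^|\s)" + re.escape(pc.lower()) + r":\s")
      lines = list(disasm_lines)
      for i, line in enumerate(lines):
          if pattern.search(line):
              start = max(0, i - context)
              return lines[start:i + context + 1]
      return []

For example, ``find_pc(open("a650_sqe.fw.disasm"), "00d7")`` would return the
matching instruction with two lines of context on each side, or an empty list
if the address is not present.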

Command Stream Capture
^^^^^^^^^^^^^^^^^^^^^^

During Mesa development, it's often useful to look at the command streams we
send to the kernel.  We have an interface for the kernel to capture all
submitted command streams:

.. code-block:: sh

  cat /sys/kernel/debug/dri/0/rd > cmdstream &

By default, command stream capture does not capture texture/vertex/etc. data.
You can enable capturing all the BOs with:

.. code-block:: sh

  echo Y > /sys/module/msm/parameters/rd_full

Note that, since all command streams get captured, it is easy to run the system
out of memory doing this, so you probably don't want to enable it during play of
a heavyweight game.  Instead, to capture a command stream within a game, you
probably want to cause a crash in the GPU during a frame of interest so that a
single GPU core dump is generated.  Emitting ``0xdeadbeef`` in the CS should be
enough to cause a fault.

The ``fd_rd_output`` facilities provide support for generating the command stream
capture from inside Mesa. Different ``FD_RD_DUMP`` options are available:

- ``enable`` simply enables dumping the command stream on each submit for a
  given logical device. When a more advanced option is specified, ``enable`` is
  implied.
- ``combine`` will combine all dumps into a single file instead of writing the
  dump for each submit into a standalone file.
- ``full`` will dump every buffer object, which is necessary for replays of
  command streams (see below).
- ``trigger`` will establish a trigger file through which dumps can be better
  controlled. Writing a positive integer value into the file will enable dumping
  of that many subsequent submits. Writing -1 will enable dumping of submits
  until disabled. Writing 0 (or any other value) will disable dumps.

Output dump files and the trigger file (when enabled) are hard-coded to be placed
under ``/tmp``, or ``/data/local/tmp`` under Android. ``FD_RD_DUMP_TESTNAME`` can
be used to specify a more descriptive prefix for the output or trigger files.

The functionality is generic to any Freedreno-based backend, but is currently only
integrated in the MSM backend of Turnip. Using the existing ``TU_DEBUG=rd``
option will translate to ``FD_RD_DUMP=enable``.
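
As a sketch of how these options might be combined: the lines below assume
that ``FD_RD_DUMP`` takes a comma-separated list like other Mesa debug
variables. ``./app``, the ``mytest`` prefix and ``TRIGGER_FILE`` are
placeholders; in particular, the exact name of the trigger file is not
specified here and has to be looked up under ``/tmp``.

.. code-block:: sh

  # Dump every submit, with all buffer objects, into a single combined file.
  FD_RD_DUMP=combine,full FD_RD_DUMP_TESTNAME=mytest ./app

  # Dump on demand: start with the trigger option, then request the next
  # five submits.  TRIGGER_FILE is a placeholder for the trigger file path.
  FD_RD_DUMP=trigger FD_RD_DUMP_TESTNAME=mytest ./app &
  echo 5 > "$TRIGGER_FILE"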

Capturing Hang RD
+++++++++++++++++

A devcoredump file doesn't contain all submitted command streams, only the hanging one.
Additionally, it is geared towards analyzing the GPU state at the moment of the crash.

Alternatively, it's possible to obtain the whole submission with all command
streams via ``/sys/kernel/debug/dri/0/hangrd``:

.. code-block:: sh

  # Do the cat _before_ the expected hang
  sudo cat /sys/kernel/debug/dri/0/hangrd > logfile.rd

The format of hangrd is the same as in ordinary command stream capture.
``rd_full`` also has the same effect on it.
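
Since the reader has to be running before the hang happens, it can help to wrap
the application launch. The following is a minimal sketch, not part of Mesa:
``capture_during`` is a hypothetical helper that starts reading a debugfs node
(``rd`` or ``hangrd``) in the background, runs a command, and stops the reader
afterwards. It assumes the node is readable by the current user (typically
root) and that one second is enough for the reader to drain pending data.

.. code-block:: sh

  # Usage: capture_during NODE OUTPUT COMMAND [ARGS...]
  capture_during() {
    node=$1; out=$2; shift 2
    cat "$node" > "$out" &      # start the reader before the workload
    reader=$!
    "$@"                        # run the workload that is expected to hang
    status=$?
    sleep 1                     # assumed grace period to drain the node
    kill "$reader" 2>/dev/null
    wait "$reader" 2>/dev/null
    return "$status"            # propagate the workload's exit status
  }

For example, ``capture_during /sys/kernel/debug/dri/0/hangrd logfile.rd ./app``
would leave the capture in ``logfile.rd`` once ``./app`` exits.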

Replaying Command Stream
^^^^^^^^^^^^^^^^^^^^^^^^

The ``replay`` tool allows capturing and replaying ``rd`` to reproduce GPU faults.
It is especially useful for transient GPU issues since it has a much higher chance
of reproducing them.

Dumping rendering results or even just memory is currently unsupported.

- Replaying command streams requires a kernel with ``MSM_INFO_SET_IOVA`` support.
- Requires the ``rd`` capture to have full snapshots of the memory (``rd_full`` is enabled).

Replaying is done via the ``replay`` tool:

.. code-block:: sh

  ./replay test_replay.rd

More examples:

.. code-block:: sh

  ./replay --first=start_submit_n --last=last_submit_n test_replay.rd

.. code-block:: sh

  ./replay --override=0 test_replay.rd

Editing Command Stream (a6xx+)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While replaying a fault is useful in itself, modifying the capture to
understand what causes the fault could be even more useful.

``rddecompiler`` decompiles a single cmdstream from ``rd`` into compilable C source.
Given the address space bounds, the generated program creates a new ``rd`` which
can be used to override a cmdstream with ``replay``. The generated ``rd`` is not
replayable on its own and depends on buffers provided by the source ``rd``.

The C source can be compiled by putting it into :file:`src/freedreno/decode/generate-rd.cc`.

The workflow would look like this:

1. Find the number of the cmdstream you want to edit;
2. Decompile it:

.. code-block:: sh

  ./rddecompiler -s %cmd_stream_n% example.rd > src/freedreno/decode/generate-rd.cc

3. Edit the command stream;
4. Compile and deploy freedreno tools;
5. Plug the generator into cmdstream replay:

.. code-block:: sh

  ./replay --override=%cmd_stream_n%

6. Repeat 3-5.

GPU Hang Debugging
^^^^^^^^^^^^^^^^^^

This is not a guide on how to do it, but mostly an enumeration of methods.

Useful ``TU_DEBUG`` (for Turnip) options to narrow down the hang cause:

``sysmem``, ``gmem``, ``nobin``, ``forcebin``, ``noubwc``, ``nolrz``, ``flushall``, ``syncdraw``, ``rast_order``

Useful ``FD_MESA_DEBUG`` (for Freedreno) options:

``sysmem``, ``gmem``, ``nobin``, ``noubwc``, ``nolrz``, ``notile``, ``dclear``, ``ddraw``, ``flush``, ``inorder``, ``noblit``

Useful ``IR3_SHADER_DEBUG`` options:

``nouboopt``, ``spillall``, ``nopreamble``, ``nofp16``
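
One way to use the Turnip options listed above is to try them one at a time
and note which ones make the failure disappear. The following is a minimal
sketch, not an existing tool: ``sweep_tu_debug`` is a hypothetical helper, and
it assumes the application exits with a non-zero status when the hang occurs
(for example because ``MESA_VK_ABORT_ON_DEVICE_LOSS=1`` makes it abort).

.. code-block:: sh

  # Usage: sweep_tu_debug COMMAND [ARGS...]
  sweep_tu_debug() {
    for opt in sysmem gmem nobin forcebin noubwc nolrz flushall syncdraw rast_order; do
      # Run the command once per option and report whether it survived.
      if TU_DEBUG=$opt MESA_VK_ABORT_ON_DEVICE_LOSS=1 "$@" >/dev/null 2>&1; then
        echo "$opt: ok"
      else
        echo "$opt: failed"
      fi
    done
  }

An option reported as ``ok`` while the others fail points at the hardware
feature it disables or forces; the same loop can be adapted to
``FD_MESA_DEBUG`` or ``IR3_SHADER_DEBUG`` by swapping the variable name and the
option list.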

Use Graphics Flight Recorder to narrow down the place which hangs;
use our own breadcrumbs implementation in case of unrecoverable hangs.

In case of faults, use RenderDoc to find the problematic command. If it's
a draw call, edit the shader in RenderDoc to find out whether the culprit is
the shader. If yes, bisect it.

If editing the shader messes up the assembly too much and the issue becomes
unreproducible, try editing the assembly itself via ``IR3_SHADER_OVERRIDE_PATH``.

If the fault or hang is transient, try capturing an ``rd`` and replaying it. If
the issue is reproduced, bisect the GPU packets until the culprit is found.

Do the same if the culprit is not a shader.

The hang recovery mechanism in the kernel is not perfect. In case of unrecoverable
hangs, check whether the kernel is up to date and look for unmerged patches
which could improve the recovery.

GPU Breadcrumbs
+++++++++++++++

Breadcrumbs described below are available only in Turnip.

Freedreno has simpler breadcrumbs: in debug builds it writes breadcrumbs
into ``CP_SCRATCH_REG[6]`` and per-tile breadcrumbs into ``CP_SCRATCH_REG[7]``,
so that they are available in the devcoredump. TODO: generalize Turnip's
breadcrumbs implementation.

This is a simple implementation of breadcrumbs tracking of GPU progress,
intended to be a last resort when debugging unrecoverable hangs.
For best results, use Vulkan traces to have a predictable place of hang.

For ordinary hangs, use GFR ("Graphics Flight Recorder") as a more
user-friendly solution.

Our breadcrumbs implementation aims to handle cases where nothing can be done
after the hang. In-driver breadcrumbs also allow more precise tracking since
we could target a single GPU packet.

While breadcrumbs support gmem, try to reproduce the hang in sysmem mode
because it would require far fewer breadcrumb writes and syncs.

Breadcrumbs settings:

.. code-block:: sh

  TU_BREADCRUMBS=%IP%:%PORT%,break=%BREAKPOINT%:%BREAKPOINT_HITS%

``BREAKPOINT``
  The breadcrumb starting from which we require explicit ack.
``BREAKPOINT_HITS``
  How many times the breakpoint should be reached for the break to occur.
  Necessary for gmem mode and re-usable cmdbuffers, in both of which
  the same cmdstream could be executed several times.

A typical workflow would be:

- Start listening for breadcrumbs on a remote host:

.. code-block:: sh

   nc -lvup $PORT | stdbuf -o0 xxd -pc -c 4 | awk -Wposix '{printf("%u:%u\n", "0x" $0, a[$0]++)}'

- Start capturing the command stream;
- Replay the hanging trace with:

.. code-block:: sh

   TU_BREADCRUMBS=$IP:$PORT,break=-1:0

- Increase the hangcheck period:

.. code-block:: sh

   echo -n 60000 > /sys/kernel/debug/dri/0/hangcheck_period_ms

- After a GPU hang, note the last breadcrumb and relaunch the trace with:

.. code-block:: sh

   TU_BREADCRUMBS=%IP%:%PORT%,break=%LAST_BREADCRUMB%:%HITS%

- After the breakpoint is reached, each breadcrumb requires an
  explicit ack from the user. This way it's possible to find
  the last packet which didn't hang.

- Find the packet in the decoded cmdstream.
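
The listener in the workflow above prints one ``breadcrumb:hits`` pair per
received breadcrumb, where ``hits`` is the number of times that breadcrumb
value was seen before. These are the two numbers to pass as
``%LAST_BREADCRUMB%:%HITS%``. The same decoding can be sketched with POSIX
tools only, for example to post-process raw breadcrumb data saved to a file.
This is an illustrative equivalent rather than the driver's own tooling:
``decode_breadcrumbs`` is a hypothetical helper, and it assumes each
breadcrumb is a 4-byte value read most significant byte first, as the listener
pipeline interprets it.

.. code-block:: sh

  # Usage: decode_breadcrumbs < raw_breadcrumbs.bin
  decode_breadcrumbs() {
    # Hex-dump stdin, regroup it into 4-byte (8 hex digit) words, convert
    # each word to decimal, then append how often it was seen before.
    od -An -v -tx1 | tr -d ' \n' | fold -w 8 |
      while read -r word || [ -n "$word" ]; do
        printf '%u\n' "0x$word"
      done |
      awk '{ printf("%s:%u\n", $0, seen[$0]++) }'
  }

The last line printed before the hang identifies the breadcrumb and hit count
to use as the breakpoint on the next run.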

Debugging random failures
^^^^^^^^^^^^^^^^^^^^^^^^^

In most cases random GPU faults and rendering artifacts are caused by some kind
of undefined behavior that falls under the following categories:

- Usage of a stale reg value;
- Usage of stale memory (e.g. expecting it to be zeroed when it is not);
- Lack of proper synchronization.

Finding instances of stale reg reads
++++++++++++++++++++++++++++++++++++

Turnip has a debug option to stomp the registers with invalid values to catch
the cases where stale data is read.

.. code-block:: sh

  MESA_VK_ABORT_ON_DEVICE_LOSS=1 \
  TU_DEBUG_STALE_REGS_RANGE=0x00000c00,0x0000be01 \
  TU_DEBUG_STALE_REGS_FLAGS=cmdbuf,renderpass \
  ./app

.. envvar:: TU_DEBUG_STALE_REGS_RANGE

  The reg range in which registers would be stomped. Add ``inverse`` to the
  flags in order for this range to specify which registers NOT to stomp.

.. envvar:: TU_DEBUG_STALE_REGS_FLAGS

  ``cmdbuf``
    stomp registers at the start of each command buffer.
  ``renderpass``
    stomp registers before each render pass.
  ``inverse``
    changes the meaning of ``TU_DEBUG_STALE_REGS_RANGE`` to
    "regs that should NOT be stomped".

The best way to pinpoint the reg which causes a failure is to bisect the regs
range. When a failure is caused by a combination of several registers,
the ``inverse`` flag may be set to find the reg which prevents the failure.
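
The bisection of the register range can be sketched as a binary search. The
helper below is hypothetical and rests on several assumptions: the failure is
triggered by a single register, the command exits with a non-zero status
exactly when the stomped range contains that register, and both bounds of
``TU_DEBUG_STALE_REGS_RANGE`` are inclusive.

.. code-block:: sh

  # Usage: bisect_stale_regs FIRST_REG LAST_REG COMMAND [ARGS...]
  bisect_stale_regs() {
    lo=$(($1)); hi=$(($2)); shift 2      # accept hex such as 0x00000c00
    while [ "$lo" -lt "$hi" ]; do
      mid=$(( (lo + hi) / 2 ))
      range=$(printf '0x%08x,0x%08x' "$lo" "$mid")
      # Stomp only the lower half of the remaining range.
      if MESA_VK_ABORT_ON_DEVICE_LOSS=1 \
         TU_DEBUG_STALE_REGS_RANGE=$range \
         TU_DEBUG_STALE_REGS_FLAGS=cmdbuf,renderpass \
         "$@" >/dev/null 2>&1; then
        lo=$((mid + 1))                  # lower half is clean
      else
        hi=$mid                          # culprit is in the lower half
      fi
    done
    printf '0x%08x\n' "$lo"              # the single remaining register
  }

For example, ``bisect_stale_regs 0x00000c00 0x0000be01 ./app`` would narrow
the range from the example above down to one register offset, which can then
be looked up in the register database.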