Freedreno
=========

Freedreno is a GLES and GL driver for Adreno 2xx-6xx GPUs. It implements up to
OpenGL ES 3.2 and desktop OpenGL 4.5.

See the `Freedreno Wiki
<https://gitlab.freedesktop.org/freedreno/freedreno/-/wikis/home>`__ for more
details.

Turnip
------

Turnip is a Vulkan 1.3 driver for Adreno 6xx GPUs.

The current set of specific chip versions supported can be found in
:file:`src/freedreno/common/freedreno_devices.py`. The current set of features
supported can be found rendered at `Mesa Matrix <https://mesamatrix.net/>`__.
There are no plans to port to a5xx or earlier GPUs.

Hardware architecture
---------------------

Adreno is a mostly tile-mode renderer, but with the option to bypass tiling
("gmem") and render directly to system memory ("sysmem").
It is UMA, using
mostly write combined memory but with the ability to map some buffers as cache
coherent with the CPU.

.. toctree::
   :glob:

   freedreno/hw/*

Hardware acronyms
^^^^^^^^^^^^^^^^^

.. glossary::

   Cluster
      A group of hardware registers, often with multiple copies to allow
      pipelining. There is an M:N relationship between hardware blocks that do
      work and the clusters of registers for the state that hardware blocks use.

   CP
      Command Processor. Reads the stream of state changes and draw commands
      generated by the driver.

   PFP
      Prefetch Parser. Adreno 2xx-4xx CP component.

   ME
      Micro Engine. Adreno 2xx-4xx CP component after PFP, handles most PM4 commands.

   SQE
      a6xx+ replacement for PFP/ME. This is the microcontroller that runs the
      microcode (loaded from Linux) which actually processes the command stream
      and writes to the hardware registers. See `afuc
      <https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/freedreno/afuc/README.rst>`__.

   ROQ
      DMA engine used by the SQE for reading memory, with some prefetch buffering.
      Mostly reads in the command stream, but also serves for
      ``CP_MEMCPY``/``CP_MEM_TO_REG`` and visibility stream reads.

   SP
      Shader Processor. Unified, scalar shader engine. One or more, depending on
      GPU and tier.

   TP
      Texture Processor.

   UCHE
      Unified L2 Cache. 32KB on A330, unclear how big now.

   CCU
      Color Cache Unit.

   VSC
      Visibility Stream Compressor

   PVS
      Primitive Visibility Stream

   FE
      Front End? Index buffer and vertex attribute fetch cluster. Includes PC,
      VFD, VPC.

   VFD
      Vertex Fetch and Decode

   VPC
      Varying/Position Cache? Hardware block that stores shaded vertex data for
      primitive assembly.

   HLSQ
      High Level Sequencer. Manages state for the SPs, batches up PS invocations
      between primitives, is involved in preemption.

   PC_VS
      Cluster where varyings are read from VPC and assembled into primitives to
      feed GRAS.

   VS
      Vertex Shader. Responsible for generating VS/GS/tess invocations

   GRAS
      Rasterizer.
      Responsible for generating PS invocations from primitives, also
      does LRZ

   PS
      Pixel Shader.

   RB
      Render Backend. Performs both early and late Z testing, blending, and
      attachment stores of output of the PS.

   GMEM
      Roughly 128KB-1MB of memory on the GPU (SKU-dependent), used to store
      attachments during tiled rendering

   LRZ
      Low Resolution Z. A low resolution area of the depth buffer that can be
      initialized during the binning pass to contain the worst-case (farthest) Z
      values in a block, and then used to early reject fragments during
      rasterization.

Cache hierarchy
^^^^^^^^^^^^^^^

The a6xx GPUs have two main caches: CCU and UCHE.

UCHE (Unified L2 Cache) is the cache behind the vertex fetch, VSC writes,
texture L1, LRZ, and storage image accesses (``ldib``/``stib``).
Misses and
flushes access system memory.

The CCU is the separate cache used by 2D blits and sysmem render target access
(and also for resolves to system memory when in GMEM mode). Its memory comes
from a carveout of GMEM controlled by ``RB_CCU_CNTL``, with a varying amount
reserved based on whether we're in a render pass using GMEM for attachment
storage, or we're doing sysmem rendering. Cache entries have the attachment
number and layer mixed into the cache tag in some way, likely so that a
fragment's access is spread through the cache even if the attachments are the
same size and alignments in address space. This means that the cache must be
flushed and invalidated between memory being used for one attachment and another
(notably depth vs color, but also MRT color).

The Texture Processors (TP) additionally have a small L1 cache (1KB on A330,
unclear how big now) before accessing UCHE. This cache is used for normal
sampling like ``sam`` and ``isam`` (and the compiler will make read-only
storage image access through it as well).
It is not coherent with UCHE (may get
stale results when you ``sam`` after ``stib``), but it must get flushed per draw
(or similar), because you don't need a manual invalidate between draws storing
to an image and draws sampling from a texture.

The command processor (CP) does not read from either of these caches, and
instead uses FIFOs in the ROQ to avoid stalls reading from system memory.

Draw states
^^^^^^^^^^^

Since the SQE is not a fast processor, and tiled rendering means that many draws
won't even be used in many bins, state updates can, since a5xx, be batched up
into "draw states" that point to a fragment of CP packets. At draw time, if the
draw call is going to actually execute (some primitive is visible in the current
tile), the SQE goes through the ``GROUP_ID``\s and for any with an update since
the last time they were executed, it executes the corresponding fragment.

Starting with a6xx, states can be tagged with whether they should be executed
at draw time for any of sysmem, binning, or tile rendering.
This allows a
single command stream to be generated which can be executed in any of the modes,
unlike pre-a6xx where we had to generate separate command lists for the binning
and rendering phases.

Note that this means that the generated draw state has to always update all of
the state you have chosen to pack into that ``GROUP_ID``, since any of your
previous state changes in a previous draw state command may have been skipped.

Pipelining (a6xx+)
^^^^^^^^^^^^^^^^^^

Most CP commands write to registers. In a6xx+, the registers are located in
clusters corresponding to the stage of the pipeline they are used from (see
``enum tu_stage`` for a list). To pipeline state updates and drawing, registers
generally have two copies ("contexts") in their cluster, so previous draws can
be working on the previous set of register state while the next draw's state is
being set up. You can find what registers go into which clusters by looking at
:command:`crashdec` output in the ``regs-name: CP_MEMPOOL`` section.

As SQE processes register writes in the command stream, it sends them into a
per-cluster queue stored in ``CP_MEMPOOL``. This allows the pipeline stages to
process their stream of register updates and events independent of each other
(so even with just 2 contexts in a stage, earlier stages can proceed on to later
draws before later stages have caught up).

Each cluster has a per-context bit indicating that the context is done/free.
Register writes will stall on the context being done.

During a 3D draw command, SQE generates several internal events that flow
through the pipeline:

- ``CP_EVENT_START`` clears the done bit for the context when written to the
  cluster
- ``PC_EVENT_CMD``/``PC_DRAW_CMD``/``HLSQ_EVENT_CMD``/``HLSQ_DRAW_CMD`` kick off
  the actual event/drawing.
- ``CONTEXT_DONE`` event completes after the event/draw is complete and sets the
  done flag.
- ``CP_EVENT_END`` waits for the done flag on the next context, then copies all
  the registers that were dirtied in this context to that one.
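
The rollover sequence in the list above can be illustrated with a toy model.
This is only a sketch for building intuition, not a description of the actual
hardware or firmware: the ``ToyCluster`` class and its method names are
hypothetical, and where real hardware would stall or wait, the model asserts or
returns ``False`` instead.

.. code-block:: python

   class ToyCluster:
       """Toy model of one register cluster with banked contexts (hypothetical)."""

       def __init__(self, num_contexts=2):
           self.regs = [{} for _ in range(num_contexts)]
           self.done = [True] * num_contexts  # per-context done/free bit
           self.dirty = set()                 # registers written since last rollover
           self.cur = 0                       # context receiving register writes

       def write_reg(self, name, value):
           # Real hardware stalls the write until the context is done/free.
           assert self.done[self.cur], "write would stall: context still busy"
           self.regs[self.cur][name] = value
           self.dirty.add(name)

       def event_start(self):
           # CP_EVENT_START: clear the done bit for the current context.
           self.done[self.cur] = False

       def context_done(self, ctx):
           # CONTEXT_DONE: the event/draw using this context has finished.
           self.done[ctx] = True

       def event_end(self):
           # CP_EVENT_END: needs the next context to be done, then copies the
           # dirtied registers over and makes that context the current one.
           nxt = (self.cur + 1) % len(self.regs)
           if not self.done[nxt]:
               return False  # real hardware would wait here
           for name in self.dirty:
               self.regs[nxt][name] = self.regs[self.cur][name]
           self.dirty.clear()
           self.cur = nxt
           return True

With two contexts, a second ``event_end()`` returns ``False`` until
``context_done(0)`` has been called, which mirrors how state setup for a third
draw has to wait for the first draw to release its context.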

The 2D blit engine has its own ``CP_2D_EVENT_START``, ``CP_2D_EVENT_END``,
``CONTEXT_DONE_2D``, so 2D and 3D register contexts can do separate context
rollover.

Because the clusters proceed independently of each other even across draws, if
you need to synchronize an earlier cluster to the output of a later one, then
you will need to ``CP_WAIT_FOR_IDLE`` after flushing and invalidating any
necessary caches.

Also, note that some registers are not banked at all, and will require a
``CP_WAIT_FOR_IDLE`` for any previous usage of the register to complete.

In a2xx-a4xx, there weren't per-stage clusters, and instead there were two
register banks that were flipped between per draw.

Bindless/Bindful Descriptors (a6xx+)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Starting with a6xx, cat5 (texture) and cat6 (image/SSBO/UBO) instructions are
extended to support bindless descriptors.

In the old bindful model, descriptors are separate for textures, samplers,
UBOs, and IBOs (combined descriptor for images and SSBOs), with separate
registers for the memory containing the array of descriptors, and/or different
``STATE_TYPE`` and ``STATE_BLOCK`` for ``CP_LOAD_STATE``/``_FRAG``/``_GEOM``
to pre-load the descriptors into cache.

- textures - per-shader-stage
   - registers: ``SP_xS_TEX_CONST``/``SP_xS_TEX_COUNT``
   - state-type: ``ST6_CONSTANTS``
   - state-block: ``SB6_xS_TEX``
- samplers - per-shader-stage
   - registers: ``SP_xS_TEX_SAMP``
   - state-type: ``ST6_SHADER``
   - state-block: ``SB6_xS_TEX``
- UBOs - per-shader-stage
   - registers: none
   - state-type: ``ST6_UBO``
   - state-block: ``SB6_xS_SHADER``
- IBOs - global across shader 3d stages, separate for compute shader
   - registers: ``SP_IBO``/``SP_IBO_COUNT`` or ``SP_CS_IBO``/``SP_CS_IBO_COUNT``
   - state-type: ``ST6_SHADER``
   - state-block: ``ST6_IBO`` or ``ST6_CS_IBO`` for compute shaders
   - Note, unlike per-shader-stage descriptors, ``CP_LOAD_STATE6`` is used,
     as opposed to ``CP_LOAD_STATE6_GEOM`` or ``CP_LOAD_STATE6_FRAG``
     depending on shader stage.

.. note::
   For the per-shader-stage registers and state-blocks the ``xS`` notation
   refers to per-shader-stage names, ex. ``SP_FS_TEX_CONST`` or ``SB6_DS_TEX``

Textures and IBOs (images) use *basically* the same 64byte descriptor format
with some exceptions (for ex, for IBOs cubemaps are handled as 2d array).
SSBOs are just untyped buffers, but otherwise use the same descriptors and
instructions as images. Samplers use a 16byte descriptor, and UBOs use an
8byte descriptor which packs the size in the upper 15 bits of the UBO address.

In the bindless model, descriptors are split into 5 descriptor sets, which are
global across shader stages (but as with bindful IBO descriptors, separate for
3d stages vs compute stage). Each HW descriptor set is an array of descriptors
of configurable size (each descriptor set can be configured for a descriptor
pitch of 8bytes or 64bytes). Each descriptor can be of arbitrary format (ie.
UBOs/IBOs/textures/samplers interleaved), its interpretation by the HW is
determined by the instruction that references the descriptor. Each descriptor
set can contain at least 2^16 descriptors.

The HW is configured with the base address of the descriptor set via an array
of "BINDLESS_BASE" registers, ie ``SP_BINDLESS_BASE[n]``/``HLSQ_BINDLESS_BASE[n]``
for 3d shader stages, or ``SP_CS_BINDLESS_BASE[n]``/``HLSQ_CS_BINDLESS_BASE[n]``
for compute shaders, with the descriptor pitch encoded in the low bits.
Which of the descriptor sets is referenced is encoded via three bits in the
instruction. The address of the descriptor is calculated as::

   descriptor_addr = (BINDLESS_BASE[n] & ~0x3) +
                     (idx * 4 * (2 << (BINDLESS_BASE[n] & 0x3)))


.. note::
   Turnip reserves one descriptor set for internal use and exposes the other
   four for the application via the Vulkan API.
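
As a worked example of the address calculation, the sketch below evaluates it
for both pitch settings. It assumes the shift applies to the masked pitch field,
i.e. ``2 << (BINDLESS_BASE[n] & 0x3)``, which is the reading that yields the
8-byte and 64-byte pitches mentioned in the text. The function name and the
sample register values are hypothetical.

.. code-block:: python

   def bindless_descriptor_addr(bindless_base, idx):
       """Address of descriptor `idx` for a given BINDLESS_BASE register value."""
       pitch_code = bindless_base & 0x3     # low bits encode the descriptor pitch
       base_addr = bindless_base & ~0x3     # remaining bits are the base address
       pitch_bytes = 4 * (2 << pitch_code)  # 8 bytes for code 0, 64 bytes for code 3
       return base_addr + idx * pitch_bytes


   # Third descriptor (idx 2) of a set at 0x1000 with an 8-byte pitch:
   print(hex(bindless_descriptor_addr(0x1000 | 0x0, 2)))
   # Same index with the pitch field set to 3 (64-byte pitch):
   print(hex(bindless_descriptor_addr(0x1000 | 0x3, 2)))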

Software Architecture
---------------------

Freedreno and Turnip use a shared core for shader compiler, image layout, and
register and command stream definitions. They implement separate state
management and command stream generation.

.. toctree::
   :glob:

   freedreno/*

GPU devcoredump
^^^^^^^^^^^^^^^

A kernel message from DRM of "gpu fault" can mean any sort of error reported by
the GPU (including its internal hang detection). If a fault in GPU address
space happened, you should expect to find a message from the iommu, with the
faulting address and a hardware unit involved:

.. code-block:: text

   *** gpu fault: ttbr0=000000001c941000 iova=000000010066a000 dir=READ type=TRANSLATION source=TP|VFD (0,0,0,1)

On a GPU fault or hang, a GPU core dump is taken by the DRM driver and saved to
``/sys/devices/virtual/devcoredump/**/data``. You can cp that file to a
:file:`crash.devcore` to save it, otherwise the kernel will expire it
eventually. Echo 1 to the file to free the core early, as another core won't be
taken until then.

Once you have your core file, you can use :command:`crashdec -f crash.devcore`
to decode it. The output will have ``ESTIMATED CRASH LOCATION`` where we
estimate the CP to have stopped. Note that it is expected that this will be
some distance past whatever state triggered the fault, given GPU pipelining, and
will often be at some ``CP_REG_TO_MEM`` (which waits on previous WFIs) or
``CP_WAIT_FOR_ME`` (which waits for all register writes to land) or similar
event. You can try running the workload with ``TU_DEBUG=flushall`` or
``FD_MESA_DEBUG=flush`` to try to close in on the failing commands.
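
The manual steps above (reading the fault message, then copying the core file
before the kernel expires it) could be scripted roughly as follows. This is a
sketch under assumptions: the regular expression is derived only from the single
sample message shown in this section and other kernels may format it
differently, and the helper names ``parse_gpu_fault`` and ``save_devcoredump``
are hypothetical.

.. code-block:: python

   import glob
   import os
   import re
   import shutil

   # Pattern derived from the one sample "gpu fault" line in this document.
   FAULT_RE = re.compile(
       r"gpu fault: ttbr0=(?P<ttbr0>[0-9a-f]+) iova=(?P<iova>[0-9a-f]+) "
       r"dir=(?P<dir>\w+) type=(?P<type>\w+) source=(?P<source>\S+)"
   )


   def parse_gpu_fault(line):
       """Return the fields of a "gpu fault" message, or None if it doesn't match."""
       m = FAULT_RE.search(line)
       if m is None:
           return None
       fields = m.groupdict()
       fields["ttbr0"] = int(fields["ttbr0"], 16)
       fields["iova"] = int(fields["iova"], 16)
       fields["source"] = fields["source"].split("|")  # e.g. ["TP", "VFD"]
       return fields


   def save_devcoredump(dest, root="/sys/devices/virtual/devcoredump", free=False):
       """Copy the first devcoredump ``data`` file found under `root` to `dest`."""
       matches = sorted(glob.glob(os.path.join(root, "**", "data"), recursive=True))
       if not matches:
           return None  # no core dump is currently pending
       with open(matches[0], "rb") as src, open(dest, "wb") as dst:
           shutil.copyfileobj(src, dst)
       if free:
           # Writing 1 frees the core early so that another one can be taken.
           with open(matches[0], "w") as f:
               f.write("1")
       return matches[0]

Calling ``save_devcoredump("crash.devcore", free=True)`` as root would then
leave a file suitable for :command:`crashdec -f crash.devcore`.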

You can also find what commands were queued up to each cluster in the
``regs-name: CP_MEMPOOL`` section.

If ``ESTIMATED CRASH LOCATION`` doesn't exist you could find ``CP_SQE_STAT``,
though going here is the last resort and likely won't be helpful.

.. code-block:: text

   indexed-registers:
     - regs-name: CP_SQE_STAT
       dwords: 51
          PC: 00d7                                <-------------
         PKT: CP_LOAD_STATE6_FRAG
       $01: 70348003    $11: 00000000
       $02: 20000000    $12: 00000022

The ``PC`` value is an instruction address in the current firmware.
You would need to disassemble the firmware (``/lib/firmware/qcom/aXXX_sqe.fw``) via:

.. code-block:: sh

   afuc-disasm -v a650_sqe.fw > a650_sqe.fw.disasm

Now you should search for the ``PC`` value in the disassembly, e.g.:

.. code-block:: text

   l018: 00d1: 08dd0001  add $addr, $06, 0x0001
         00d2: 981ff806  mov $data, $data
         00d3: 8a080001  mov $08, 0x0001 << 16
         00d4: 3108ffff  or $08, $08, 0xffff
         00d5: 9be8f805  and $data, $data, $08
         00d6: 9806e806  mov $addr, $06
         00d7: 9803f806  mov $data, $03           <------------- HERE
         00d8: d8000000  waitin
         00d9: 981f0806  mov $01, $data


Command Stream Capture
^^^^^^^^^^^^^^^^^^^^^^

During Mesa development, it's often useful to look at the command streams we
send to the kernel. We have an interface for the kernel to capture all
submitted command streams:

.. code-block:: sh

   cat /sys/kernel/debug/dri/0/rd > cmdstream &

By default, command stream capture does not capture texture/vertex/etc. data.
You can enable capturing all the BOs with:

.. code-block:: sh

   echo Y > /sys/module/msm/parameters/rd_full

Note that, since all command streams get captured, it is easy to run the system
out of memory doing this, so you probably don't want to enable it during play of
a heavyweight game. Instead, to capture a command stream within a game, you
probably want to cause a crash in the GPU during a frame of interest so that a
single GPU core dump is generated. Emitting ``0xdeadbeef`` in the CS should be
enough to cause a fault.

``fd_rd_output`` facilities provide support for generating the command stream
capture from inside Mesa. Different ``FD_RD_DUMP`` options are available:

- ``enable`` simply enables dumping the command stream on each submit for a
  given logical device. When a more advanced option is specified, ``enable`` is
  implied as specified.
- ``combine`` will combine all dumps into a single file instead of writing the
  dump for each submit into a standalone file.
- ``full`` will dump every buffer object, which is necessary for replays of
  command streams (see below).
- ``trigger`` will establish a trigger file through which dumps can be better
  controlled. Writing a positive integer value into the file will enable dumping
  of that many subsequent submits. Writing -1 will enable dumping of submits
  until disabled. Writing 0 (or any other value) will disable dumps.

Output dump files and the trigger file (when enabled) are hard-coded to be placed
under ``/tmp``, or ``/data/local/tmp`` under Android. ``FD_RD_DUMP_TESTNAME`` can
be used to specify a more descriptive prefix for the output or trigger files.

The functionality is generic to any Freedreno-based backend, but is currently only
integrated in the MSM backend of Turnip. Using the existing ``TU_DEBUG=rd``
option will translate to ``FD_RD_DUMP=enable``.

Capturing Hang RD
+++++++++++++++++

A devcore file doesn't contain all submitted command streams, only the hanging one.
Additionally, it is geared towards analyzing the GPU state at the moment of the crash.

Alternatively, it's possible to obtain the whole submission with all command
streams via ``/sys/kernel/debug/dri/0/hangrd``:

.. code-block:: sh

   # Start the cat _before_ the expected hang
   sudo cat /sys/kernel/debug/dri/0/hangrd > logfile.rd

The format of hangrd is the same as in ordinary command stream capture.
``rd_full`` also has the same effect on it.

Replaying Command Stream
^^^^^^^^^^^^^^^^^^^^^^^^

The ``replay`` tool allows capturing and replaying ``rd`` to reproduce GPU faults.
It is especially useful for transient GPU issues, since it has a much higher
chance of reproducing them.

Dumping rendering results or even just memory is currently unsupported.

- Replaying command streams requires a kernel with ``MSM_INFO_SET_IOVA`` support.
- Requires the ``rd`` capture to have full snapshots of the memory (``rd_full`` is enabled).

Replaying is done via the ``replay`` tool:

.. code-block:: sh

   ./replay test_replay.rd

More examples:

.. code-block:: sh

   ./replay --first=start_submit_n --last=last_submit_n test_replay.rd

.. code-block:: sh

   ./replay --override=0 test_replay.rd

Editing Command Stream (a6xx+)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While replaying a fault is useful in itself, modifying the capture to
understand what causes the fault could be even more useful.

``rddecompiler`` decompiles a single cmdstream from ``rd`` into compilable C source.
Given the address space bounds, the generated program creates a new ``rd`` which
can be used to override a cmdstream with ``replay``. The generated ``rd`` is not
replayable on its own and depends on buffers provided by the source ``rd``.

The C source can be compiled by putting it into
:file:`src/freedreno/decode/generate-rd.cc`.

The workflow would look like this:

1. Find the number of the cmdstream you want to edit;
2. Decompile it:

.. code-block:: sh

   ./rddecompiler -s %cmd_stream_n% example.rd > src/freedreno/decode/generate-rd.cc

3. Edit the command stream;
4. Compile and deploy freedreno tools;
5. Plug the generator into cmdstream replay:

.. code-block:: sh

   ./replay --override=%cmd_stream_n% example.rd

6. Repeat 3-5.

GPU Hang Debugging
^^^^^^^^^^^^^^^^^^

This is not a guide for how to do it, but mostly an enumeration of methods.

Useful ``TU_DEBUG`` (for Turnip) options to narrow down the hang cause:

``sysmem``, ``gmem``, ``nobin``, ``forcebin``, ``noubwc``, ``nolrz``, ``flushall``, ``syncdraw``, ``rast_order``

Useful ``FD_MESA_DEBUG`` (for Freedreno) options:

``sysmem``, ``gmem``, ``nobin``, ``noubwc``, ``nolrz``, ``notile``, ``dclear``, ``ddraw``, ``flush``, ``inorder``, ``noblit``

Useful ``IR3_SHADER_DEBUG`` options:

``nouboopt``, ``spillall``, ``nopreamble``, ``nofp16``

Use Graphics Flight Recorder to narrow down the place which hangs;
use our own breadcrumbs implementation in case of unrecoverable hangs.

In case of faults, use RenderDoc to find the problematic command. If it's
a draw call, edit the shader in RenderDoc to find out whether the shader is
the culprit. If yes, bisect it.

If editing the shader changes the assembly too much and the issue becomes
unreproducible, try editing the assembly itself via ``IR3_SHADER_OVERRIDE_PATH``.
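
The ``TU_DEBUG`` options listed above can be tried one at a time to see which
of them changes the outcome. The following is only a minimal sketch:
``try_tu_debug`` is a hypothetical helper name (not part of Mesa), and it
assumes the application exits with a non-zero status when the hang or fault
occurs, for example because ``MESA_VK_ABORT_ON_DEVICE_LOSS=1`` is set.

.. code-block:: sh

   # Hypothetical helper, not part of Mesa: run the given command once per
   # TU_DEBUG option and report whether it succeeded with that option set.
   # Assumes a hang or fault makes the command exit with a non-zero status.
   try_tu_debug() {
       for opt in sysmem gmem nobin forcebin noubwc nolrz flushall syncdraw rast_order; do
           if TU_DEBUG="$opt" "$@" >/dev/null 2>&1; then
               echo "$opt: ok"
           else
               echo "$opt: failed"
           fi
       done
   }

   # Usage (./app stands for the application under test):
   #   MESA_VK_ABORT_ON_DEVICE_LOSS=1 try_tu_debug ./app

An option that turns a failing run into a passing one points at the affected
area, for instance ``sysmem`` versus ``gmem`` for tiling-related problems.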

If the fault or hang is transient, try capturing an ``rd`` and replaying it. If
the issue is reproduced, bisect the GPU packets until the culprit is found.

Also do the above if the culprit is not a shader.

The hang recovery mechanism in the kernel is not perfect. In case of unrecoverable
hangs, check whether the kernel is up to date and look for unmerged patches
which could improve the recovery.

GPU Breadcrumbs
+++++++++++++++

Breadcrumbs described below are available only in Turnip.

Freedreno has simpler breadcrumbs: in debug builds it writes breadcrumbs
into ``CP_SCRATCH_REG[6]`` and per-tile breadcrumbs into ``CP_SCRATCH_REG[7]``,
so that they are available in the devcoredump. TODO: generalize Turnip's
breadcrumbs implementation.

This is a simple implementation of breadcrumbs tracking of GPU progress,
intended to be a last resort when debugging unrecoverable hangs.
For best results, use Vulkan traces to have a predictable place of hang.

For ordinary hangs, use GFR ("Graphics Flight Recorder") as a more
user-friendly solution.

Our breadcrumbs implementation aims to handle cases where nothing can be done
after the hang. In-driver breadcrumbs also allow more precise tracking since
we could target a single GPU packet.

While breadcrumbs support gmem, try to reproduce the hang in sysmem mode
because it requires far fewer breadcrumb writes and syncs.

Breadcrumbs settings:

.. code-block:: sh

   TU_BREADCRUMBS=%IP%:%PORT%,break=%BREAKPOINT%:%BREAKPOINT_HITS%

``BREAKPOINT``
   The breadcrumb starting from which we require explicit ack.
``BREAKPOINT_HITS``
   How many times the breakpoint should be reached for a break to occur.
   Necessary for gmem mode and re-usable cmdbuffers, in both of which
   the same cmdstream could be executed several times.
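
As an illustration of the format above, the setting can be assembled from its
four parts. This is only a sketch: ``tu_breadcrumbs`` is a hypothetical helper
name, and the address, port, and breakpoint values in the usage comment are
placeholders, not values taken from Mesa.

.. code-block:: sh

   # Hypothetical helper: compose a TU_BREADCRUMBS value from
   # IP, PORT, BREAKPOINT and BREAKPOINT_HITS (in that order).
   tu_breadcrumbs() {
       printf '%s:%s,break=%s:%s\n' "$1" "$2" "$3" "$4"
   }

   # Usage with placeholder values (192.0.2.1 is a documentation-only address):
   #   TU_BREADCRUMBS=$(tu_breadcrumbs 192.0.2.1 5555 1200 2) ./app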

A typical workflow would be:

- Start listening for breadcrumbs on a remote host:

.. code-block:: sh

   nc -lvup $PORT | stdbuf -o0 xxd -pc -c 4 | awk -Wposix '{printf("%u:%u\n", "0x" $0, a[$0]++)}'

- Start capturing the command stream;
- Replay the hanging trace with:

.. code-block:: sh

   TU_BREADCRUMBS=$IP:$PORT,break=-1:0

- Increase the hangcheck period:

.. code-block:: sh

   echo -n 60000 > /sys/kernel/debug/dri/0/hangcheck_period_ms

- After the GPU hang, note the last breadcrumb and relaunch the trace with:

.. code-block:: sh

   TU_BREADCRUMBS=%IP%:%PORT%,break=%LAST_BREADCRUMB%:%HITS%

- After the breakpoint is reached, each breadcrumb requires an
  explicit ack from the user. This way it's possible to find
  the last packet which didn't hang.

- Find the packet in the decoded cmdstream.

Debugging random failures
^^^^^^^^^^^^^^^^^^^^^^^^^

In most cases random GPU faults and rendering artifacts are caused by some kind
of undefined behavior that falls under the following categories:

- Usage of a stale reg value;
- Usage of stale memory (e.g. expecting it to be zeroed when it is not);
- Lack of proper synchronization.

Finding instances of stale reg reads
++++++++++++++++++++++++++++++++++++

Turnip has a debug option to stomp the registers with invalid values to catch
the cases where stale data is read.

.. code-block:: sh

   MESA_VK_ABORT_ON_DEVICE_LOSS=1 \
   TU_DEBUG_STALE_REGS_RANGE=0x00000c00,0x0000be01 \
   TU_DEBUG_STALE_REGS_FLAGS=cmdbuf,renderpass \
   ./app

.. envvar:: TU_DEBUG_STALE_REGS_RANGE

   The reg range in which registers would be stomped. Add ``inverse`` to the
   flags in order for this range to specify which registers NOT to stomp.

.. envvar:: TU_DEBUG_STALE_REGS_FLAGS

   ``cmdbuf``
      stomp registers at the start of each command buffer.
   ``renderpass``
      stomp registers before each render pass.
   ``inverse``
      changes the meaning of ``TU_DEBUG_STALE_REGS_RANGE`` to
      "regs that should NOT be stomped".

The best way to pinpoint the reg which causes a failure is to bisect the reg
range. When a failure is caused by a combination of several registers,
the ``inverse`` flag may be set to find the reg which prevents the failure.
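
One bisection step over the reg range can be sketched as follows.
``split_range`` is a hypothetical helper, not part of Mesa, and the sketch
assumes that both bounds of ``TU_DEBUG_STALE_REGS_RANGE`` are inclusive, which
should be checked against the driver before relying on it.

.. code-block:: sh

   # Hypothetical helper: split a register range into two halves and print
   # them as "LO,MID MID+1,HI", each formatted for TU_DEBUG_STALE_REGS_RANGE.
   # Assumption: both bounds of the range are inclusive.
   split_range() {
       lo=$(( $1 ))
       hi=$(( $2 ))
       mid=$(( (lo + hi) / 2 ))
       printf '0x%08x,0x%08x 0x%08x,0x%08x\n' "$lo" "$mid" "$((mid + 1))" "$hi"
   }

   # Usage: run the application with the first half only.
   #   set -- $(split_range 0x00000c00 0x0000be01)
   #   MESA_VK_ABORT_ON_DEVICE_LOSS=1 \
   #   TU_DEBUG_STALE_REGS_RANGE=$1 \
   #   TU_DEBUG_STALE_REGS_FLAGS=cmdbuf,renderpass \
   #   ./app

If the failure still reproduces with the first half, the offending reg is in
that half; otherwise repeat with the second half (``$2``), and keep splitting
until a single register remains.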