Low Resolution Z Buffer
=======================

This doc is based on reverse engineering of a6xx HW; a5xx should be similar to
a6xx before gen3.

Low Resolution Z buffer is very similar to a depth prepass that helps
the HW avoid executing the fragment shader on those fragments that will
subsequently be discarded by the depth test.

The interesting part of this feature is that it allows applications
to submit the vertices in any order.

Citing official Adreno documentation:

::

  [A Low Resolution Z (LRZ)] pass is also referred to as draw order independent
  depth rejection. During the binning pass, a low resolution Z-buffer is constructed,
  and can reject LRZ-tile wide contributions to boost binning performance. This LRZ
  is then used during the rendering pass to reject pixels efficiently before testing
  against the full resolution Z-buffer.

Limitations
-----------

There are two main limitations of LRZ:

- Since LRZ is an early depth test, it cannot be used when late-z is required;
- The LRZ buffer can be formed in only one direction; changing the depth comparison
  direction without disabling LRZ would lead to a malformed LRZ buffer.

Pre-a650 (before gen3)
----------------------

The direction is fully tracked on the CPU. Within a render pass, LRZ starts with
an unknown direction. The direction is set the first time a depth write occurs;
if it changes afterwards, the direction becomes invalid and LRZ is
disabled for the rest of the render pass.

Since the direction is not tracked by the GPU, it's impossible to know whether
LRZ is enabled during construction of secondary command buffers.

For the same reason, it's impossible to reuse LRZ between render passes.
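
The CPU-side tracking described above amounts to a small state machine. The
following is a minimal sketch of it, not driver code: ``LrzDirection``,
``RenderPassLrzState`` and ``on_depth_write`` are hypothetical names chosen for
illustration, and the sketch only models depth-writing draws.

```python
from enum import Enum


class LrzDirection(Enum):
    """CPU-side view of the LRZ direction (illustrative names)."""
    UNKNOWN = 0   # start of render pass, no depth write seen yet
    LESS = 1      # LRZ is being formed in the LESS direction
    GREATER = 2   # LRZ is being formed in the GREATER direction
    INVALID = 3   # direction changed, LRZ stays off for this pass


class RenderPassLrzState:
    """Tracks the LRZ direction for one render pass."""

    def __init__(self):
        self.direction = LrzDirection.UNKNOWN

    @property
    def lrz_usable(self):
        return self.direction is not LrzDirection.INVALID

    def on_depth_write(self, draw_direction):
        """Record a depth-writing draw; return whether LRZ may still be used."""
        if self.direction is LrzDirection.UNKNOWN:
            # First depth write in the pass fixes the direction.
            self.direction = draw_direction
        elif self.lrz_usable and self.direction is not draw_direction:
            # A later change of direction invalidates LRZ for the rest of the pass.
            self.direction = LrzDirection.INVALID
        return self.lrz_usable
```

Because this state lives only in the object that records the render pass, a
secondary command buffer built elsewhere has no access to it, which is the
limitation noted above.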

A650+ (gen3+)
-------------

Now the LRZ direction can be tracked on the GPU. There are two parts:

- Direction byte which stores the current LRZ direction - ``GRAS_LRZ_CNTL.DIR``.
- Parameters of the last used depth view - ``GRAS_LRZ_DEPTH_VIEW``.

The idea is the same as when LRZ is tracked on the CPU: when ``GRAS_LRZ_CNTL``
is used, its direction is compared to the previously known direction
and the direction byte is set to disabled when the directions are incompatible.

Additionally, to reuse LRZ between render passes, ``GRAS_LRZ_CNTL`` checks
if the current value of ``GRAS_LRZ_DEPTH_VIEW`` is equal to the value
stored in the buffer. If not, LRZ is disabled. This is necessary
because the depth buffer may have several layers and mip levels, while the
LRZ buffer represents only a single layer + mip level.

A7XX
----

A7XX introduces the concept of bidirectional LRZ where there are two LRZ
buffers, one for each direction.
This way LRZ doesn't need to be disabled
when the direction changes. By default, this behavior is disabled, but the
LRZ buffers have to be allocated with this space in mind as fast clears
will always write metadata for both.

Additionally, there are now two separate LRZ buffers (on top of one for
each direction, a total of four) - due to concurrent binning, one can be
used for binning and the other for rendering concurrently. These can be
flipped between via the ``LRZ_FLIP_BUFFER`` event which can be put inside
a conditional block for either the BV or BR.

LRZ Fast-Clear
--------------

The LRZ fast-clear buffer is initialized to zeroes and read/written
when ``GRAS_LRZ_CNTL.FC_ENABLE`` is set. It appears to store 1 bit per block.
``0`` means the block still has the original depth clear value, and ``1`` means
that the corresponding block in LRZ has been modified.

LRZ fast-clear conservatively clears the LRZ buffer.
At the point where LRZ is
written, the LRZ block which corresponds to a single fast-clear bit is cleared:

- To ``0.0`` if depth comparison is ``GREATER``
- To ``1.0`` if depth comparison is ``LESS``

This way it's always valid to fast-clear.

On A7XX, the original depth clear value can be specified exactly, allowing
fast-clear to any value rather than just ``1.0`` or ``0.0``.

LRZ Feedback
------------

Some draws do write depth but cannot contribute to LRZ during the BINNING pass,
e.g. when the fragment shader has "discard" in it. However, they can contribute
to LRZ during the RENDERING pass via the LRZ feedback mechanism. This may allow
the draws that follow to depth test against the updated LRZ, which is especially
important if such "bad" draws were at the start of the render pass.

LRZ feedback happens during the RENDERING pass when ``LRZ_FEEDBACK_ZMODE_MASK``
is set: if a draw has an ``a6xx_ztest_mode`` whose corresponding flag is set in
``LRZ_FEEDBACK_ZMODE_MASK``, its depth values are used for feedback.
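
The mask check described above can be sketched as follows. This is only an
illustration of the "one flag per ztest mode" idea: the ``ZTestMode`` members,
their numeric values, and the ``1 << mode`` flag layout are all assumptions made
for the example and are not taken from the register definitions.

```python
from enum import IntEnum


class ZTestMode(IntEnum):
    """Stand-in for a6xx_ztest_mode; names and values are placeholders."""
    EARLY = 0
    LATE = 1
    EARLY_THEN_LATE = 2


def feedback_flag(mode):
    # Assumed layout: one mask bit per ztest mode.
    return 1 << int(mode)


def draw_feeds_lrz(feedback_zmode_mask, mode):
    """True if a draw using `mode` would contribute its depth to LRZ feedback."""
    return (feedback_zmode_mask & feedback_flag(mode)) != 0


# Example: enable feedback for draws that need late-z (e.g. shaders with discard).
mask = feedback_flag(ZTestMode.LATE) | feedback_flag(ZTestMode.EARLY_THEN_LATE)
```

With ``mask`` as above, ``draw_feeds_lrz(mask, ZTestMode.LATE)`` is true while
``draw_feeds_lrz(mask, ZTestMode.EARLY)`` is false, and a zero mask disables
feedback for every draw.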

LRZ feedback, along with LRZ testing, also works during sysmem rendering.

LRZ Precision
-------------

LRZ always uses ``Z16_UNORM``. The epsilon for it is ``1.f / (1 << 16)`` which is
not enough to represent all values of ``Z32_UNORM`` or ``Z32_FLOAT``.
This especially raises questions in the context of fast-clear: if fast-clear
uses a value which cannot be precisely represented by LRZ, we wouldn't
be able to round it in the correct direction since the direction is tracked
on the GPU.

However, it seems that depth comparisons with LRZ values have some "slack"
and nothing special needs to be done for such depth clear values.

How it was tested:

- Clear ``Z32_FLOAT`` attachment to ``1.f / (1 << 17)``

  - LRZ buffer contains all zeroes.

- Do draws and check whether all samples are passing:

  - ``OP_GREATER`` with ``(1.f / (1 << 17) + float32_epsilon)`` - passing;
  - ``OP_GREATER`` with ``(1.f / (1 << 17) - float32_epsilon)`` - not passing;
  - ``OP_LESS`` with ``(1.f / (1 << 17) - float32_epsilon)`` - passing;
  - ``OP_LESS`` with ``(1.f / (1 << 17) + float32_epsilon)`` - not passing;
  - ``OP_LESS_OR_EQ`` with ``(1.f / (1 << 17) + float32_epsilon)`` - not passing.

In all cases the resulting LRZ buffer is all zeroes and the LRZ direction is updated.

LRZ Caches
----------

``LRZ_FLUSH`` flushes and invalidates the LRZ caches. There are two caches:

- Cache for fast-clear buffer;
- Cache for direction byte + depth view params.

They can be cleared by ``LRZ_CLEAR``. For the cleared state to become visible in
GPU memory, the caches should be flushed with ``LRZ_FLUSH`` afterwards.

``GRAS_LRZ_CNTL`` reads from these caches.
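
The clear-then-flush ordering can be illustrated with a toy write-back cache.
This is a rough model under stated assumptions, not a description of the
hardware: ``LrzCacheModel`` and its methods are hypothetical, the values written
by the clear are made up, and filling the cache from memory on a miss is an
assumption.

```python
class LrzCacheModel:
    """Toy write-back cache for the direction byte + depth view params."""

    def __init__(self, memory):
        self.memory = memory  # dict standing in for the buffer in GPU memory
        self.lines = {}       # cached copies, keyed by field name

    def read(self, key):
        # Reads go through the cache; a miss is assumed to fill from memory.
        if key not in self.lines:
            self.lines[key] = self.memory[key]
        return self.lines[key]

    def clear(self, cleared_values):
        # LRZ_CLEAR analogue: only the cached copies change.
        self.lines.update(cleared_values)

    def flush(self):
        # LRZ_FLUSH analogue: write cached state back, then invalidate.
        self.memory.update(self.lines)
        self.lines.clear()


memory = {"direction": "LESS", "depth_view": 7}
cache = LrzCacheModel(memory)
cache.clear({"direction": "UNKNOWN", "depth_view": 0})
stale_in_memory = memory["direction"]  # memory is unchanged until the flush
cache.flush()
```

In this model a reader that goes through the cache sees the cleared state
immediately, while anything that looks at memory directly sees it only after
the flush.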