Low Resolution Z Buffer
=======================

This doc is based on a6xx HW reverse engineering; a5xx should be similar to
a6xx before gen3.

The Low Resolution Z (LRZ) buffer works much like a depth prepass: it helps
the HW avoid executing the fragment shader on fragments that would
subsequently be discarded by the depth test.

The interesting part of this feature is that it allows applications
to submit the vertices in any order.

Citing official Adreno documentation:

::

  [A Low Resolution Z (LRZ)] pass is also referred to as draw order independent
  depth rejection. During the binning pass, a low resolution Z-buffer is constructed,
  and can reject LRZ-tile wide contributions to boost binning performance. This LRZ
  is then used during the rendering pass to reject pixels efficiently before testing
  against the full resolution Z-buffer.

Limitations
-----------

There are two main limitations of LRZ:

- Since LRZ is an early depth test, it cannot be used when late-z is required;
- The LRZ buffer can be formed in only one direction; changing the depth comparison
  direction without disabling LRZ would lead to a malformed LRZ buffer.

Pre-a650 (before gen3)
----------------------

The direction is fully tracked on the CPU. In a render pass, LRZ starts with
an unknown direction; the direction is set the first time a depth write occurs,
and if it changes afterwards the direction becomes invalid and LRZ is
disabled for the rest of the render pass.

Since the direction is not tracked by the GPU, it's impossible to know whether
LRZ is enabled during construction of secondary command buffers.

For the same reason, it's impossible to reuse LRZ between render passes.
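
The CPU-side tracking described in this section can be sketched as a small
state machine. This is a minimal illustration, not the driver's code: the
``LrzDir`` enum and the ``update_direction`` helper are hypothetical names, and
only the rules stated above are modeled (compare ops other than less/greater
are out of scope).

```python
from enum import Enum


class LrzDir(Enum):
    """Hypothetical CPU-side view of the LRZ direction within a render pass."""
    UNKNOWN = 0   # render pass just started, no depth write seen yet
    LESS = 1
    GREATER = 2
    INVALID = 3   # direction changed; LRZ stays off for the rest of the pass


def update_direction(current, draw_dir, writes_depth):
    """Return the tracked direction after one draw.

    Assumption: only draws that write depth set or invalidate the direction,
    as the text above describes.
    """
    if current is LrzDir.INVALID:
        return LrzDir.INVALID          # sticky until the render pass ends
    if not writes_depth:
        return current                 # nothing to track for this draw
    if current is LrzDir.UNKNOWN:
        return draw_dir                # first depth write fixes the direction
    if current is not draw_dir:
        return LrzDir.INVALID          # direction flipped mid-pass
    return current
```

Feeding the draws of a render pass through ``update_direction`` in submission
order yields ``INVALID`` as soon as a depth-writing draw uses the opposite
direction, and it stays ``INVALID`` afterwards.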

A650+ (gen3+)
-------------

Now the LRZ direction can be tracked on the GPU. There are two parts:

- Direction byte which stores the current LRZ direction - ``GRAS_LRZ_CNTL.DIR``.
- Parameters of the last used depth view - ``GRAS_LRZ_DEPTH_VIEW``.

The idea is the same as when LRZ is tracked on the CPU: when ``GRAS_LRZ_CNTL``
is used, its direction is compared to the previously known direction,
and the direction byte is set to disabled when the directions are incompatible.

Additionally, to reuse LRZ between render passes, ``GRAS_LRZ_CNTL`` checks
if the current value of ``GRAS_LRZ_DEPTH_VIEW`` is equal to the value
stored in the buffer. If not, LRZ is disabled. This is necessary
because the depth buffer may have several layers and mip levels, while the
LRZ buffer represents only a single layer + mip level.
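
The two checks described in this section can be modeled roughly as follows.
This is a sketch under assumptions: the string encodings of the direction, the
``(layer, mip_level)`` tuple standing in for ``GRAS_LRZ_DEPTH_VIEW``, and the
names ``TrackedLrz`` and ``apply_lrz_cntl`` are all hypothetical, and
"incompatible" is simplified to "different".

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Hypothetical encodings; the real GRAS_LRZ_CNTL.DIR values are not modeled.
UNKNOWN, LESS, GREATER, DISABLED = "unknown", "less", "greater", "disabled"


@dataclass(frozen=True)
class TrackedLrz:
    """State stored alongside the LRZ buffer (illustrative layout)."""
    direction: str
    depth_view: Tuple[int, int]  # assumed (layer, mip_level) of the last view


def apply_lrz_cntl(state, draw_dir, current_view):
    """Model the comparison done when GRAS_LRZ_CNTL is used."""
    if state.direction == DISABLED:
        return state
    if current_view != state.depth_view:
        # LRZ was built for a different layer/mip: it cannot be reused.
        return replace(state, direction=DISABLED)
    if state.direction == UNKNOWN:
        return replace(state, direction=draw_dir)
    if state.direction != draw_dir:
        return replace(state, direction=DISABLED)
    return state
```

A state built for one depth view stays usable across render passes only while
later passes bind the same view and keep the same direction.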

A7XX
----

A7XX introduces the concept of bidirectional LRZ where there are two LRZ
buffers, one for each direction. This way LRZ doesn't need to be disabled
when the direction changes. By default, this behavior is disabled, but the
LRZ buffers have to be allocated with this space in mind as fast clears
will always write metadata for both.

Additionally, there are now two separate LRZ buffers (on top of one for
each direction, a total of four) - due to concurrent binning, one can be
used for binning and the other for rendering concurrently. These can be
flipped between via the ``LRZ_FLIP_BUFFER`` event which can be put inside
a conditional block for either the BV or BR.

LRZ Fast-Clear
--------------

The LRZ fast-clear buffer is initialized to zeroes and read/written
when ``GRAS_LRZ_CNTL.FC_ENABLE`` is set. It appears to store 1 bit per block.
``0`` means the block has the original depth clear value, and ``1`` means that the
corresponding block in LRZ has been modified.

LRZ fast-clear conservatively clears the LRZ buffer. At the point where LRZ is
written, the LRZ block which corresponds to a single fast-clear bit is cleared:

- To ``0.0`` if depth comparison is ``GREATER``
- To ``1.0`` if depth comparison is ``LESS``

This way it's always valid to fast-clear.

On A7XX, the original depth clear value can be specified exactly, allowing for
fast-clear to any value rather than just ``1.0`` or ``0.0``.
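
The pre-A7XX behavior described in this section (one bit per block, with the
block lazily cleared on first write) can be sketched like this. It is an
illustration only: ``touch_block`` is a hypothetical helper, each LRZ block is
reduced to a single float, and the real block size and memory layout are not
modeled.

```python
def touch_block(fc_bits, lrz_blocks, block, compare_op):
    """Model the lazy clear that happens when LRZ is first written to a block.

    fc_bits[block] == 0 means the block still holds the original clear value;
    on the first write it is cleared to the extreme value for the direction
    and its fast-clear bit is set.
    """
    if compare_op not in ("GREATER", "LESS"):
        raise ValueError("only GREATER and LESS are modeled in this sketch")
    if fc_bits[block] == 0:
        lrz_blocks[block] = 0.0 if compare_op == "GREATER" else 1.0
        fc_bits[block] = 1


# Usage: two blocks, only block 1 is written to with a LESS comparison.
fc_bits = [0, 0]
lrz_blocks = [None, None]   # contents are irrelevant while the bit is 0
touch_block(fc_bits, lrz_blocks, 1, "LESS")
```

After the call, block 0 is still marked as fast-cleared while block 1 holds
``1.0`` and has its bit set; calling ``touch_block`` on block 1 again leaves it
unchanged.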

LRZ Feedback
------------

Some draws do write depth but cannot contribute to LRZ during the BINNING pass,
e.g. when the fragment shader has "discard" in it. However, they can contribute to LRZ
during the RENDERING pass via the LRZ feedback mechanism. This may allow the draws
that follow to depth test against the updated LRZ, which is especially important
if such "bad" draws were at the start of the render pass.

LRZ feedback happens during the RENDERING pass when ``LRZ_FEEDBACK_ZMODE_MASK``
is set: if a draw has an ``a6xx_ztest_mode`` whose corresponding flag is set in
``LRZ_FEEDBACK_ZMODE_MASK``, its depth values are used for feedback.

LRZ feedback, along with LRZ testing, also works during sysmem rendering.
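
The mask check described in this section might look like the sketch below. The
``ZTestMode`` enum is a hypothetical stand-in for ``a6xx_ztest_mode`` with
illustrative values, and the assumption that each mode maps to the bit at its
enum value is not confirmed by the text above.

```python
from enum import IntEnum


class ZTestMode(IntEnum):
    """Hypothetical stand-in for a6xx_ztest_mode; values are illustrative."""
    EARLY_Z = 0
    LATE_Z = 1
    EARLY_Z_LATE_Z = 2


def mode_flag(mode):
    """Assumed encoding: one mask bit per z-test mode."""
    return 1 << int(mode)


def draw_feeds_lrz(feedback_zmode_mask, mode):
    """True if a draw with this z-test mode contributes depth to LRZ feedback."""
    return bool(feedback_zmode_mask & mode_flag(mode))


# Usage: enable feedback only for late-z draws.
mask = mode_flag(ZTestMode.LATE_Z)
```

With this mask, ``draw_feeds_lrz(mask, ZTestMode.LATE_Z)`` is true and the
other modes are ignored; a mask of zero disables feedback entirely.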

LRZ Precision
-------------

LRZ always uses ``Z16_UNORM``. The epsilon for it is ``1.f / (1 << 16)`` which is
not enough to represent all values of ``Z32_UNORM`` or ``Z32_FLOAT``.
This especially raises questions in the context of fast-clear: if fast-clear
uses a value which cannot be precisely represented by LRZ, we wouldn't
be able to round it in the correct direction, since the direction is tracked
on the GPU.

However, it seems that depth comparisons with LRZ values have some "slack"
and nothing special should be done for such depth clear values.

How it was tested:

- Clear ``Z32_FLOAT`` attachment to ``1.f / (1 << 17)``

  - LRZ buffer contains all zeroes.

- Do draws and check whether all samples are passing:

  - ``OP_GREATER`` with ``(1.f / (1 << 17) + float32_epsilon)`` - passing;
  - ``OP_GREATER`` with ``(1.f / (1 << 17) - float32_epsilon)`` - not passing;
  - ``OP_LESS`` with ``(1.f / (1 << 17) - float32_epsilon)`` - passing;
  - ``OP_LESS`` with ``(1.f / (1 << 17) + float32_epsilon)`` - not passing;
  - ``OP_LESS_OR_EQ`` with ``(1.f / (1 << 17) + float32_epsilon)`` - not passing.

In all cases the resulting LRZ buffer is all zeroes and the LRZ direction is updated.
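
To see why the clear value used in the test above is a hard case, it helps to
quantize it to 16 bits by hand. The snippet below is plain arithmetic, not a
model of the hardware: the helper names are hypothetical, and it uses the
standard UNORM scale of ``0xFFFF``, whose step is close to the
``1.f / (1 << 16)`` epsilon quoted above.

```python
import math

Z16_MAX = 0xFFFF  # largest Z16_UNORM code; one step is 1 / 65535


def z16_floor(z):
    """Largest Z16_UNORM code whose value is <= z."""
    return math.floor(z * Z16_MAX)


def z16_ceil(z):
    """Smallest Z16_UNORM code whose value is >= z."""
    return math.ceil(z * Z16_MAX)


clear_value = 1.0 / (1 << 17)
lower = z16_floor(clear_value)   # 0
upper = z16_ceil(clear_value)    # 1
```

The clear value falls strictly between codes ``0`` and ``1``, so the safe
rounding depends on the comparison direction; the all-zero LRZ buffer observed
in the test corresponds to the lower code.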

LRZ Caches
----------

``LRZ_FLUSH`` flushes and invalidates the LRZ caches. There are two caches:

- Cache for the fast-clear buffer;
- Cache for the direction byte + depth view params.

They can be cleared by ``LRZ_CLEAR``. For the result to become visible in GPU memory,
the caches should be flushed with ``LRZ_FLUSH`` afterwards.

``GRAS_LRZ_CNTL`` reads from these caches.
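
The ordering requirement between ``LRZ_CLEAR`` and ``LRZ_FLUSH`` can be
pictured with a toy write-back cache. This is only an analogy under the
assumption that the caches behave as write-back caches, as the text above
suggests; ``LrzCacheModel`` and its methods are hypothetical and a ``dict``
stands in for GPU memory.

```python
class LrzCacheModel:
    """Toy write-back cache in front of a dict standing in for GPU memory."""

    def __init__(self, memory):
        self.memory = memory
        self.cached = {}

    def read(self, key):
        """Models a GRAS_LRZ_CNTL read: served from the cache, filled on miss."""
        if key not in self.cached:
            self.cached[key] = self.memory[key]
        return self.cached[key]

    def clear(self, key, value):
        """Models LRZ_CLEAR: updates the cached copy only."""
        self.cached[key] = value

    def flush(self):
        """Models LRZ_FLUSH: write back to memory, then invalidate."""
        self.memory.update(self.cached)
        self.cached.clear()


# Usage: a clear is not visible in memory until the flush.
memory = {"direction": "less"}
cache = LrzCacheModel(memory)
cache.clear("direction", "unknown")
seen_before_flush = memory["direction"]
cache.flush()
seen_after_flush = memory["direction"]
```

In this model, anything reading memory directly between the clear and the
flush would still observe the stale value.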