# Special registers

`r0l` is the hardware nesting counter.

`r1` is the hardware link register.

`r5` and `r6` are preloaded in vertex shaders to the vertex ID and instance ID.

# ABI

The following section describes the ABI used by non-monolithic programs.

## Vertex

Registers have the following layout at the beginning of the vertex shader
(written by the vertex prolog):

* `r0-r4` and `r7` undefined. This avoids preloading into the nesting counter or
  having unaligned values. The prolog is free to use these registers as
  temporaries.
* `r5-r6` retain their usual meanings, even if the vertex shader is running as a
  hardware compute shader. This allows software index fetch code to run in the
  prolog without contaminating the main shader key.
* `r8` onwards contains 128-bit uniform vectors for each attribute.
  This accommodates 30 attributes without spilling, exceeding the 16 attribute
  API minimum. For 32 attributes, we will need to use function calls or the
  stack.

One useful property is that the GPR usage of the combined program is equal to
the GPR usage of the main shader. The prolog cannot write higher registers than
those read by the main shader.

Vertex prologs do not have any uniform registers allocated for preamble
optimization or constant promotion, as this adds complexity without any
legitimate use case.

For a vertex shader reading $n$ attributes, the following layout is used:

* The first $n$ 64-bit uniforms are the base addresses of each attribute.
* The next $n$ 32-bit uniforms are the associated clamps (sizes). Presently
  robustness is always used.
* The next 2x32-bit uniform is the base vertex and base instance. This must
  always be reserved because it is unknown at vertex shader compile-time whether
  any attribute will use instancing. Also reserving the base vertex allows us to
  push both conveniently with a single USC Uniform word.
* The next 16-bit uniform is the draw ID.
* For a hardware compute shader, the next 48 bits are padding.
* For a hardware compute shader, the next 64-bit uniform is a pointer to the
  input assembly buffer.

In total, the first $6n + 5$ 16-bit uniform slots are reserved for a hardware
vertex shader, or $6n + 12$ for a hardware compute shader.

## Fragment

When sample shading is enabled in a non-monolithic fragment shader, the fragment
shader has the following register inputs:

* `r0l = 0`. This is the hardware nesting counter.
* `r0h` is the mask of samples currently being shaded. This usually equals
  `1 << sample ID`, for "true" per-sample shading.

When sample shading is disabled, no register inputs are defined. The fragment
prolog (if present) may clobber whatever registers it pleases.

Registers have the following layout at the end of the fragment shader (read by
the fragment epilog):

* `r0l = 0` if sample shading is enabled. This is implicitly true.
* `r0h` preserved if sample shading is enabled.
* `r2` and `r3l` contain the emitted depth/stencil respectively, if
  depth and/or stencil are written by the fragment shader. Depth/stencil writes
  must be deferred to the epilog for correctness when the epilog can discard
  (i.e. when alpha-to-coverage is enabled).
* `r3h` contains the logically emitted sample mask, if the fragment shader uses
  forced early tests. This predicates the epilog's stores.
* The vec4 of 32-bit registers beginning at `r(4 * (i + 1))` contains the colour
  output for render target `i`. When dual source blending is enabled, there is
  only a single render target and the dual source colour is treated as the
  second render target (registers r8-r11).

Uniform registers have the following layout:

* u0_u1: 64-bit render target texture heap
* u2...u5: Blend constant
* u6_u7: Root descriptor, so we can fetch the 64-bit fragment invocation counter
  address and (OpenGL only) the 64-bit polygon stipple address
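
The slot arithmetic in the Vertex section and the colour register mapping in the
Fragment section can be cross-checked with a small sketch. This is illustrative
only and not taken from the driver: the function names and dictionary keys are
hypothetical, offsets are counted in 16-bit uniform slots, and placing the base
vertex before the base instance is an assumption based on the order in which the
text lists them.

```python
def vertex_uniform_layout(n, hw_compute=False):
    """Return (offsets, total) in 16-bit slots for a shader reading n attributes.

    Hypothetical helper: names and ordering follow the prose, not driver code.
    """
    offsets = {}
    cursor = 0

    def reserve(name, slots):
        nonlocal cursor
        offsets[name] = cursor
        cursor += slots

    reserve("attrib_bases", 4 * n)   # n 64-bit base addresses
    reserve("attrib_clamps", 2 * n)  # n 32-bit clamps (sizes)
    reserve("base_vertex", 2)        # 32-bit (assumed to come first)
    reserve("base_instance", 2)      # 32-bit
    reserve("draw_id", 1)            # 16-bit
    if hw_compute:
        reserve("padding", 3)         # 48 bits of padding
        reserve("input_assembly", 4)  # 64-bit pointer
    return offsets, cursor


def colour_output_registers(i):
    """Names of the four 32-bit registers holding the colour of render target i."""
    base = 4 * (i + 1)
    return [f"r{base + c}" for c in range(4)]


if __name__ == "__main__":
    n = 16
    _, total_vs = vertex_uniform_layout(n)
    _, total_cs = vertex_uniform_layout(n, hw_compute=True)
    print(total_vs, total_cs)          # 6n + 5 and 6n + 12 slots
    print(colour_output_registers(1))  # dual source colour registers
```

With `n = 16`, the totals come out to $6n + 5 = 101$ and $6n + 12 = 108$ slots, and `colour_output_registers(1)` names `r8` through `r11`, matching the dual source case described above.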