# Notes on opcodes

_Notes mainly by Connor Abbott extracted from the disassembler_

LOG_FREXPM:

    // From the ARM patent US20160364209A1:
    // "Decompose v (the input) into numbers x1 and s such that v = x1 * 2^s,
    // and x1 is a floating point value in a predetermined range where the
    // value 1 is within the range and not at one extremity of the range (e.g.
    // choose a range where 1 is towards middle of range)."
    //
    // This computes x1.

FRCP_FREXPM:

    // Given a floating point number m * 2^e, returns m * 2^{-1}. This is
    // exactly the same as the mantissa part of frexp().

FSQRT_FREXPM:
    // Given a floating point number m * 2^e, returns m * 2^{-2} if e is even,
    // and m * 2^{-1} if e is odd. In other words, scales by powers of 4 until
    // within the range [0.25, 1). Used for square-root and reciprocal
    // square-root.
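As a rough illustration of the two mantissa reductions above, here is a minimal Python sketch built on `math.frexp`. The function names `frcp_frexpm` and `fsqrt_frexpm` are hypothetical, chosen only to mirror the opcode names. The sketch models finite, nonzero inputs and makes no claim about how the hardware treats zero, infinity, or NaN.

```python
import math


def frcp_frexpm(x: float) -> float:
    """Model of FRCP_FREXPM: for x = m * 2^e with 1 <= |m| < 2, return m / 2."""
    # math.frexp returns (mant, exp) with x = mant * 2^exp and
    # 0.5 <= |mant| < 1, so mant is already m * 2^-1.
    mant, _ = math.frexp(x)
    return mant


def fsqrt_frexpm(x: float) -> float:
    """Model of FSQRT_FREXPM: m / 4 if e is even, m / 2 if e is odd."""
    mant, exp = math.frexp(x)
    e = exp - 1  # frexp's exponent is one larger than e in m * 2^e
    # Python's % returns a non-negative result, so this also works for e < 0.
    return mant / 2 if e % 2 == 0 else mant
```

For example, `fsqrt_frexpm(4.0)` gives 0.25 and `fsqrt_frexpm(8.0)` gives 0.5. In both cases the input was divided by 16, a power of 4, so the square root of the result differs from the square root of the input by an exact power of 2.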

FRCP_FREXPE:
    // Given a floating point number m * 2^e, computes -e - 1 as an integer.
    // Zero and infinity/NaN return 0.

FSQRT_FREXPE:
    // Computes floor(e/2) + 1.

FRSQ_FREXPE:
    // Given a floating point number m * 2^e, computes -floor(e/2) - 1 as an
    // integer.

LSHIFT_ADD_LOW32:
    // These instructions in the FMA slot, together with LSHIFT_ADD_HIGH32.i32
    // in the ADD slot, allow one to do a 64-bit addition with an extra small
    // shift on one of the sources. There are three possible scenarios:
    //
    // 1) Full 64-bit addition. Do:
    //    out.x = LSHIFT_ADD_LOW32.i64 src1.x, src2.x, shift
    //    out.y = LSHIFT_ADD_HIGH32.i32 src1.y, src2.y
    //
    // The shift amount is applied to src2 before adding.
    // The shift amount, and any extra bits from src2 plus the overflow bit,
    // are sent directly from FMA to ADD instead of being passed explicitly.
    // Hence, these two must be bundled together into the same instruction.
    //
    // 2) Add a 64-bit value src1 to a zero-extended 32-bit value src2. Do:
    //    out.x = LSHIFT_ADD_LOW32.u32 src1.x, src2, shift
    //    out.y = LSHIFT_ADD_HIGH32.i32 src1.x, 0
    //
    // Note that in this case, the second argument to LSHIFT_ADD_HIGH32 is
    // ignored, so it can actually be anything. As before, the shift is applied
    // to src2 before adding.
    //
    // 3) Add a 64-bit value to a sign-extended 32-bit value src2. Do:
    //    out.x = LSHIFT_ADD_LOW32.i32 src1.x, src2, shift
    //    out.y = LSHIFT_ADD_HIGH32.i32 src1.x, 0
    //
    // The only difference is the .i32 instead of .u32. Otherwise, this is
    // exactly the same as before.
    //
    // In all these instructions, the shift amount is stored where the third
    // source would be, so the shift has to be a small immediate from 0 to 7.
    // This is fine for the expected use-case of these instructions, which is
    // manipulating 64-bit pointers.
    //
    // These instructions can also be combined with various load/store
    // instructions which normally take a 64-bit pointer in order to add a
    // 32-bit or 64-bit offset to the pointer before doing the operation,
    // optionally shifting the offset. The load/store op implicitly does
    // LSHIFT_ADD_HIGH32.i32 internally. Letting ptr be the pointer, and offset
    // the desired offset, the cases go as follows:
    //
    // 1) Add a 64-bit offset:
    //    LSHIFT_ADD_LOW32.i64 ptr.x, offset.x, shift
    //    ld_st_op ptr.y, offset.y, ...
    //
    // Note that the output of LSHIFT_ADD_LOW32.i64 is not used, instead being
    // implicitly sent to the load/store op to serve as the low 32 bits of the
    // pointer.
    //
    // 2) Add a 32-bit unsigned offset:
    //    temp = LSHIFT_ADD_LOW32.u32 ptr.x, offset, shift
    //    ld_st_op temp, ptr.y, ...
    //
    // Now, the low 32 bits of (offset << shift) + ptr are passed explicitly to
    // the ld_st_op, to match the case where there is no offset and ld_st_op is
    // called directly.
    //
    // 3) Add a 32-bit signed offset:
    //    temp = LSHIFT_ADD_LOW32.i32 ptr.x, offset, shift
    //    ld_st_op temp, ptr.y, ...
    //
    // Again, the same as the unsigned case except for the offset.

---

ADD ops..

F16_TO_F32.X: // take the low 16 bits, and expand it to a 32-bit float
F16_TO_F32.Y: // take the high 16 bits, and expand it to a 32-bit float

MOV:
    // Logically, this should be SWZ.XY, but that's equivalent to a move, and
    // this seems to be the canonical way the blob generates a MOV.


FRCP_FREXPM:
    // Given a floating point number m * 2^e, returns m * 2^{-1}.
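To see how this mantissa pairs with the FRCP_FREXPE exponent described earlier (-e - 1), the sketch below recombines the two halves into 1/x. It is only a model in Python: `frcp_frexpm`, `frcp_frexpe`, and `reciprocal` are hypothetical names, the reciprocal of the reduced mantissa is computed with an ordinary division rather than whatever table lookup and refinement the hardware uses, and only finite nonzero inputs are handled by `reciprocal`.

```python
import math


def frcp_frexpm(x: float) -> float:
    """Model of FRCP_FREXPM: for x = m * 2^e, return m * 2^-1."""
    mant, _ = math.frexp(x)  # 0.5 <= |mant| < 1
    return mant


def frcp_frexpe(x: float) -> int:
    """Model of FRCP_FREXPE: for x = m * 2^e, return -e - 1."""
    _, exp = math.frexp(x)  # exp == e + 1; frexp gives 0 for zero, inf, NaN
    return -exp


def reciprocal(x: float) -> float:
    """Rebuild 1/x from the reduced mantissa and the adjusted exponent."""
    reduced = frcp_frexpm(x)
    # x = reduced * 2^(e + 1), so 1/x = (1 / reduced) * 2^(-e - 1).
    # The plain division stands in for the hardware approximation (assumption).
    return math.ldexp(1.0 / reduced, frcp_frexpe(x))
```

Because the reduced mantissa always has magnitude in [0.5, 1), the approximation step only ever has to handle a narrow input range, and the exponent is fixed up afterwards with an exact power-of-two scale.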

FLOG_FREXPE:
    // From the ARM patent US20160364209A1:
    // "Decompose v (the input) into numbers x1 and s such that v = x1 * 2^s,
    // and x1 is a floating point value in a predetermined range where the
    // value 1 is within the range and not at one extremity of the range (e.g.
    // choose a range where 1 is towards middle of range)."
    //
    // This computes s.

LD_UBO.v4i32
    // src0 = offset, src1 = binding

FRCP_FAST.f32:
    // *_FAST does not exist on G71 (added to G51, G72, and everything after)

FRCP_TABLE
    // Given a floating point number m * 2^e, produces a table-based
    // approximation of 2/m using the top 17 bits. Includes special cases for
    // infinity, NaN, and zero, and copies the sign bit.

FRCP_FAST.f16.X
    // Exists on G71

FRSQ_TABLE:
    // A similar table for inverse square root, using the high 17 bits of the
    // mantissa as well as the low bit of the exponent.

FRCP_APPROX:
    // Used in the argument reduction for log. Given a floating-point number
    // m * 2^e, uses the top 4 bits of m to produce an approximation to 1/m
    // with the exponent forced to 0 and only the top 5 bits nonzero. 0,
    // infinity, and NaN all return 1.0.
    // See the ARM patent for more information.

MUX:
    // For each bit i, return src2[i] ? src0[i] : src1[i]. In other words, this
    // is the same as (src2 & src0) | (~src2 & src1).

ST_VAR:
    // store a varying given the address and datatype from LD_VAR_ADDR

LD_VAR_ADDR:
    // Compute varying address and datatype (for storing in the vertex shader),
    // and store the vec3 result in the data register.
    // The result is passed as the 3 normal arguments to ST_VAR.

DISCARD
    // Conditional discards (discard_if) in NIR. Compares the first two
    // sources and discards if the result is true.

ATEST.f32:
    // Implements alpha-to-coverage, as well as possibly the late depth and
    // stencil tests. The first source is the existing sample mask in R60
    // (possibly modified by gl_SampleMask), and the second source is the alpha
    // value. The sample mask is written right away based on the
    // alpha-to-coverage result using the normal register write mechanism,
    // since that doesn't need to read from any memory, and then written again
    // later based on the result of the stencil and depth tests using the
    // special register.

BLEND:
    // This takes the sample coverage mask (computed by ATEST above) as a
    // regular argument, in addition to the vec4 color in the special register.
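The per-bit select given for MUX earlier in these ADD op notes is easy to check in software. The sketch below is a Python model, not driver code: the names `mux` and `mux_per_bit` are hypothetical, and the 32-bit register width is an assumption made only for this illustration. It compares the closed form `(src2 & src0) | (~src2 & src1)` against a literal bit-by-bit loop.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width for this model


def mux(src0: int, src1: int, src2: int) -> int:
    """Closed form from the notes: (src2 & src0) | (~src2 & src1)."""
    # Python ints are unbounded, so mask after ~ to stay within 32 bits.
    return ((src2 & src0) | (~src2 & src1)) & MASK32


def mux_per_bit(src0: int, src1: int, src2: int) -> int:
    """Literal reading: for each bit i, src2[i] ? src0[i] : src1[i]."""
    out = 0
    for i in range(32):
        select = (src2 >> i) & 1
        bit = (src0 >> i) & 1 if select else (src1 >> i) & 1
        out |= bit << i
    return out
```

With `src2 = 0xFFFF0000`, the high half of the result comes from `src0` and the low half from `src1`, so `mux(0xAAAAAAAA, 0x55555555, 0xFFFF0000)` is `0xAAAA5555`.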