Code Generation Notes for MSA
=============================

Intrinsics are lowered to SelectionDAG nodes where possible in order to enable
optimisation, reduce the size of the ISel matcher, and reduce repetition in
the implementation. In a small number of cases, this can cause different
(semantically equivalent) instructions to be used in place of the requested
instruction, even when no optimisation has taken place.

Instructions
============

This section describes any quirks of instruction selection for MSA. For
example, two instructions might be equally valid for some given IR and one is
chosen in preference to the other.

bclri.b:
        It is not possible to emit bclri.b since andi.b covers exactly the
        same cases. andi.b should use fractionally less power than bclri.b in
        most hardware implementations so it is used in preference to bclri.b.
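
        The equivalence can be illustrated with a small Python model. This is
        only a sketch of the lane-wise semantics, not compiler code: the
        function names are hypothetical, and it assumes bclri.b clears bit m
        (0..7) in every byte lane while andi.b ANDs every byte lane with an
        8-bit immediate.

```python
def bclri_b(ws, m):
    """Model of bclri.b (assumed semantics): clear bit m in every byte lane."""
    if not 0 <= m <= 7:
        raise ValueError("bit index must be in 0..7 for byte lanes")
    return [b & ~(1 << m) & 0xFF for b in ws]


def andi_b(ws, imm8):
    """Model of andi.b (assumed semantics): AND every byte lane with imm8."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("immediate must fit in 8 bits")
    return [b & imm8 for b in ws]


def andi_imm_for_bclri(m):
    """Hypothetical helper: the andi.b immediate that clears bit m."""
    return ~(1 << m) & 0xFF


def check_all_bit_positions():
    """Exhaustively compare both models: 8 bit positions x 256 byte values."""
    for m in range(8):
        imm = andi_imm_for_bclri(m)
        for b in range(256):
            if bclri_b([b], m) != andi_b([b], imm):
                return False
    return True
```

        Every bit index has a matching 8-bit mask, which is why andi.b can
        stand in for bclri.b in all cases. The wider bclri.[hwd] forms have
        no such single-immediate replacement in this model.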

vshf.w:
        It is not possible to emit vshf.w when the shuffle description is
        constant since shf.w covers exactly the same cases. shf.w is used
        instead. It is also impossible for the shuffle description to be
        unknown at compile time due to the definition of shufflevector in
        LLVM IR.

vshf.[bhwd]:
        When the shuffle description describes a splat operation, splat.[bhwd]
        instructions will be selected instead of vshf.[bhwd]. Unlike the ilv*
        and pck* instructions, this is matched from MipsISD::VSHF instead of
        a special-case MipsISD node.

ilvl.d, pckev.d:
        It is not possible to emit ilvl.d or pckev.d since ilvev.d covers the
        same shuffle. ilvev.d will be emitted instead.

ilvr.d, ilvod.d, pckod.d:
        It is not possible to emit ilvr.d or pckod.d since ilvod.d covers the
        same shuffle. ilvod.d will be emitted instead.

splat.[bhwd]:
        The intrinsic will work as expected.
        However, unlike other intrinsics it lowers directly to MipsISD::VSHF
        instead of using common IR.

splati.w:
        It is not possible to emit splati.w since shf.w covers the same cases.
        shf.w will be emitted instead.

copy_s.w:
        On MIPS32, the copy_u.d intrinsic will emit this instruction instead of
        copy_u.w. This is semantically equivalent since the general-purpose
        register file is 32 bits wide.

binsri.[bhwd], binsli.[bhwd]:
        These two operations are equivalent to each other with the operands
        swapped and condition inverted. The compiler may use either one as
        appropriate.
        Furthermore, the compiler may use bsel.v for some masks that do
        not survive the legalization process (this is a bug and will be fixed).

bmnz.v, bmz.v, bsel.v:
        These three operations differ only in the operand that is tied to the
        result and the order of the operands.
        It is (currently) not possible to emit bmz.v or bsel.v since bmnz.v is
        the same operation and will be emitted instead.
        In future, the compiler may choose between these three instructions
        according to register allocation.
        These three operations can be very confusing so here is a mapping
        between the instructions and the vselect node in one place:
                bmz.v  wd, ws, wt/i8 -> (vselect wt/i8, wd, ws)
                bmnz.v wd, ws, wt/i8 -> (vselect wt/i8, ws, wd)
                bsel.v wd, ws, wt/i8 -> (vselect wd, wt/i8, ws)

bmnzi.b, bmzi.b:
        Like their non-immediate counterparts, bmnzi.b and bmzi.b are the same
        operation with the operands swapped. bmnzi.b will (currently) be emitted
        for both cases.

bseli.b:
        Unlike the non-immediate versions, bseli.b is distinguishable from
        bmnzi.b and bmzi.b and can be emitted.
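
The relationship between bmnz.v, bmz.v and bsel.v can be checked with a small
Python model. This is only a sketch under stated assumptions: each 128-bit
register is modelled as a Python integer, vselect is treated as a bitwise
select (first operand is the mask), and all function names are hypothetical
rather than taken from the compiler.

```python
MASK128 = (1 << 128) - 1  # assumption: one MSA register modelled as a 128-bit int


def vselect(cond, if_set, if_clear):
    """Bitwise select: take if_set where a cond bit is 1, otherwise if_clear."""
    return ((cond & if_set) | (~cond & if_clear)) & MASK128


def bmz_v(wd, ws, wt):
    """bmz.v wd, ws, wt -> (vselect wt, wd, ws)"""
    return vselect(wt, wd, ws)


def bmnz_v(wd, ws, wt):
    """bmnz.v wd, ws, wt -> (vselect wt, ws, wd)"""
    return vselect(wt, ws, wd)


def bsel_v(wd, ws, wt):
    """bsel.v wd, ws, wt -> (vselect wd, wt, ws)"""
    return vselect(wd, wt, ws)


def bmz_via_bmnz(wd, ws, wt):
    """bmz.v expressed as bmnz.v with the two data operands swapped."""
    return bmnz_v(ws, wd, wt)


def bsel_via_bmnz(wd, ws, wt):
    """bsel.v expressed as bmnz.v with the operands rotated."""
    return bmnz_v(ws, wt, wd)
```

In this model all three instructions reduce to the same select with the
operands permuted, which is consistent with bmnz.v being able to stand in for
the other two. On real hardware the first operand is also the destination, so
which form is cheapest depends on which value may be overwritten.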