//===---------------------------------------------------------------------===//
// Random notes about and ideas for the SystemZ backend.
//===---------------------------------------------------------------------===//

The initial backend is deliberately restricted to z10. We should add support
for later architectures at some point.

--

If an inline asm ties an i32 "r" result to an i64 input, the input
will be treated as an i32, leaving the upper bits uninitialised.
For example:

define void @f4(i32 *%dst) {
  %val = call i32 asm "blah $0", "=r,0" (i64 103)
  store i32 %val, i32 *%dst
  ret void
}

from CodeGen/SystemZ/asm-09.ll will use LHI rather than LGHI
to load 103. This seems to be a general target-independent problem.

--

The tuning of the choice between LOAD ADDRESS (LA) and addition in
SystemZISelDAGToDAG.cpp is suspect.
It should be tweaked based on
performance measurements.

--

There is no scheduling support.

--

We don't use the BRANCH ON INDEX instructions.

--

We only use MVC, XC and CLC for constant-length block operations.
We could extend them to variable-length operations too,
using EXECUTE RELATIVE LONG.

MVCIN, MVCLE and CLCLE may be worthwhile too.

--

We don't use CUSE or the TRANSLATE family of instructions for string
operations. The TRANSLATE ones are probably more difficult to exploit.

--

We don't take full advantage of builtins like fabsl because the calling
conventions require f128s to be returned by invisible reference.
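
As a rough illustration of the fabsl point, the C sketch below contrasts
the source-level function with a hand-written model of what returning by
invisible reference implies. The names fabs_source and fabs_lowered_model
are hypothetical, and the second function only approximates the effect of
the calling convention; it is not the backend's actual output.

```c
#include <math.h>

/* Source-level view: a thin wrapper around the builtin. */
long double fabs_source(long double x)
{
  return fabsl(x);
}

/* Hypothetical model of the lowered form: the caller supplies a hidden
   pointer and the result is written through it, so the value goes via
   memory instead of staying in registers. */
void fabs_lowered_model(long double *result, long double x)
{
  *result = fabsl(x);
}
```

Both functions compute the same value; the difference is only in how the
result reaches the caller.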

--

ADD LOGICAL WITH SIGNED IMMEDIATE could be useful when we need to
produce a carry. SUBTRACT LOGICAL IMMEDIATE could be useful when we
need to produce a borrow. (Note that there are no memory forms of
ADD LOGICAL WITH CARRY and SUBTRACT LOGICAL WITH BORROW, so the high
part of 128-bit memory operations would probably need to be done
via a register.)

--

We don't use ICM or STCM.

--

DAGCombiner doesn't yet fold truncations of extended loads.
Functions like:

    unsigned long f (unsigned long x, unsigned short *y)
    {
      return (x << 32) | *y;
    }

therefore end up as:

        sllg    %r2, %r2, 32
        llgh    %r0, 0(%r3)
        lr      %r2, %r0
        br      %r14

but truncating the load would give:

        sllg    %r2, %r2, 32
        lh      %r2, 0(%r3)
        br      %r14

--

Functions like:

define i64 @f1(i64 %a) {
  %and = and i64 %a, 1
  ret i64 %and
}

ought to be implemented as:

        lhi     %r0, 1
        ngr     %r2, %r0
        br      %r14

but two-address optimizations reverse the order of the AND and force:

        lhi     %r0, 1
        ngr     %r0, %r2
        lgr     %r2, %r0
        br      %r14

CodeGen/SystemZ/and-04.ll has several examples of this.

--

Out-of-range displacements are usually handled by loading the full
address into a register. In many cases it would be better to create
an anchor point instead. E.g. for:

define void @f4a(i128 *%aptr, i64 %base) {
  %addr = add i64 %base, 524288
  %bptr = inttoptr i64 %addr to i128 *
  %a = load volatile i128 *%aptr
  %b = load i128 *%bptr
  %add = add i128 %a, %b
  store i128 %add, i128 *%aptr
  ret void
}

(from CodeGen/SystemZ/int-add-08.ll) we load %base+524288 and %base+524296
into separate registers, rather than using %base+524288 as a base for both.
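
The arithmetic behind the anchor-point idea can be sketched in C. The
sketch assumes a signed 20-bit displacement field (range -524288 to
524287), which is why both halves of the i128 load are out of range when
addressed from %base. The helper names fits_disp20 and rebase are
hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits of a signed 20-bit displacement. */
#define DISP20_MIN (-524288)
#define DISP20_MAX 524287

/* True if the displacement can be encoded directly in the instruction. */
bool fits_disp20(int64_t disp)
{
  return disp >= DISP20_MIN && disp <= DISP20_MAX;
}

/* Displacement of an access once it is expressed relative to an anchor
   register that holds base + anchor. */
int64_t rebase(int64_t offset, int64_t anchor)
{
  return offset - anchor;
}
```

With an anchor at %base+524288, the two 8-byte halves would sit at
displacements 0 and 8 from a single register.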

--

Dynamic stack allocations round the size to 8 bytes and then allocate
that rounded amount. It would be simpler to subtract the unrounded
size from the copy of the stack pointer and then align the result.
See CodeGen/SystemZ/alloca-01.ll for an example.

--

If needed, we can support 16-byte atomics using LPQ, STPQ and CSDG.

--

We might want to model all access registers and use them to spill
32-bit values.

--

We might want to use the 'overflow' condition of e.g. AR to support
llvm.sadd.with.overflow.i32 and related instructions - the generated code
for signed overflow check is currently quite bad. This would improve
the results of using -ftrapv.
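
For reference, a source-level construct that exercises this path can be
sketched in C. The sketch uses the GCC/Clang builtin
__builtin_sadd_overflow, which Clang is expected to lower to
llvm.sadd.with.overflow.i32 for int operands; checked_add is a
hypothetical wrapper name.

```c
#include <stdbool.h>

/* Returns true if a + b overflows a signed int. The wrapped
   (two's complement) sum is stored in *sum either way. */
bool checked_add(int a, int b, int *sum)
{
  return __builtin_sadd_overflow(a, b, sum);
}
```

A backend that used the condition code set by the add could branch on
overflow directly instead of recomputing the check.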