//===----------------------------------------------------------------------===//
// Representing sign/zero extension of function results
//===----------------------------------------------------------------------===//

Mar 25, 2009  - Initial Revision

Most ABIs specify that functions which return small integers do so in a
specific integer GPR.  This is an efficient way to go, but raises the question:
if the returned value is smaller than the register, what do the high bits hold?

There are three (interesting) possible answers: undefined, zero extended, or
sign extended.  The number of bits in question depends on the data-type that
the front-end is referencing (typically i1/i8/i16/i32).

Knowing the answer to this is important for two reasons: 1) we want to be able
to implement the ABI correctly.  If we need to sign extend the result according
to the ABI, we really really do need to do this to preserve correctness.  2)
this information is often useful for optimization purposes, and we want the
mid-level optimizers to be able to process this (e.g. eliminate redundant
extensions).
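
To make the three answers concrete, here is a minimal C sketch that models a
32-bit return register carrying a 16-bit result.  The helper names
(zero_extend16, sign_extend16, low16) and the 32-bit register width are
assumptions made for illustration; nothing here is LLVM API.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: the high 16 bits of the register are cleared. */
static uint32_t zero_extend16(uint16_t v) {
    return (uint32_t)v;
}

/* Sign extension: the high 16 bits are copies of bit 15.
   Done in unsigned arithmetic so the result is well defined in C. */
static uint32_t sign_extend16(uint16_t v) {
    uint32_t low = v;
    return (low ^ 0x8000u) - 0x8000u;
}

/* "Undefined" high bits: a consumer may only rely on the low 16 bits,
   so it has to mask (or re-extend) before using the value. */
static uint16_t low16(uint32_t reg) {
    return (uint16_t)(reg & 0xFFFFu);
}

/* Hypothetical self-check: the i16 value -1 (0xFFFF) under each convention. */
static void check_extension_conventions(void) {
    assert(zero_extend16(0xFFFFu) == 0x0000FFFFu);
    assert(sign_extend16(0xFFFFu) == 0xFFFFFFFFu);
    assert(sign_extend16(0x7FFFu) == 0x00007FFFu);
    assert(low16(0xDEADFFFFu) == 0xFFFFu);
}
```

The same 16-bit pattern therefore yields three different register contents,
which is why the consumer of the result has to know which convention applies.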

For example, let's pretend that X86 requires the callee to properly extend the
result of a return (I'm not sure this is the case, but the argument doesn't
depend on this).  Given this, we should compile this:

int a();
short b() { return a(); }

into:

_b:
        subl    $12, %esp
        call    L_a$stub
        addl    $12, %esp
        cwtl
        ret

An optimization example is that we should be able to eliminate the explicit
sign extension in this example:

short y();
int z() {
  return ((int)y() << 16) >> 16;
}

_z:
        subl    $12, %esp
        call    _y
        ;;  movswl %ax, %eax   -> not needed because eax is already sext'd
        addl    $12, %esp
        ret
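
The difference between the two examples can be checked with a small C model of
the return register.  This is only a sketch: sext16_in_reg and is_sext16 are
hypothetical helpers that mimic what cwtl/movswl do to the low 16 bits of a
32-bit register, and the sample value 0x00018000 is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of a 32-bit register value,
   discarding whatever the high 16 bits held before. */
static uint32_t sext16_in_reg(uint32_t reg) {
    uint32_t low = reg & 0xFFFFu;
    return (low ^ 0x8000u) - 0x8000u;
}

/* Returns 1 if reg is already sign extended from 16 bits. */
static int is_sext16(uint32_t reg) {
    return sext16_in_reg(reg) == reg;
}

static void check_b_and_z(void) {
    /* b(): a() may return any 32-bit value, so b must extend explicitly. */
    uint32_t from_a = 0x00018000u;
    assert(!is_sext16(from_a));
    assert(sext16_in_reg(from_a) == 0xFFFF8000u);

    /* z(): if y() honours the convention, extending again changes nothing,
       for every possible 16-bit result. */
    for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
        uint32_t from_y = sext16_in_reg(v);
        assert(sext16_in_reg(from_y) == from_y);
    }
}
```

In other words, the extension in b is required for correctness, while the one
in z is a no-op as long as y keeps its side of the contract.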

//===----------------------------------------------------------------------===//
// What we have right now.
//===----------------------------------------------------------------------===//

Currently, these sorts of things are modelled by compiling a function to return
the small type and attaching a signext/zeroext marker.  For example, we compile
z into:

define i32 @z() nounwind {
entry:
  %0 = tail call signext i16 (...)* @y() nounwind
  %1 = sext i16 %0 to i32
  ret i32 %1
}

and b into:

define signext i16 @b() nounwind {
entry:
  %0 = tail call i32 (...)* @a() nounwind          ; <i32> [#uses=1]
  %retval12 = trunc i32 %0 to i16                  ; <i16> [#uses=1]
  ret i16 %retval12
}

This has some problems: 1) the actual precise semantics are really poorly
defined (see PR3779).  2) some targets might want the caller to extend, some
might want the callee to extend.  3) the mid-level optimizer doesn't know the
size of the GPR, so it doesn't know that %0 is sign extended up to 32-bits
here, and even if it did, it could not eliminate the sext.  4) the code
generator has historically assumed that the result is extended to i32, which is
a problem on PIC16 (and is also probably wrong on alpha and other 64-bit
targets).

//===----------------------------------------------------------------------===//
// The proposal
//===----------------------------------------------------------------------===//

I suggest that we have the front-end fully lower out the ABI issues here to
LLVM IR.  This makes it 100% explicit what is going on and means that there is
no cause for confusion.
For example, the cases above should compile into:

define i32 @z() nounwind {
entry:
  %0 = tail call i32 (...)* @y() nounwind
  %1 = trunc i32 %0 to i16
  %2 = sext i16 %1 to i32
  ret i32 %2
}
define i32 @b() nounwind {
entry:
  %0 = tail call i32 (...)* @a() nounwind
  %retval12 = trunc i32 %0 to i16
  %tmp = sext i16 %retval12 to i32
  ret i32 %tmp
}

In this model, no functions will return an i1/i8/i16 (and on an x86-64 target
that extends results to i64, no i32).  This solves the ambiguity issue, allows
us to fully describe all possible ABIs, and now allows the optimizers to reason
about and eliminate these extensions.

The one thing that is missing is the ability for the front-end and optimizer to
specify/infer the guarantees provided by the ABI to allow other optimizations.
For example, in the y/z case, since y is known to return a sign extended value,
the trunc/sext in z should be eliminable.

This can be done by introducing new sext/zext attributes which mean "I know
that the result of the function is sign extended at least N bits."  Given this,
and given that the attribute is attached to the y function, the mid-level
optimizer could easily eliminate the extensions etc with existing
functionality.

The major disadvantage of doing this sort of thing is that it makes the ABI
lowering stuff even more explicit in the front-end, and that we would like to
eventually move to having the code generator do more of this work.  However,
the sad truth of the matter is that a) that move is unlikely to happen anytime
in the near future, and b) this proposal is no worse than what we have now with
the existing attributes.

C compilers fundamentally have to reason about the target in many ways.
This is ugly and horrible, but a fact of life.
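
One way to picture how a "sign extended at least N bits" guarantee would let
the optimizer drop a trunc/sext pair is sketched below in C.  The
interpretation is an assumption: here the guarantee is read as "the top N bits
of the 32-bit result are all equal", and num_sign_bits and
trunc_sext_is_redundant are hypothetical helpers, not existing LLVM functions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading bits equal to the sign bit (bit 31), counting bit 31. */
static unsigned num_sign_bits(uint32_t reg) {
    uint32_t sign = reg >> 31;
    unsigned n = 1;
    for (int bit = 30; bit >= 0; --bit) {
        if (((reg >> bit) & 1u) != sign)
            break;
        ++n;
    }
    return n;
}

/* trunc to narrow_width followed by sext back to 32 bits is a no-op when
   bits [narrow_width-1, 31] are known equal, i.e. when at least
   32 - narrow_width + 1 sign bits are known. */
static int trunc_sext_is_redundant(unsigned known_sign_bits,
                                   unsigned narrow_width) {
    return known_sign_bits >= 32u - narrow_width + 1u;
}

/* Sign-extend the low 16 bits of a 32-bit value (well-defined unsigned math). */
static uint32_t sext16(uint32_t reg) {
    uint32_t low = reg & 0xFFFFu;
    return (low ^ 0x8000u) - 0x8000u;
}

static void check_sign_bit_rule(void) {
    /* Every properly extended i16 result carries at least 17 sign bits. */
    for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
        uint32_t reg = sext16(v);
        assert(num_sign_bits(reg) >= 17u);
        assert(trunc_sext_is_redundant(num_sign_bits(reg), 16u));
    }
    /* A made-up value that is not extended fails the test. */
    assert(!trunc_sext_is_redundant(num_sign_bits(0x00018000u), 16u));
}
```

Under this reading, an attribute on y promising 17 sign bits would be enough
for the trunc/sext in z to be removed, and a stronger promise would also cover
narrower types such as i8.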