==========================================
Design and Usage of the InAlloca Attribute
==========================================

Introduction
============

The :ref:`inalloca <attr_inalloca>` attribute is designed to allow
taking the address of an aggregate argument that is being passed by
value through memory. Primarily, this feature is required for
compatibility with the Microsoft C++ ABI. Under that ABI, class
instances that are passed by value are constructed directly into
argument stack memory. Prior to the addition of inalloca, calls in LLVM
were indivisible instructions. There was no way to perform intermediate
work, such as object construction, between the first stack adjustment
and the final control transfer. With inalloca, all arguments passed in
memory are modelled as a single alloca, which can be stored to prior to
the call. Unfortunately, this complicated feature comes with a large
set of restrictions designed to bound the lifetime of the argument
memory around the call.

For now, it is recommended that frontends and optimizers avoid producing
this construct, primarily because it forces the use of a base pointer.
This feature may grow in the future to allow general mid-level
optimization, but for now, it should be regarded as less efficient than
passing by value with a copy.

Intended Usage
==============

The example below is the intended LLVM IR lowering for some C++ code
that passes two default-constructed ``Foo`` objects to ``g`` in the
32-bit Microsoft C++ ABI.

.. code-block:: c++

    // Foo is non-trivial.
    struct Foo { int a, b; Foo(); ~Foo(); Foo(const Foo &); };
    void g(Foo a, Foo b);
    void f() {
      g(Foo(), Foo());
    }

.. code-block:: llvm

    %struct.Foo = type { i32, i32 }
    declare void @Foo_ctor(%struct.Foo* %this)
    declare void @Foo_dtor(%struct.Foo* %this)
    declare void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)

    define void @f() {
    entry:
      %base = call i8* @llvm.stacksave()
      ; The argument allocation itself must carry the inalloca keyword.
      %memargs = alloca inalloca <{ %struct.Foo, %struct.Foo }>
      ; b is the second field of the argument pack.
      %b = getelementptr <{ %struct.Foo, %struct.Foo }>, <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 1
      call void @Foo_ctor(%struct.Foo* %b)

      ; If a's ctor throws, we must destruct b.
      %a = getelementptr <{ %struct.Foo, %struct.Foo }>, <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 0
      invoke void @Foo_ctor(%struct.Foo* %a)
          to label %invoke.cont unwind label %invoke.unwind

    invoke.cont:
      call void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)
      call void @llvm.stackrestore(i8* %base)
      ...

    invoke.unwind:
      ; The landingpad instruction is elided here.
      call void @Foo_dtor(%struct.Foo* %b)
      call void @llvm.stackrestore(i8* %base)
      ...
    }

To avoid stack leaks, the frontend saves the current stack pointer with
a call to :ref:`llvm.stacksave <int_stacksave>`. Then, it allocates the
argument stack space with alloca and calls the default constructor for
``b``. The default constructor for ``a`` could throw an exception after
``b`` has been constructed, so the frontend has to create a landing pad.
The landing pad has to destroy the already constructed argument ``b``
before restoring the stack pointer. If the constructor does not unwind,
``g`` is called. In the Microsoft C++ ABI, ``g`` will destroy its
arguments, and then the stack is restored in ``f``.

Design Considerations
=====================

Lifetime
--------

The biggest design consideration for this feature is object lifetime.
We cannot model the arguments as static allocas in the entry block,
because all calls need to use the memory at the top of the stack to pass
arguments. We cannot vend pointers to that memory at function entry
because after code generation they will alias.

The rule against allocas between argument allocations and the call site
avoids this problem, but it creates a cleanup problem. Cleanup and
lifetime are handled explicitly with stack save and restore calls. In
the future, we may want to introduce a new construct such as ``freea``
or ``afree`` to make it clear that this stack adjusting cleanup is less
powerful than a full stack save and restore.

Nested Calls and Copy Elision
-----------------------------

We also want to be able to support copy elision into these argument
slots. This means we have to support multiple live argument
allocations.

Consider the evaluation of:

.. code-block:: c++

    // Foo is non-trivial.
    struct Foo { int a; Foo(); Foo(const Foo &); ~Foo(); };
    Foo bar(Foo b);
    int main() {
      bar(bar(Foo()));
    }

In this case, we want to be able to elide copies into ``bar``'s argument
slots. That means we need to have more than one set of argument frames
active at the same time. First, we need to allocate the frame for the
outer call so we can pass it in as the hidden struct return pointer to
the middle call. Then we do the same for the middle call, allocating a
frame and passing its address to ``Foo``'s default constructor. By
wrapping the evaluation of the inner ``bar`` with stack save and
restore, we can have multiple overlapping active call frames.

Callee-cleanup Calling Conventions
----------------------------------

Another wrinkle is the existence of callee-cleanup conventions. On
Windows, all methods and many other functions adjust the stack to clear
the memory used to pass their arguments. In some sense, this means that
the allocas are automatically cleared by the call.
However, LLVM models this as a write of undef to all of the inalloca
values passed to the call rather than as a stack adjustment. Frontends
should still restore the stack pointer to avoid a stack leak.

Exceptions
----------

There is also the possibility of an exception. If argument evaluation
or copy construction throws an exception, the landing pad must do
cleanup, which includes adjusting the stack pointer to avoid a stack
leak. This means the cleanup of the stack memory cannot be tied to the
call itself. There needs to be a separate IR-level instruction that can
perform independent cleanup of arguments.

Efficiency
----------

Eventually, it should be possible to generate efficient code for this
construct. In particular, using inalloca should not require a base
pointer. If the backend can prove that all points in the CFG only have
one possible stack level, then it can address the stack directly from
the stack pointer.
While this is not yet implemented, the plan is that
the inalloca attribute should not change much, but the frontend IR
generation recommendations may change.
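
As a rough illustration of the nested case described under "Nested Calls
and Copy Elision", the sketch below shows one way ``bar(bar(Foo()))``
might be lowered. It is not taken from a real frontend: the
``%bar.args`` type, the value names, and the choice to carry the hidden
struct return pointer as the first field of the argument pack are
assumptions made for illustration. Exception handling is omitted, so
every call is a plain ``call`` rather than an ``invoke``.

.. code-block:: llvm

    %struct.Foo = type { i32 }
    ; Hypothetical argument pack for bar: hidden return slot pointer, then b.
    %bar.args = type <{ %struct.Foo*, %struct.Foo }>
    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)
    declare void @Foo_ctor(%struct.Foo* %this)
    declare void @Foo_dtor(%struct.Foo* %this)
    declare %struct.Foo* @bar(%bar.args* inalloca %memargs)

    define i32 @main() {
    entry:
      ; Temporary that receives the result of the outer call.
      %result = alloca %struct.Foo

      ; Frame for the outer call. It stays live while the inner call runs.
      %outer.base = call i8* @llvm.stacksave()
      %outer.args = alloca inalloca %bar.args
      %outer.ret = getelementptr %bar.args, %bar.args* %outer.args, i32 0, i32 0
      %outer.b = getelementptr %bar.args, %bar.args* %outer.args, i32 0, i32 1
      store %struct.Foo* %result, %struct.Foo** %outer.ret

      ; Frame for the inner call, bracketed by its own save and restore.
      %inner.base = call i8* @llvm.stacksave()
      %inner.args = alloca inalloca %bar.args
      %inner.ret = getelementptr %bar.args, %bar.args* %inner.args, i32 0, i32 0
      %inner.b = getelementptr %bar.args, %bar.args* %inner.args, i32 0, i32 1
      ; The inner call returns directly into the outer call's b slot,
      ; which is what elides the copy.
      store %struct.Foo* %outer.b, %struct.Foo** %inner.ret
      call void @Foo_ctor(%struct.Foo* %inner.b)
      call %struct.Foo* @bar(%bar.args* inalloca %inner.args)
      call void @llvm.stackrestore(i8* %inner.base)

      ; The outer frame is at the top of the stack again.
      call %struct.Foo* @bar(%bar.args* inalloca %outer.args)
      call void @llvm.stackrestore(i8* %outer.base)

      ; The caller destroys the temporary returned by the outer call.
      call void @Foo_dtor(%struct.Foo* %result)
      ret i32 0
    }

Because the inner frame is released before the outer call is made, the
outer argument pack is once again the most recent live stack allocation
when it is passed to ``bar``.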