==========================================
Design and Usage of the InAlloca Attribute
==========================================

Introduction
============

The :ref:`inalloca <attr_inalloca>` attribute is designed to allow
taking the address of an aggregate argument that is being passed by
value through memory.  Primarily, this feature is required for
compatibility with the Microsoft C++ ABI.  Under that ABI, class
instances that are passed by value are constructed directly into
argument stack memory.  Prior to the addition of inalloca, calls in LLVM
were indivisible instructions.  There was no way to perform intermediate
work, such as object construction, between the first stack adjustment
and the final control transfer.  With inalloca, all arguments passed in
memory are modelled as a single alloca, which can be stored to prior to
the call.  Unfortunately, this complicated feature comes with a large
set of restrictions designed to bound the lifetime of the argument
memory around the call.
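
As a source-level illustration of why the address of a by-value argument
matters, the following sketch records the address seen by the constructor
and compares it with the address of the parameter inside the callee.  The
``last_constructed`` member and the ``argument_constructed_in_place``
helper are hypothetical names introduced only for this example, and the
sketch assumes a compiler that constructs the argument in place (required
since C++17 for a prvalue argument of a non-trivial class type).

.. code-block:: c++

    #include <cassert>

    // Hypothetical non-trivial class: the user-provided copy constructor
    // and destructor force it to be passed in memory, not in registers.
    struct Foo {
      int a = 0, b = 0;
      // Address seen by the most recently run constructor (illustration only).
      static const Foo *last_constructed;
      Foo() { last_constructed = this; }
      Foo(const Foo &other) : a(other.a), b(other.b) { last_constructed = this; }
      ~Foo() {}
    };
    const Foo *Foo::last_constructed = nullptr;

    // Returns true when the parameter is the very object the caller
    // constructed, i.e. no copy was made on the way into the call.
    bool g(Foo arg) { return &arg == Foo::last_constructed; }

    // The prvalue Foo() initializes g's parameter directly.
    bool argument_constructed_in_place() {
      bool in_place = g(Foo());
      assert(in_place && "expected the argument to be built in its slot");
      return in_place;
    }

Because the constructor can observe and retain ``this``, a lowering that
built the object elsewhere and then copied it into the argument area
would be visible to the program.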

For now, it is recommended that frontends and optimizers avoid producing
this construct, primarily because it forces the use of a base pointer.
This feature may grow in the future to allow general mid-level
optimization, but for now, it should be regarded as less efficient than
passing by value with a copy.

Intended Usage
==============

The example below is the intended LLVM IR lowering for some C++ code
that passes two default-constructed ``Foo`` objects to ``g`` in the
32-bit Microsoft C++ ABI.

.. code-block:: c++

    // Foo is non-trivial.
    struct Foo { int a, b; Foo(); ~Foo(); Foo(const Foo &); };
    void g(Foo a, Foo b);
    void f() {
      g(Foo(), Foo());
    }

.. code-block:: llvm

    %struct.Foo = type { i32, i32 }
    declare void @Foo_ctor(%struct.Foo* %this)
    declare void @Foo_dtor(%struct.Foo* %this)
    declare void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)

    define void @f() {
    entry:
      %base = call i8* @llvm.stacksave()
      ; The argument allocation itself must carry the inalloca keyword.
      %memargs = alloca inalloca <{ %struct.Foo, %struct.Foo }>
      ; b is field 1 of the packed argument struct.
      %b = getelementptr <{ %struct.Foo, %struct.Foo }>, <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 1
      call void @Foo_ctor(%struct.Foo* %b)

      ; If a's ctor throws, we must destruct b.
      %a = getelementptr <{ %struct.Foo, %struct.Foo }>, <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 0
      invoke void @Foo_ctor(%struct.Foo* %a)
          to label %invoke.cont unwind label %invoke.unwind

    invoke.cont:
      call void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)
      call void @llvm.stackrestore(i8* %base)
      ...

    invoke.unwind:
      ; The exception-handling pad instruction is elided here.
      call void @Foo_dtor(%struct.Foo* %b)
      call void @llvm.stackrestore(i8* %base)
      ...
    }

To avoid stack leaks, the frontend saves the current stack pointer with
a call to :ref:`llvm.stacksave <int_stacksave>`.  Then, it allocates the
argument stack space with alloca and calls the default constructor.  The
default constructor could throw an exception, so the frontend has to
create a landing pad.  The frontend has to destroy the already
constructed argument ``b`` before restoring the stack pointer.  If the
constructor does not unwind, ``g`` is called.  In the Microsoft C++ ABI,
``g`` will destroy its arguments, and then the stack is restored in
``f``.
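
The source-level behaviour that the landing pad implements can be
sketched in plain C++.  In the sketch below the second ``Foo`` constructor
call throws, so the argument that was already constructed must be
destroyed and ``g`` must never run.  The counters, the ``g_was_called``
flag, and the ``call_g_with_failing_argument`` helper are hypothetical
names used only for this illustration.  The sketch does not depend on
which argument the compiler evaluates first.

.. code-block:: c++

    #include <cassert>
    #include <stdexcept>

    struct Foo {
      static int attempts;     // constructor calls, including the one that throws
      static int constructed;  // constructors that ran to completion
      static int destroyed;    // destructors that ran
      Foo() {
        // Fail on the second constructor call, whichever argument that is.
        if (++attempts == 2) throw std::runtime_error("second Foo() failed");
        ++constructed;
      }
      Foo(const Foo &) { ++constructed; }
      ~Foo() { ++destroyed; }
    };
    int Foo::attempts = 0;
    int Foo::constructed = 0;
    int Foo::destroyed = 0;

    bool g_was_called = false;
    void g(Foo, Foo) { g_was_called = true; }

    // Returns true if the exception escaped from argument evaluation.
    bool call_g_with_failing_argument() {
      try {
        g(Foo(), Foo());
      } catch (const std::runtime_error &) {
        // By the time the handler runs, unwinding has already destroyed
        // the one argument that was fully constructed.
        assert(Foo::destroyed == Foo::constructed);
        return true;
      }
      return false;
    }

In the IR above, the call to ``@Foo_dtor`` in ``invoke.unwind`` plays the
role of the destructor run during unwinding, and the
``llvm.stackrestore`` that follows it releases the argument memory.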

Design Considerations
=====================

Lifetime
--------

The biggest design consideration for this feature is object lifetime.
We cannot model the arguments as static allocas in the entry block,
because all calls need to use the memory at the top of the stack to pass
arguments.  We cannot vend pointers to that memory at function entry
because after code generation they will alias.

The rule against allocas between argument allocations and the call site
avoids this problem, but it creates a cleanup problem.  Cleanup and
lifetime are handled explicitly with stack save and restore calls.  In
the future, we may want to introduce a new construct such as ``freea``
or ``afree`` to make it clear that this stack adjusting cleanup is less
powerful than a full stack save and restore.
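
The save, allocate, restore discipline can be pictured with a toy model.
The ``ToyStack`` class and the ``leaked_after_call`` helper below are
hypothetical and are not LLVM APIs.  They only mimic a downward-growing
stack pointer, to show that skipping the restore leaves the argument
block allocated.

.. code-block:: c++

    #include <cassert>
    #include <cstddef>

    // Toy model only: a downward-growing stack pointer with save/restore.
    class ToyStack {
     public:
      explicit ToyStack(std::size_t size) : sp_(size) {}

      // Models llvm.stacksave: remember the current stack level.
      std::size_t save() const { return sp_; }

      // Models a dynamic alloca: the new block becomes the top of the stack.
      std::size_t alloca_bytes(std::size_t n) {
        assert(n <= sp_ && "toy stack overflow");
        sp_ -= n;
        return sp_;
      }

      // Models llvm.stackrestore: release everything allocated since save().
      void restore(std::size_t saved) {
        assert(saved >= sp_ && "restore may only release memory");
        sp_ = saved;
      }

      std::size_t sp() const { return sp_; }

     private:
      std::size_t sp_;
    };

    // Mirrors the shape of f(): save, allocate the argument block, call,
    // restore.  Returns the number of bytes still allocated afterwards.
    std::size_t leaked_after_call(ToyStack &stack, bool restore_after_call) {
      const std::size_t before = stack.sp();
      const std::size_t base = stack.save();
      stack.alloca_bytes(16);  // room for two 8-byte Foo objects
      // ... the call to g would happen here ...
      if (restore_after_call) stack.restore(base);
      return before - stack.sp();
    }

A restore to an earlier level releases every block allocated after the
matching save, which is why it is more powerful than a construct that
frees a single argument allocation.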

Nested Calls and Copy Elision
-----------------------------

We also want to be able to support copy elision into these argument
slots.  This means we have to support multiple live argument
allocations.

Consider the evaluation of:

.. code-block:: c++

    // Foo is non-trivial.
    struct Foo { int a; Foo(); Foo(const Foo &); ~Foo(); };
    Foo bar(Foo b);
    int main() {
      bar(bar(Foo()));
    }

In this case, we want to be able to elide copies into ``bar``'s argument
slots.  That means we need to have more than one set of argument frames
active at the same time.  First, we need to allocate the frame for the
outer call so we can pass it in as the hidden struct return pointer to
the middle call.  Then we do the same for the middle call, allocating a
frame and passing its address to ``Foo``'s default constructor.  By
wrapping the evaluation of the inner ``bar`` with stack save and
restore, we can have multiple overlapping active call frames.
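
The elision described above can be observed at the source level.  The
sketch below supplies a hypothetical definition of ``bar`` that returns a
fresh ``Foo``, and adds counters and an address check that are not part
of the original example.  It assumes a compiler that elides these copies
(required since C++17), so that the object returned by the inner call is
constructed directly in the outer call's argument slot.

.. code-block:: c++

    #include <cassert>

    struct Foo {
      int a = 0;
      static int default_constructions;
      static int copies;
      // Address seen by the most recently run constructor (illustration only).
      static const Foo *last_constructed;
      Foo() { ++default_constructions; last_constructed = this; }
      Foo(const Foo &other) : a(other.a) { ++copies; last_constructed = this; }
      ~Foo() {}
    };
    int Foo::default_constructions = 0;
    int Foo::copies = 0;
    const Foo *Foo::last_constructed = nullptr;

    // Number of bar calls whose parameter was the most recently
    // constructed Foo, i.e. was built directly in the argument slot.
    int matching_slots = 0;

    // Hypothetical definition: the original only declares bar.
    Foo bar(Foo b) {
      if (&b == Foo::last_constructed) ++matching_slots;
      return Foo();  // constructed directly in the caller-provided return slot
    }

    void nested_call() {
      bar(bar(Foo()));
      // Every Foo was default-constructed in place; none was copied.
      assert(Foo::copies == 0);
    }

While the inner call runs, the memory that will hold the outer call's
argument already has to exist, because the inner call writes its return
value there.  This is the situation that requires more than one live
argument allocation.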

Callee-cleanup Calling Conventions
----------------------------------

Another wrinkle is the existence of callee-cleanup conventions.  On
Windows, all methods and many other functions adjust the stack to clear
the memory used to pass their arguments.  In some sense, this means that
the allocas are automatically cleared by the call.  However, LLVM
models this as a write of undef to all of the inalloca values passed to
the call rather than as a stack adjustment.  Frontends should still
restore the stack pointer to avoid a stack leak.

Exceptions
----------

There is also the possibility of an exception.  If argument evaluation
or copy construction throws an exception, the landing pad must do
cleanup, which includes adjusting the stack pointer to avoid a stack
leak.  This means the cleanup of the stack memory cannot be tied to the
call itself.  There needs to be a separate IR-level instruction that can
perform independent cleanup of arguments.

Efficiency
----------

Eventually, it should be possible to generate efficient code for this
construct.  In particular, using inalloca should not require a base
pointer.  If the backend can prove that all points in the CFG only have
one possible stack level, then it can address the stack directly from
the stack pointer.  While this is not yet implemented, the plan is that
the inalloca attribute should not change much, but the frontend IR
generation recommendations may change.