Mechanisms for Coordination Between Garbage Collector and Mutator
-----------------------------------------------------------------

Most garbage collection work can proceed concurrently with the client or
mutator Java threads. But in certain places, for example while tracing from
thread stacks, the garbage collector needs to ensure that Java data processed
by the collector is consistent and complete. At these points, the mutators
should not hold references to the heap that are invisible to the garbage
collector. And they should not be modifying the data that is visible to the
collector.

Logically, the collector and mutator share a reader-writer lock on the Java
heap and associated data structures. Mutators hold the lock in reader or shared mode
while running Java code or touching heap-related data structures. The collector
holds the lock in writer or exclusive mode while it needs the heap data
structures to be stable. However, this reader-writer lock has a very customized
implementation that also provides additional facilities, such as the ability
to exclude only a single thread, so that we can specifically examine its heap
references.

In order to ensure consistency of the Java data, the compiler inserts "suspend
points", sometimes also called "safe points", into the code. These allow a thread
to respond to external requests.

Whenever a thread is runnable, i.e. whenever a thread logically holds the
mutator lock in shared mode, it is expected to regularly execute such a suspend
point, and check for pending requests. Requests are currently signaled by
setting a flag in the thread structure[^1], which is then explicitly tested by the
compiler-generated code.
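
As an illustration of the flag-based scheme, the following is a minimal
single-threaded sketch, not ART's code. `ToyThread`, the `kToy...` flag names,
and `RunLoop()` are hypothetical; the loop stands in for compiled code that
executes one suspend point per iteration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag bits, loosely modeled on the kinds of request named in
// this document. These are not ART's actual definitions.
enum ToyFlag : uint32_t {
  kToySuspendRequest = 1u << 0,
  kToyCheckpointRequest = 1u << 1,
};

struct ToyThread {
  std::atomic<uint32_t> flags{0};
  int requests_handled = 0;

  // Slow path: entered only when a request flag was observed.
  void HandlePendingRequests() {
    // Atomically take all pending requests so that none is lost.
    uint32_t pending = flags.exchange(0);
    if (pending != 0) {
      ++requests_handled;
    }
  }

  // The "suspend point": a cheap test that generated code executes regularly.
  void SuspendPoint() {
    if (flags.load(std::memory_order_relaxed) != 0) {
      HandlePendingRequests();
    }
  }
};

// Stand-in for compiled Java code: one suspend point per loop iteration.
// `request_at` is the iteration at which some other party posts a request;
// a negative value means that no request is posted.
int RunLoop(ToyThread& t, int iterations, int request_at) {
  assert(iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    if (i == request_at) {
      t.flags.fetch_or(kToySuspendRequest);
    }
    t.SuspendPoint();
  }
  return t.requests_handled;
}
```

The fast path is a single load and branch, which is why a thread can afford to
execute it frequently while runnable.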

A thread responds to suspend requests only when it is "runnable", i.e. logically
running Java code. When it runs native code, or is blocked in a kernel call, it
logically releases the mutator lock. When the garbage collector needs mutator
cooperation, and the thread is not runnable, it is assured that the mutator is
not touching Java data, and hence the collector can safely perform the required
action itself, on the mutator thread's behalf.

Normally, when a thread makes a JNI call, it is not considered runnable while
executing native code. This makes the transitions to and from running native JNI
code somewhat expensive (see below). But these transitions are necessary to
ensure that such code, which does not execute "suspend points", and thus cannot
cooperate with the GC, doesn't delay GC completion. `@FastNative` and
`@CriticalNative` calls avoid these transitions, instead allowing the thread to
remain "runnable", at the expense of potentially delaying GC operations for the
duration of the call.

Although we say that a thread is "suspended" when it is not running Java code,
it may in fact still be running native code and touching data structures that
are not considered "Java data". This distinction can be a fine line. For
example, a Java thread blocked on a Java monitor will normally be "suspended"
and blocked on a mutex contained in the monitor data structure. But it may wake
up for reasons beyond ART's control, which will normally result in touching the
mutex. The monitor code must be quite careful to ensure that this does not cause
problems, especially if the ART runtime was shut down in the interim and the
monitor data structure has been reclaimed.

Calls to change thread state
----------------------------

When a thread changes between running Java and native code, it has to
correspondingly change its state between "runnable" and one of several
other states, all of which are considered to be "suspended" for our purposes.
When a Java thread starts to execute native code, and may thus not respond
promptly to suspend requests, it will normally create an object of type
`ScopedThreadSuspension`. `ScopedThreadSuspension`'s constructor changes state to
the "suspended" state given as an argument, logically releasing the mutator lock
and promising to no longer touch Java data structures. It also handles any
pending suspension requests that slid in just before it changed state.

Conversely, `ScopedThreadSuspension`'s destructor waits until the GC has finished
any actions it is currently performing on the thread's behalf and effectively
released the mutator exclusive lock, and then returns to runnable state,
re-acquiring the mutator lock.

Occasionally a thread running native code needs to temporarily access Java
data structures again, performing the above transitions in the opposite order.
`ScopedObjectAccess` is a similar RAII object whose constructor and destructor
perform those transitions in the reverse order from `ScopedThreadSuspension`.
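
The mirrored behavior of the two RAII objects can be sketched with a toy model.
Everything here is hypothetical (`ToyThread`, `ToyState`, and both `ToyScoped...`
classes); in particular the toy has no GC, so the destructor of the suspension
object does not wait for anything, unlike the real one.

```cpp
#include <cassert>

// Hypothetical thread states; ART distinguishes many more "suspended" states.
enum class ToyState { kRunnable, kNative, kWaiting };

struct ToyThread {
  ToyState state = ToyState::kNative;
};

// Modeled on ScopedThreadSuspension: runnable -> suspended for the scope.
class ToyScopedThreadSuspension {
 public:
  ToyScopedThreadSuspension(ToyThread& self, ToyState suspended_state)
      : self_(self) {
    assert(self_.state == ToyState::kRunnable);
    assert(suspended_state != ToyState::kRunnable);
    self_.state = suspended_state;  // Logically releases the mutator lock.
  }
  ~ToyScopedThreadSuspension() {
    // The real destructor first waits for the GC to finish acting on the
    // thread's behalf. The toy simply switches back.
    self_.state = ToyState::kRunnable;  // Logically re-acquires the lock.
  }
  ToyScopedThreadSuspension(const ToyScopedThreadSuspension&) = delete;
  ToyScopedThreadSuspension& operator=(const ToyScopedThreadSuspension&) = delete;

 private:
  ToyThread& self_;
};

// Modeled on ScopedObjectAccess: suspended -> runnable for the scope.
class ToyScopedObjectAccess {
 public:
  explicit ToyScopedObjectAccess(ToyThread& self)
      : self_(self), old_state_(self.state) {
    assert(old_state_ != ToyState::kRunnable);
    self_.state = ToyState::kRunnable;  // Logically acquires the mutator lock.
  }
  ~ToyScopedObjectAccess() { self_.state = old_state_; }
  ToyScopedObjectAccess(const ToyScopedObjectAccess&) = delete;
  ToyScopedObjectAccess& operator=(const ToyScopedObjectAccess&) = delete;

 private:
  ToyThread& self_;
  ToyState old_state_;
};

// Checks the state observed at each nesting level.
bool NestedTransitionsBehave() {
  ToyThread t;  // Starts out in native code.
  {
    ToyScopedObjectAccess soa(t);
    if (t.state != ToyState::kRunnable) return false;
    {
      ToyScopedThreadSuspension sts(t, ToyState::kWaiting);
      if (t.state != ToyState::kWaiting) return false;
    }
    if (t.state != ToyState::kRunnable) return false;
  }
  return t.state == ToyState::kNative;
}
```

Tying the transitions to scopes means that an early return or exception cannot
leave a thread in the wrong state.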

Mutator lock implementation
---------------------------

The mutator lock is not implemented as a conventional mutex. But it plays by the
rules of our normal static thread-safety analysis. Thus a function that is
expected to be called in runnable state, with the ability to access Java data,
should be annotated with `REQUIRES_SHARED(Locks::mutator_lock_)`.
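
The annotation style can be tried outside ART with Clang's thread safety
attributes. In this sketch all `TOY_` macros, `ToyMutatorLock`, and the
functions are hypothetical stand-ins; ART's own macros and `MutatorMutex` are
more elaborate. On compilers other than Clang the macros expand to nothing.

```cpp
#include <cassert>

#if defined(__clang__)
#define TOY_TSA(x) __attribute__((x))
#else
#define TOY_TSA(x)  // No-op where the attributes are unsupported.
#endif
#define TOY_CAPABILITY(x) TOY_TSA(capability(x))
#define TOY_REQUIRES_SHARED(...) TOY_TSA(requires_shared_capability(__VA_ARGS__))
#define TOY_ACQUIRE_SHARED(...) TOY_TSA(acquire_shared_capability(__VA_ARGS__))
#define TOY_RELEASE_SHARED(...) TOY_TSA(release_shared_capability(__VA_ARGS__))

// A toy lock that only counts readers. It provides no real exclusion.
class TOY_CAPABILITY("mutex") ToyMutatorLock {
 public:
  void SharedLock() TOY_ACQUIRE_SHARED() { ++readers_; }
  void SharedUnlock() TOY_RELEASE_SHARED() {
    assert(readers_ > 0);
    --readers_;
  }
  int readers() const { return readers_; }

 private:
  int readers_ = 0;
};

ToyMutatorLock g_toy_mutator_lock;
int g_toy_heap_word = 41;  // Stand-in for "Java data".

// Callers must hold the toy lock at least in shared mode.
int ReadHeapWord() TOY_REQUIRES_SHARED(g_toy_mutator_lock) {
  return g_toy_heap_word;
}

// Acquires the lock around the annotated call, as the analysis expects.
int ReadWithLock() {
  g_toy_mutator_lock.SharedLock();
  int value = ReadHeapWord();
  g_toy_mutator_lock.SharedUnlock();
  return value;
}
```

Compiling with Clang and `-Wthread-safety` should flag a call to
`ReadHeapWord()` from a function that neither holds the lock nor carries the
same annotation.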

There is an explicit `mutator_lock_` object, of type `MutatorMutex`. `MutatorMutex` is
seemingly a minor refinement of `ReaderWriterMutex`, but it is used entirely
differently. It is acquired explicitly by clients that need to hold it
exclusively, and in a small number of cases, it is acquired in shared mode, e.g.
via `SharedTryLock()`, or by the GC itself. However, more commonly
`MutatorMutex::TransitionFromSuspendedToRunnable()` is used to logically acquire
the mutator mutex, e.g. as part of `ScopedObjectAccess` construction.

`TransitionFromSuspendedToRunnable()` does not physically acquire the
`ReaderWriterMutex` in shared mode. Thus any thread acquiring the lock in exclusive mode
must, in addition, explicitly arrange for mutator threads to be suspended via the
thread suspension mechanism, and then make them runnable again on release.

Logically the mutator lock is held in shared/reader mode if ***either*** the
underlying reader-writer lock is held in shared mode, ***or*** if a mutator is in
runnable state. These two ways of holding the mutator mutex are ***not***
equivalent: In particular, we rely on the garbage collector never actually
entering a "runnable" state while active (see below). However, it often runs with
the explicit mutator mutex in shared mode, thus blocking others from acquiring it
in exclusive mode.

Suspension and checkpoint API
-----------------------------

Suspend point checks enable three kinds of communication with mutator threads:

**Checkpoints**
: Checkpoint requests are used to get a thread to perform an action
on our behalf. `RequestCheckpoint()` asks a specific thread to execute the closure
supplied as an argument at its leisure. `RequestSynchronousCheckpoint()` in
addition waits for the thread to complete running the closure, and handles
suspended threads by running the closure on their behalf. In addition to these
functions provided by `Thread`, `ThreadList` provides the `RunCheckpoint()` function
that runs a checkpoint function on behalf of each thread, either by using
`RequestCheckpoint()` to run it inside a running thread, or by ensuring that a
suspended thread stays suspended, and then running the function on its behalf.
`RunCheckpoint()` does not wait for completion of the function calls triggered by
the resulting `RequestCheckpoint()` invocations.

**Empty checkpoints**
: `ThreadList` provides `RunEmptyCheckpoint()`, which waits until
all threads have either passed a suspend point, or have been suspended. This
ensures that no thread is still executing Java code inside the same
suspend-point-delimited code interval it was executing before the call. For
example, a read-barrier started before a `RunEmptyCheckpoint()` call will have
finished before the call returns.

**Thread suspension**
: `ThreadList` provides a number of `SuspendThread...()` calls and
a `SuspendAll()` call to suspend one or all threads until they are resumed by
`Resume()` or `ResumeAll()`. The `Suspend...` calls guarantee that the target
thread(s) are suspended (again, only in the sense of not running Java code)
when the call returns.
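
The split between running a checkpoint inside a runnable thread and running it
on behalf of a suspended one can be sketched as follows. This is a
single-threaded toy under stated assumptions: `ToyThread`, `ToyClosure`, and
`ToyRunCheckpoint()` are hypothetical, there is no locking, and the step that
keeps a suspended thread suspended is only indicated by a comment.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

struct ToyThread;
using ToyClosure = std::function<void(ToyThread&)>;

struct ToyThread {
  bool runnable = true;
  int visited = 0;                  // Touched by the example closure.
  std::vector<ToyClosure> pending;  // Checkpoints awaiting a suspend point.

  // Queues `c` only if the thread is runnable, and can thus be expected to
  // reach a suspend point soon. Returns false otherwise.
  bool RequestCheckpoint(ToyClosure c) {
    if (!runnable) return false;
    pending.push_back(std::move(c));
    return true;
  }

  // Called by the thread itself when it reaches a suspend point.
  void SuspendPoint() {
    assert(runnable);
    std::vector<ToyClosure> to_run;
    to_run.swap(pending);
    for (ToyClosure& c : to_run) c(*this);
  }
};

// Modeled on ThreadList::RunCheckpoint(). Returns the number of threads for
// which the closure was run on their behalf. Closures queued on runnable
// threads may not have run yet when this returns.
int ToyRunCheckpoint(std::vector<ToyThread>& threads, const ToyClosure& c) {
  int ran_on_behalf = 0;
  for (ToyThread& t : threads) {
    if (!t.RequestCheckpoint(c)) {
      // Suspended thread: the real code first ensures it stays suspended.
      c(t);
      ++ran_on_behalf;
    }
  }
  return ran_on_behalf;
}
```

In the toy, as in the description above, returning from `ToyRunCheckpoint()`
says nothing about whether runnable threads have executed the closure yet.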

Deadlock freedom
----------------

It is easy to deadlock while attempting to run checkpoints, or suspending
threads. In particular, we need to avoid situations in which we cannot suspend
a thread because it is blocked, directly, or indirectly, on the GC completing
its task. Deadlocks are avoided as follows:

**Mutator lock ordering**
The mutator lock participates in the normal ART lock ordering hierarchy, as though it
were a regular lock. See `base/locks.h` for the hierarchy. In particular, only
locks at or below level `kPostMutatorTopLockLevel` may be acquired after
acquiring the mutator lock, e.g. inside the scope of a `ScopedObjectAccess`.
Similarly only locks at level strictly above `kMutatorLock` may be held while
acquiring the mutator lock, e.g. either by starting a `ScopedObjectAccess`, or
ending a `ScopedThreadSuspension`.

This ensures that code that uses purely mutexes and thread state changes cannot
deadlock: Since we always wait on a lower-level lock, the holder of the
lowest-level lock can always progress. An attempt to initiate a checkpoint or to
suspend another thread must also be treated as an acquisition of the mutator
lock: A thread that is waiting for a lock before it can respond to the request
is itself holding the mutator lock, and can only be blocked on lower-level
locks. And acquisition of those can never depend on acquiring the mutator
lock.
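
The ordering rule amounts to a per-thread check that each newly acquired lock
is strictly below every lock already held. A minimal sketch, assuming invented
level values: the `kToy...` names echo those in the text, but the numbers and
`ToyLockTracker` itself are hypothetical and not taken from `base/locks.h`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical levels. Larger values are higher in the hierarchy.
enum ToyLockLevel : int {
  kToyLowLevelLock = 1,
  kToyPostMutatorTopLockLevel = 5,
  kToyMutatorLock = 6,
  kToyHighLevelLock = 9,
};

// Tracks the lock levels held by a single thread.
class ToyLockTracker {
 public:
  // A lock may be acquired only if it is strictly below every lock held.
  bool CanAcquire(ToyLockLevel level) const {
    for (ToyLockLevel held : held_) {
      if (level >= held) return false;
    }
    return true;
  }

  // Records the acquisition if it respects the hierarchy.
  bool Acquire(ToyLockLevel level) {
    if (!CanAcquire(level)) return false;
    held_.push_back(level);
    return true;
  }

  void Release(ToyLockLevel level) {
    for (auto it = held_.begin(); it != held_.end(); ++it) {
      if (*it == level) {
        held_.erase(it);
        return;
      }
    }
    assert(false && "releasing a lock that is not held");
  }

 private:
  std::vector<ToyLockLevel> held_;
};
```

With this check, a thread holding `kToyLowLevelLock` cannot go on to acquire
`kToyMutatorLock`, which mirrors the requirement that only higher-level locks
be held when the mutator lock is acquired.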

**Checkpoints**
Running a checkpoint in a thread requires suspending that thread for the
duration of the checkpoint, or running the checkpoint on the thread's behalf
while that thread is blocked from executing Java code. In the former case, the
checkpoint code is run from `CheckSuspend`, which requires the mutator lock,
so checkpoint code may only acquire mutexes at or below level
`kPostMutatorTopLockLevel`. But that is not sufficient.

No matter whether the checkpoint is run in the target thread, or on its behalf,
the target thread is effectively suspended and prevented from running Java code.
However the target may hold arbitrary Java monitors, which it can no longer
release. This may also prevent higher level mutexes from getting released.  Thus
checkpoint code should only acquire mutexes at level `kPostMonitorLock` or
below.

**Waiting**
This becomes much more problematic when we wait for something other than a lock.
Waiting for something that may depend on the GC, while holding the mutator lock,
can potentially lead to deadlock, since it will prevent the waiting thread from
participating in GC checkpoints. Waiting while holding a lower-level lock like
`thread_list_lock_` is similarly unsafe in general, since a runnable thread may
not respond to checkpoints until it acquires `thread_list_lock_`. In general,
waiting for a condition variable while holding an unrelated lock is problematic,
and these are specific instances of that general problem.

We do currently provide `WaitHoldingLocks`, and it is sometimes used with
low-level locks held. But such code must somehow ensure that such waits
eventually terminate without deadlock.

One common use of `WaitHoldingLocks` is to wait for weak reference processing.
Special rules apply to avoid deadlocks in this case: Such waits must start after
weak reference processing is disabled; the GC may not issue further nonempty
checkpoints or suspend requests until weak reference processing has been
reenabled, and threads have been notified. Thus the waiting thread's inability
to respond to nonempty checkpoints and suspend requests cannot directly block
the GC. Non-GC checkpoint or suspend requests that target a thread waiting on
reference processing will block until reference processing completes.

Consider a case in which thread W1 waits on reference processing, while holding
a low-level mutex M. Thread W2 holds the mutator lock and waits on M. We avoid a
situation in which the GC needs to suspend or checkpoint W2 by briefly stopping
the world to disable weak reference access. During the stop-the-world phase, W1
cannot yet be waiting for weak-reference access.  Thus there is no danger of
deadlock while entering this phase. After this phase, there is no need for W2 to
suspend or execute a nonempty checkpoint. If we replaced the stop-the-world
phase by a checkpoint, W2 could receive the checkpoint request too late, and be
unable to respond.

Empty checkpoints can continue to occur during reference processing.  Reference
processing wait loops explicitly handle empty checkpoints, and an empty
checkpoint request notifies the condition variable used to wait for reference
processing, after acquiring `reference_processor_lock_`.  This means that empty
checkpoints do not preclude client threads from being in the middle of an
operation that involves a weak reference access, while nonempty checkpoints do.

**Suspending the GC**
Under unusual conditions, the GC can run on any thread. This means that when
thread *A* suspends thread *B* for some other reason, thread *B* might be
running the garbage collector and conceivably thus cause it to block.  This
would be very deadlock prone. If thread *A* allocates while thread *B* is
suspended in the GC, and the allocation requires the GC's help to complete, we
deadlock.

Thus we ensure that the GC, together with anything else that can block GCs,
cannot be blocked for thread suspension requests. This is accomplished by
ensuring that it always appears to be in a suspended thread state. Since we
only check for suspend requests when entering the runnable state, suspend
requests go unnoticed until the GC completes. It may physically acquire and
release the actual `mutator_lock_` in either shared or exclusive mode.

Thread Suspension Mechanics
---------------------------

Thread suspension is initiated by a registered thread, except that, for testing
purposes, `SuspendAll` may be invoked with `self == nullptr`.  We never suspend
the initiating thread, explicitly excluding it from `SuspendAll()`, and failing
`SuspendThreadBy...()` requests to that effect.

The suspend calls invoke `IncrementSuspendCount()` to increment the thread
suspend count for each thread. That adds a "suspend barrier" (atomic counter) to
the per-thread list of such counters to decrement. It normally sets the
`kSuspendRequest` ("should enter safepoint handler") and `kActiveSuspendBarrier`
("need to notify us when suspended") flags.

After setting these two flags, we check whether the thread is suspended and
`kSuspendRequest` is still set. Since the thread is already suspended, it cannot
be expected to respond to "pass the suspend barrier" (decrement the atomic
counter) in a timely fashion.  Hence we do so on its behalf. This decrements
the "barrier" and removes it from the thread's list of barriers to decrement,
and clears `kActiveSuspendBarrier`. `kSuspendRequest` remains to ensure the
thread doesn't prematurely return to runnable state.

If `SuspendAllInternal()` does not immediately see a suspended state, then it is up
to the target thread to decrement the suspend barrier.
`TransitionFromRunnableToSuspended()` calls
`TransitionToSuspendedAndRunCheckpoints()`, which changes the thread state
and returns. `TransitionFromRunnableToSuspended()` then calls
`CheckActiveSuspendBarriers()` to check for the `kActiveSuspendBarrier` flag
and decrement the suspend barrier if set.
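
The two ways a barrier gets passed can be walked through with a toy model. All
names are hypothetical (`ToyThread`, `ToyRequestSuspend()`, the `kToy...` bits),
each thread has at most one pending barrier, no lock is taken, and the atomics
use the default ordering rather than reproducing ART's choices. It is intended
for a single-threaded walk-through only.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: one state bit and two flag bits in a single word.
constexpr uint32_t kToyRunnableBit = 1u << 0;
constexpr uint32_t kToySuspendRequest = 1u << 1;
constexpr uint32_t kToyActiveSuspendBarrier = 1u << 2;

struct ToyThread {
  std::atomic<uint32_t> state_and_flags{kToyRunnableBit};
  std::atomic<int>* barrier = nullptr;  // Toy: at most one pending barrier.
  int suspend_count = 0;

  // Decrements and detaches the pending barrier if the flag is still set.
  // Only the caller that clears the flag performs the decrement.
  void PassActiveSuspendBarrier() {
    uint32_t old = state_and_flags.fetch_and(~kToyActiveSuspendBarrier);
    if ((old & kToyActiveSuspendBarrier) != 0) {
      assert(barrier != nullptr);
      barrier->fetch_sub(1);
      barrier = nullptr;
    }
  }

  // Target side: change state first, then look for a barrier to pass.
  void TransitionFromRunnableToSuspended() {
    state_and_flags.fetch_and(~kToyRunnableBit);
    PassActiveSuspendBarrier();
  }
};

// Requester side: set both flags first, then check whether the target was
// already suspended. If so, pass the barrier on the target's behalf.
void ToyRequestSuspend(ToyThread& target, std::atomic<int>* barrier) {
  ++target.suspend_count;
  target.barrier = barrier;
  uint32_t old = target.state_and_flags.fetch_or(kToySuspendRequest |
                                                 kToyActiveSuspendBarrier);
  if ((old & kToyRunnableBit) == 0) {
    target.PassActiveSuspendBarrier();
  }
}
```

In both paths `kToySuspendRequest` stays set after the barrier is passed, which
corresponds to keeping the thread from returning to runnable state early.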

The `suspend_count_lock_` is not consistently held in the target thread
during this process.  Thus correctness in resolving the race between a
suspension-requesting thread and a target thread voluntarily suspending relies
on first requesting suspension, and then checking whether the target is
already suspended. The detailed correctness argument is given in a comment
inside `SuspendAllInternal()`. This also ensures that the barrier cannot be
decremented after the stack frame holding the barrier goes away.

This relies on the fact that the two stores in the two threads to the state and
`kActiveSuspendBarrier` flag are ordered with respect to the later loads. That's
guaranteed, since they are all stored in a single `atomic<>`. Thus even relaxed
accesses are OK.

The actual suspend barrier representation still varies between `SuspendAll()`
and `SuspendThreadBy...()`.  The former relies on the fact that only one such
barrier can be in use at a time, while the latter maintains a linked list of
active suspend barriers for each target thread, relying on the fact that each
one can appear on the list of only one thread, and we can thus use list nodes
allocated in the stack frames of requesting threads.

**Avoiding suspension cycles**

Any thread can issue a `SuspendThreadByPeer()`, `SuspendThreadByThreadId()` or
`SuspendAll()` request. But if Thread A increments Thread B's suspend count
while Thread B increments Thread A's suspend count, and they then both suspend
during a subsequent thread transition, we're deadlocked.

For single-thread suspension requests, we refuse to initiate
a suspend request from a registered thread that is also being asked to suspend
(i.e. the suspend count is nonzero).  Instead the requestor waits for that
condition to change.  This means that we cannot create a cycle in which each
thread has asked to suspend the next one, and thus no thread can progress.  The
required atomicity of the requestor suspend count check with setting the suspend
count of the target(s) is ensured by holding `suspend_count_lock_`.
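
The refusal rule can be sketched as a check-and-increment performed under one
lock. `ToyThread`, `ToyTrySuspend()`, `ToyResume()`, and the global mutex are
hypothetical stand-ins; where the real requestor waits for its own suspend
count to drop, the toy just reports failure.

```cpp
#include <cassert>
#include <mutex>

struct ToyThread {
  int suspend_count = 0;  // Guarded by g_toy_suspend_count_lock.
};

// Stand-in for suspend_count_lock_.
std::mutex g_toy_suspend_count_lock;

// Returns true if the request was registered. Returns false if `self` is
// itself being asked to suspend; the real requestor would wait and retry.
bool ToyTrySuspend(ToyThread& self, ToyThread& target) {
  assert(&self != &target);  // A thread never suspends itself this way.
  std::lock_guard<std::mutex> lock(g_toy_suspend_count_lock);
  // The check and the increment happen atomically under the lock, so two
  // threads cannot both pass the check and then suspend each other.
  if (self.suspend_count != 0) return false;
  ++target.suspend_count;
  return true;
}

// Undoes one successful ToyTrySuspend() on `target`.
void ToyResume(ToyThread& target) {
  std::lock_guard<std::mutex> lock(g_toy_suspend_count_lock);
  assert(target.suspend_count > 0);
  --target.suspend_count;
}
```

Once A has registered a request against B, B's own attempt to suspend A fails
until A resumes B, so the two-thread cycle described above cannot form.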

For `SuspendAll()`, we enforce a requirement that at most one `SuspendAll()`
request is running at one time. We also set the `kSuspensionImmune` thread flag
to prevent a single thread suspension of a thread currently between
`SuspendAll()` and `ResumeAll()` calls. Thus once a `SuspendAll()` call starts,
it will complete before it can be affected by suspension requests from other
threads.

[^1]: In the most recent versions of ART, compiler-generated code loads through
    the address at `tlsPtr_.suspend_trigger`. A thread suspension is requested
    by setting this to null, triggering a `SIGSEGV`, causing that thread to
    check for GC cooperation requests. The older mechanism instead sets an
    appropriate `ThreadFlag` entry to request suspension or a checkpoint. Note
    that the actual checkpoint function value is set, along with the flag, while
    holding `suspend_count_lock_`. If the target thread notices that a
    checkpoint is requested, it then acquires the `suspend_count_lock_` to read
    the checkpoint function.