Mechanisms for Coordination Between Garbage Collector and Mutator
-----------------------------------------------------------------

Most garbage collection work can proceed concurrently with the client or
mutator Java threads. But in certain places, for example while tracing from
thread stacks, the garbage collector needs to ensure that Java data processed
by the collector is consistent and complete. At these points, the mutators
should not hold references to the heap that are invisible to the garbage
collector. And they should not be modifying the data that is visible to the
collector.

Logically, the collector and mutator share a reader-writer lock on the Java
heap and associated data structures. Mutators hold the lock in reader or shared mode
while running Java code or touching heap-related data structures. The collector
holds the lock in writer or exclusive mode while it needs the heap data
structures to be stable. However, this reader-writer lock has a very customized
implementation that also provides additional facilities, such as the ability
to exclude only a single thread, so that we can specifically examine its heap
references.
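
The logical reader-writer discipline described above can be illustrated with a
small try-lock-only model. This is a minimal sketch, not ART's implementation:
`ToyHeapLock` and its methods are hypothetical names, and the real mutator lock
is far more specialized.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical toy model of the logical heap lock (not ART code).
// state_ > 0: number of shared holders (mutators), -1: exclusive holder
// (collector), 0: free.
class ToyHeapLock {
 public:
  // A mutator may enter as long as no exclusive holder is present.
  bool TryLockShared() {
    int32_t cur = state_.load(std::memory_order_relaxed);
    while (cur >= 0) {
      if (state_.compare_exchange_weak(cur, cur + 1,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
        return true;
      }
      // On failure, cur holds the freshly observed value; re-check it.
    }
    return false;
  }

  void UnlockShared() { state_.fetch_sub(1, std::memory_order_release); }

  // The collector may enter only when nobody holds the lock at all.
  bool TryLockExclusive() {
    int32_t expected = 0;
    return state_.compare_exchange_strong(expected, -1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
  }

  void UnlockExclusive() { state_.store(0, std::memory_order_release); }

 private:
  std::atomic<int32_t> state_{0};
};
```

Several mutators can hold the toy lock at once, while an exclusive holder
excludes all of them. The single-thread exclusion facility mentioned above is
deliberately not modeled here.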

In order to ensure consistency of the Java data, the compiler inserts "suspend
points", sometimes also called "safe points", into the code. These allow a thread
to respond to external requests.

Whenever a thread is runnable, i.e. whenever a thread logically holds the
mutator lock in shared mode, it is expected to regularly execute such a suspend
point, and check for pending requests. These requests are currently implemented by
setting a flag in the thread structure[^1], which is then explicitly tested by the
compiler-generated code.

A thread responds to suspend requests only when it is "runnable", i.e. logically
running Java code. When it runs native code, or is blocked in a kernel call, it
logically releases the mutator lock. When the garbage collector needs mutator
cooperation, and the thread is not runnable, it is assured that the mutator is
not touching Java data, and hence the collector can safely perform the required
action itself, on the mutator thread's behalf.

Normally, when a thread makes a JNI call, it is not considered runnable while
executing native code.
This makes the transitions to and from running native JNI
code somewhat expensive (see below). But these transitions are necessary to
ensure that such code, which does not execute "suspend points", and can thus not
cooperate with the GC, doesn't delay GC completion. `@FastNative` and
`@CriticalNative` calls avoid these transitions, instead allowing the thread to
remain "runnable", at the expense of potentially delaying GC operations for the
duration of the call.

Although we say that a thread is "suspended" when it is not running Java code,
it may in fact still be running native code and touching data structures that
are not considered "Java data". This distinction can be a fine line. For
example, a Java thread blocked on a Java monitor will normally be "suspended"
and blocked on a mutex contained in the monitor data structure. But it may wake
up for reasons beyond ART's control, which will normally result in touching the
mutex. The monitor code must be quite careful to ensure that this does not cause
problems, especially if the ART runtime was shut down in the interim and the
monitor data structure has been reclaimed.
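
The flag-polling form of a suspend point described earlier in this section can
be sketched as a cheap test on the fast path with a rarely taken slow path.
All names here (`ToyThread`, `kToyRequestPending`, `SuspendPoint`) are
hypothetical, placing the check at the loop back edge is an assumption made for
illustration, and the toy handler merely clears the flag instead of blocking or
running a checkpoint.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag bit; ART's real flags live in its own thread structure.
constexpr uint32_t kToyRequestPending = 1u << 0;

struct ToyThread {
  std::atomic<uint32_t> flags{0};
  int requests_handled = 0;  // only touched by the owning thread
};

// Slow path: a real runtime would suspend or run a checkpoint here.
// The toy just consumes the request and counts it.
void HandlePendingRequests(ToyThread& self) {
  uint32_t pending = self.flags.exchange(0, std::memory_order_acquire);
  if (pending != 0) {
    ++self.requests_handled;
  }
}

// What a compiler-inserted suspend point boils down to in this model:
// one load and one (almost always untaken) branch.
inline void SuspendPoint(ToyThread& self) {
  if (self.flags.load(std::memory_order_relaxed) != 0) {
    HandlePendingRequests(self);
  }
}

// Stand-in for compiled Java code: a loop that polls once per iteration.
long SumWithSuspendPoints(ToyThread& self, int n) {
  long sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += i;
    SuspendPoint(self);
  }
  return sum;
}
```

Because the loop polls on every iteration, a request posted by another thread
is noticed within one iteration, which is the "regularly execute such a suspend
point" expectation in miniature.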

Calls to change thread state
----------------------------

When a thread changes between running Java and native code, it has to
correspondingly change its state between "runnable" and one of several
other states, all of which are considered to be "suspended" for our purposes.
When a Java thread starts to execute native code, and may thus not respond
promptly to suspend requests, it will normally create an object of type
`ScopedThreadSuspension`. `ScopedThreadSuspension`'s constructor changes state to
the "suspended" state given as an argument, logically releasing the mutator lock
and promising to no longer touch Java data structures. It also handles any
pending suspension requests that slid in just before it changed state.

Conversely, `ScopedThreadSuspension`'s destructor waits until the GC has finished
any actions it is currently performing on the thread's behalf and effectively
released the mutator exclusive lock, and then returns to runnable state,
re-acquiring the mutator lock.
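
The RAII shape of these transitions can be sketched with a toy class. This is
only an illustration under stated assumptions: `ToyThread`, `ToyState` and
`ToyScopedThreadSuspension` are hypothetical, the pending-request handling in
the constructor is omitted, and the destructor ignores the race between its
check and its state change that the real code must resolve atomically.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical, simplified states; ART has many more "suspended" states.
enum class ToyState { kRunnable, kNative, kWaiting };

struct ToyThread {
  std::atomic<ToyState> state{ToyState::kRunnable};
  // True while a (hypothetical) collector acts on this thread's behalf.
  std::atomic<bool> gc_busy_with_thread{false};
};

class ToyScopedThreadSuspension {
 public:
  ToyScopedThreadSuspension(ToyThread& self, ToyState suspended_state)
      : self_(self) {
    assert(suspended_state != ToyState::kRunnable);
    // Logically release the mutator lock: promise not to touch Java data.
    self_.state.store(suspended_state, std::memory_order_release);
  }

  ~ToyScopedThreadSuspension() {
    // Wait until the collector has finished anything it is doing for this
    // thread, then logically re-acquire the mutator lock.
    while (self_.gc_busy_with_thread.load(std::memory_order_acquire)) {
      std::this_thread::yield();
    }
    self_.state.store(ToyState::kRunnable, std::memory_order_release);
  }

  ToyScopedThreadSuspension(const ToyScopedThreadSuspension&) = delete;
  ToyScopedThreadSuspension& operator=(const ToyScopedThreadSuspension&) = delete;

 private:
  ToyThread& self_;
};
```

Tying the state change to a scope means every early return or exception path
restores the runnable state, which is presumably why an RAII object is used for
this purpose.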

Occasionally a thread running native code needs to temporarily access Java
data structures again, performing the above transitions in the opposite order.
`ScopedObjectAccess` is a similar RAII object whose constructor and destructor
perform those transitions in the reverse order from `ScopedThreadSuspension`.

Mutator lock implementation
---------------------------

The mutator lock is not implemented as a conventional mutex. But it plays by the
rules of our normal static thread-safety analysis. Thus a function that is
expected to be called in runnable state, with the ability to access Java data,
should be annotated with `REQUIRES_SHARED(Locks::mutator_lock_)`.

There is an explicit `mutator_lock_` object, of type `MutatorMutex`. `MutatorMutex` is
seemingly a minor refinement of `ReaderWriterMutex`, but it is used entirely
differently. It is acquired explicitly by clients that need to hold it
exclusively, and in a small number of cases, it is acquired in shared mode, e.g.
via `SharedTryLock()`, or by the GC itself.
However, more commonly
`MutatorMutex::TransitionFromSuspendedToRunnable()` is used to logically acquire
the mutator mutex, e.g. as part of `ScopedObjectAccess` construction.

`TransitionFromSuspendedToRunnable()` does not physically acquire the
`ReaderWriterMutex` in shared mode. Thus any thread acquiring the lock in exclusive mode
must, in addition, explicitly arrange for mutator threads to be suspended via the
thread suspension mechanism, and then make them runnable again on release.

Logically the mutator lock is held in shared/reader mode if ***either*** the
underlying reader-writer lock is held in shared mode, ***or*** if a mutator is in
runnable state. These two ways of holding the mutator mutex are ***not***
equivalent: In particular, we rely on the garbage collector never actually
entering a "runnable" state while active (see below). However, it often runs with
the explicit mutator mutex in shared mode, thus blocking others from acquiring it
in exclusive mode.
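
The difference between holding the lock physically and holding it logically can
be made concrete with a small bookkeeping model. This is a sketch under
assumptions, not ART code: `ToyMutatorLock` and its members are hypothetical,
and plain counters stand in for the real mutex and thread states.

```cpp
#include <cassert>

// Hypothetical bookkeeping model (single-threaded, for illustration only).
struct ToyMutatorLock {
  int physical_shared_holders = 0;  // really locked the rw mutex in shared mode
  int runnable_mutators = 0;        // merely in runnable state
  bool physically_exclusive = false;

  // Shared mode is held logically if EITHER condition is true.
  bool LogicallyHeldShared() const {
    return physical_shared_holders > 0 || runnable_mutators > 0;
  }

  // The physical lock only knows about physical holders, so it can be
  // taken exclusively even while mutators are still runnable.
  bool TryLockExclusivePhysically() {
    if (physical_shared_holders > 0 || physically_exclusive) {
      return false;
    }
    physically_exclusive = true;
    return true;
  }

  // The heap is only stable once runnable mutators have also been suspended.
  bool HeapIsStable() const {
    return physically_exclusive && runnable_mutators == 0;
  }
};
```

In this model an exclusive acquirer that skips the suspension step ends up
holding the physical lock while `HeapIsStable()` is still false, which is the
situation the explicit thread suspension step exists to rule out.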

Suspension and checkpoint API
-----------------------------

Suspend point checks enable three kinds of communication with mutator threads:

**Checkpoints**
: Checkpoint requests are used to get a thread to perform an action
on our behalf. `RequestCheckpoint()` asks a specific thread to execute the closure
supplied as an argument at its leisure. `RequestSynchronousCheckpoint()` in
addition waits for the thread to complete running the closure, and handles
suspended threads by running the closure on their behalf. In addition to these
functions provided by `Thread`, `ThreadList` provides the `RunCheckpoint()` function
that runs a checkpoint function on behalf of each thread, either by using
`RequestCheckpoint()` to run it inside a running thread, or by ensuring that a
suspended thread stays suspended, and then running the function on its behalf.
`RunCheckpoint()` does not wait for completion of the function calls triggered by
the resulting `RequestCheckpoint()` invocations.

**Empty checkpoints**
: `ThreadList` provides `RunEmptyCheckpoint()`, which waits until
all threads have either passed a suspend point, or have been suspended. This
ensures that no thread is still executing Java code inside the same
suspend-point-delimited code interval it was executing before the call. For
example, a read-barrier started before a `RunEmptyCheckpoint()` call will have
finished before the call returns.

**Thread suspension**
: `ThreadList` provides a number of `SuspendThread...()` calls and
a `SuspendAll()` call to suspend one or all threads until they are resumed by
`Resume()` or `ResumeAll()`. The `Suspend...` calls guarantee that the target
thread(s) are suspended (again, only in the sense of not running Java code)
when the call returns.

Deadlock freedom
----------------

It is easy to deadlock while attempting to run checkpoints, or suspending
threads.
In particular, we need to avoid situations in which we cannot suspend
a thread because it is blocked, directly or indirectly, on the GC completing
its task. Deadlocks are avoided as follows:

**Mutator lock ordering**
The mutator lock participates in the normal ART lock ordering hierarchy, as though it
were a regular lock. See `base/locks.h` for the hierarchy. In particular, only
locks at or below level `kPostMutatorTopLockLevel` may be acquired after
acquiring the mutator lock, e.g. inside the scope of a `ScopedObjectAccess`.
Similarly only locks at level strictly above `kMutatorLock` may be held while
acquiring the mutator lock, e.g. either by starting a `ScopedObjectAccess`, or
ending a `ScopedThreadSuspension`.

This ensures that code that uses purely mutexes and thread state changes cannot
deadlock: Since we always wait on a lower-level lock, the holder of the
lowest-level lock can always progress.
An attempt to initiate a checkpoint or to
suspend another thread must also be treated as an acquisition of the mutator
lock: A thread that is waiting for a lock before it can respond to the request
is itself holding the mutator lock, and can only be blocked on lower-level
locks. And acquisition of those can never depend on acquiring the mutator
lock.

**Checkpoints**
Running a checkpoint in a thread requires suspending that thread for the
duration of the checkpoint, or running the checkpoint on the thread's behalf
while that thread is blocked from executing Java code. In the former case, the
checkpoint code is run from `CheckSuspend`, which requires the mutator lock,
so checkpoint code may only acquire mutexes at or below level
`kPostMutatorTopLockLevel`. But that is not sufficient.

No matter whether the checkpoint is run in the target thread, or on its behalf,
the target thread is effectively suspended and prevented from running Java code.
However the target may hold arbitrary Java monitors, which it can no longer
release. This may also prevent higher level mutexes from getting released.
Thus
checkpoint code should only acquire mutexes at level `kPostMonitorLock` or
below.

**Waiting**
Deadlock avoidance becomes much more problematic when we wait for something
other than a lock.
Waiting for something that may depend on the GC, while holding the mutator lock,
can potentially lead to deadlock, since it will prevent the waiting thread from
participating in GC checkpoints. Waiting while holding a lower-level lock like
`thread_list_lock_` is similarly unsafe in general, since a runnable thread may
not respond to checkpoints until it acquires `thread_list_lock_`. In general,
waiting for a condition variable while holding an unrelated lock is problematic,
and these are specific instances of that general problem.

We do currently provide `WaitHoldingLocks`, and it is sometimes used with
low-level locks held. But such code must somehow ensure that such waits
eventually terminate without deadlock.

One common use of `WaitHoldingLocks` is to wait for weak reference processing.
Special rules apply to avoid deadlocks in this case: Such waits must start after
weak reference processing is disabled; the GC may not issue further nonempty
checkpoints or suspend requests until weak reference processing has been
reenabled, and threads have been notified. Thus the waiting thread's inability
to respond to nonempty checkpoints and suspend requests cannot directly block
the GC. Non-GC checkpoint or suspend requests that target a thread waiting on
reference processing will block until reference processing completes.

Consider a case in which thread W1 waits on reference processing, while holding
a low-level mutex M. Thread W2 holds the mutator lock and waits on M. We avoid a
situation in which the GC needs to suspend or checkpoint W2 by briefly stopping
the world to disable weak reference access. During the stop-the-world phase, W1
cannot yet be waiting for weak-reference access. Thus there is no danger of
deadlock while entering this phase. After this phase, there is no need for W2 to
suspend or execute a nonempty checkpoint. If we replaced the stop-the-world
phase by a checkpoint, W2 could receive the checkpoint request too late, and be
unable to respond.

Empty checkpoints can continue to occur during reference processing. Reference
processing wait loops explicitly handle empty checkpoints, and an empty
checkpoint request notifies the condition variable used to wait for reference
processing, after acquiring `reference_processor_lock_`. This means that empty
checkpoints do not preclude client threads from being in the middle of an
operation that involves a weak reference access, while nonempty checkpoints do.

**Suspending the GC**
Under unusual conditions, the GC can run on any thread. This means that when
thread *A* suspends thread *B* for some other reason, thread *B* might be
running the garbage collector and conceivably thus cause it to block. This
would be very deadlock prone. If thread *A* allocates while thread *B* is
suspended in the GC, and the allocation requires the GC's help to complete, we
deadlock.

Thus we ensure that the GC, together with anything else that can block GCs,
cannot be blocked for thread suspension requests. This is accomplished by
ensuring that it always appears to be in a suspended thread state.
Since we
only check for suspend requests when entering the runnable state, suspend
requests go unnoticed until the GC completes. The GC may physically acquire and
release the actual `mutator_lock_` in either shared or exclusive mode.

Thread Suspension Mechanics
---------------------------

Thread suspension is initiated by a registered thread, except that, for testing
purposes, `SuspendAll` may be invoked with `self == nullptr`. We never suspend
the initiating thread, explicitly excluding it from `SuspendAll()`, and failing
`SuspendThreadBy...()` requests to that effect.

The suspend calls invoke `IncrementSuspendCount()` to increment the thread
suspend count for each thread. That adds a "suspend barrier" (atomic counter) to
the per-thread list of such counters to decrement. It normally sets the
`kSuspendRequest` ("should enter safepoint handler") and `kActiveSuspendBarrier`
("need to notify us when suspended") flags.

After setting these two flags, we check whether the thread is suspended and
`kSuspendRequest` is still set.
If so, since the thread is already suspended, it cannot
be expected to respond to "pass the suspend barrier" (decrement the atomic
counter) in a timely fashion. Hence we do so on its behalf. This decrements
the "barrier" and removes it from the thread's list of barriers to decrement,
and clears `kActiveSuspendBarrier`. `kSuspendRequest` remains to ensure the
thread doesn't prematurely return to runnable state.

If `SuspendAllInternal()` does not immediately see a suspended state, then it is up
to the target thread to decrement the suspend barrier.
`TransitionFromRunnableToSuspended()` calls
`TransitionToSuspendedAndRunCheckpoints()`, which changes the thread state
and returns. `TransitionFromRunnableToSuspended()` then calls
`CheckActiveSuspendBarriers()` to check for the `kActiveSuspendBarrier` flag
and decrement the suspend barrier if set.

The `suspend_count_lock_` is not consistently held in the target thread
during this process.
Thus correctness in resolving the race between a
suspension-requesting thread and a target thread voluntarily suspending relies
on first requesting suspension, and then checking whether the target is
already suspended. The detailed correctness argument is given in a comment
inside `SuspendAllInternal()`. This also ensures that the barrier cannot be
decremented after the stack frame holding the barrier goes away.

This relies on the fact that the two stores in the two threads to the state and
`kActiveSuspendBarrier` flag are ordered with respect to the later loads. That's
guaranteed, since they are all stored in a single `atomic<>`. Thus even relaxed
accesses are OK.

The actual suspend barrier representation still varies between `SuspendAll()`
and `SuspendThreadBy...()`. The former relies on the fact that only one such
barrier can be in use at a time, while the latter maintains a linked list of
active suspend barriers for each target thread, relying on the fact that each
one can appear on the list of only one thread, and we can thus use list nodes
allocated in the stack frames of requesting threads.
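
The "request first, then check" protocol with state and flags packed into one
atomic word can be sketched as follows. Everything here is a hypothetical toy:
the `kToy...` constants, `ToyThread`, and the three functions are not ART's,
a single barrier pointer replaces the per-thread list, and the toy lets
whichever side clears the barrier flag perform the decrement instead of using
`suspend_count_lock_`. It also uses the default (sequentially consistent)
ordering because, unlike ART, it publishes the barrier pointer without a lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr uint32_t kToySuspendRequest = 1u << 0;
constexpr uint32_t kToyActiveSuspendBarrier = 1u << 1;
constexpr uint32_t kToySuspendedState = 1u << 31;  // clear means runnable

struct ToyThread {
  // State and flags share one atomic word, so all updates to them are
  // totally ordered with respect to each other.
  std::atomic<uint32_t> state_and_flags{0};
  std::atomic<int>* barrier = nullptr;  // written before the flags are set
};

// Decrement the barrier if, and only if, this caller is the one that
// clears kToyActiveSuspendBarrier. Returns true if it did the decrement.
bool ToyPassBarrierIfActive(ToyThread& t) {
  uint32_t old = t.state_and_flags.fetch_and(~kToyActiveSuspendBarrier);
  if ((old & kToyActiveSuspendBarrier) == 0) {
    return false;
  }
  t.barrier->fetch_sub(1);
  return true;
}

// Requester: first request suspension, then check the state it observed.
void ToyRequestSuspend(ToyThread& target, std::atomic<int>& barrier) {
  target.barrier = &barrier;
  uint32_t old = target.state_and_flags.fetch_or(kToySuspendRequest |
                                                 kToyActiveSuspendBarrier);
  if ((old & kToySuspendedState) != 0) {
    // Target was already suspended: pass the barrier on its behalf.
    // kToySuspendRequest stays set so it cannot become runnable yet.
    ToyPassBarrierIfActive(target);
  }
}

// Target: change state first, then check for a barrier to pass.
void ToyTransitionToSuspended(ToyThread& self) {
  self.state_and_flags.fetch_or(kToySuspendedState);
  ToyPassBarrierIfActive(self);
}
```

Whichever order the two read-modify-write operations land in, exactly one side
observes the other's update and performs the single decrement, so the requester
can wait for the counter to reach zero and then safely leave the stack frame
that holds it.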

**Avoiding suspension cycles**

Any thread can issue a `SuspendThreadByPeer()`, `SuspendThreadByThreadId()` or
`SuspendAll()` request. But if Thread A increments Thread B's suspend count
while Thread B increments Thread A's suspend count, and they then both suspend
during a subsequent thread transition, we're deadlocked.

For single-thread suspension requests, we refuse to initiate
a suspend request from a registered thread that is also being asked to suspend
(i.e. the suspend count is nonzero). Instead the requestor waits for that
condition to change. This means that we cannot create a cycle in which each
thread has asked to suspend the next one, and thus no thread can progress. The
required atomicity of the requestor suspend count check with setting the suspend
count of the target(s) is ensured by holding `suspend_count_lock_`.

For `SuspendAll()`, we enforce a requirement that at most one `SuspendAll()`
request is running at one time. We also set the `kSuspensionImmune` thread flag
to prevent a single thread suspension of a thread currently between
`SuspendAll()` and `ResumeAll()` calls.
Thus once a `SuspendAll()` call starts,
it will complete before it can be affected by suspension requests from other
threads.

[^1]: In the most recent versions of ART, compiler-generated code loads through
    the address at `tlsPtr_.suspend_trigger`. A thread suspension is requested
    by setting this to null, triggering a `SIGSEGV`, causing that thread to
    check for GC cooperation requests. The older mechanism instead sets an
    appropriate `ThreadFlag` entry to request suspension or a checkpoint. Note
    that the actual checkpoint function value is set, along with the flag, while
    holding `suspend_count_lock_`. If the target thread notices that a
    checkpoint is requested, it then acquires the `suspend_count_lock_` to read
    the checkpoint function.