This is a living document and at times it will be out of date. It is
intended to articulate how programming in the Go runtime differs from
writing normal Go. It focuses on pervasive concepts rather than
details of particular interfaces.

Scheduler structures
====================

The scheduler manages three types of resources that pervade the
runtime: Gs, Ms, and Ps. It's important to understand these even if
you're not working on the scheduler.

Gs, Ms, Ps
----------

A "G" is simply a goroutine. It's represented by type `g`. When a
goroutine exits, its `g` object is returned to a pool of free `g`s and
can later be reused for some other goroutine.

An "M" is an OS thread that can be executing user Go code, runtime
code, a system call, or be idle. It's represented by type `m`. There
can be any number of Ms at a time since any number of threads may be
blocked in system calls.

Finally, a "P" represents the resources required to execute user Go
code, such as scheduler and memory allocator state. It's represented
by type `p`. There are exactly `GOMAXPROCS` Ps. A P can be thought of
like a CPU in the OS scheduler and the contents of the `p` type like
per-CPU state. This is a good place to put state that needs to be
sharded for efficiency, but doesn't need to be per-thread or
per-goroutine.

The scheduler's job is to match up a G (the code to execute), an M
(where to execute it), and a P (the rights and resources to execute
it). When an M stops executing user Go code, for example by entering a
system call, it returns its P to the idle P pool. In order to resume
executing user Go code, for example on return from a system call, it
must acquire a P from the idle pool.

All `g`, `m`, and `p` objects are heap allocated, but are never freed,
so their memory remains type stable. As a result, the runtime can
avoid write barriers in the depths of the scheduler.
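
The `g`, `m`, and `p` types are internal, but two of the counts described
above can be observed from ordinary Go. The following is a minimal sketch
that uses only the public `runtime` package; the `counts` type and the
`snapshot` function are hypothetical names introduced for illustration.
It reports the number of Ps (the current `GOMAXPROCS` setting) and the
number of goroutines, and does not attempt to count Ms.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// counts is a hypothetical helper type for this sketch.
type counts struct {
	Ps int // number of Ps, i.e. the current GOMAXPROCS setting
	Gs int // number of goroutines that currently exist
}

// snapshot reads both values through the public runtime API.
func snapshot() counts {
	return counts{
		Ps: runtime.GOMAXPROCS(0), // an argument of 0 queries without changing the setting
		Gs: runtime.NumGoroutine(),
	}
}

func main() {
	before := snapshot()

	// Start a few goroutines (Gs) that block until released.
	release := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release
		}()
	}
	during := snapshot()
	close(release)
	wg.Wait()

	fmt.Printf("Ps=%d, Gs before=%d, Gs while blocked=%d\n",
		before.Ps, before.Gs, during.Gs)
}
```

The number of Ps stays fixed while goroutines come and go, which is why
per-P state is a natural place for sharded data.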

`getg()` and `getg().m.curg`
----------------------------

To get the current user `g`, use `getg().m.curg`.

`getg()` alone returns the current `g`, but when executing on the
system or signal stacks, this will return the current M's "g0" or
"gsignal", respectively. This is usually not what you want.

To determine if you're running on the user stack or the system stack,
use `getg() == getg().m.curg`.

Stacks
======

Every non-dead G has a *user stack* associated with it, which is what
user Go code executes on. User stacks start small (e.g., 2K) and grow
or shrink dynamically.

Every M has a *system stack* associated with it (also known as the M's
"g0" stack because it's implemented as a stub G) and, on Unix
platforms, a *signal stack* (also known as the M's "gsignal" stack).
System and signal stacks cannot grow, but are large enough to execute
runtime and cgo code (8K in a pure Go binary; system-allocated in a
cgo binary).

Runtime code often temporarily switches to the system stack using
`systemstack`, `mcall`, or `asmcgocall` to perform tasks that must not
be preempted, that must not grow the user stack, or that switch user
goroutines. Code running on the system stack is implicitly
non-preemptible and the garbage collector does not scan system stacks.
While running on the system stack, the current user stack is not used
for execution.

nosplit functions
-----------------

Most functions start with a prologue that inspects the stack pointer
and the current G's stack bound and calls `morestack` if the stack
needs to grow.

Functions can be marked `//go:nosplit` (or `NOSPLIT` in assembly) to
indicate that they should not get this prologue. This has several
uses:

- Functions that must run on the user stack, but must not call into
  stack growth, for example because this would cause a deadlock, or
  because they have untyped words on the stack.

- Functions that must not be preempted on entry.

- Functions that may run without a valid G. For example, functions
  that run in early runtime start-up, or that may be entered from C
  code such as cgo callbacks or the signal handler.

Splittable functions ensure there's some amount of space on the stack
for nosplit functions to run in and the linker checks that any static
chain of nosplit function calls cannot exceed this bound.

Any function with a `//go:nosplit` annotation should explain why it is
nosplit in its documentation comment.

Error handling and reporting
============================

Errors that can reasonably be recovered from in user code should use
`panic` like usual. However, there are some situations where `panic`
will cause an immediate fatal error, such as when called on the system
stack or when called during `mallocgc`.

Most errors in the runtime are not recoverable. For these, use
`throw`, which dumps the traceback and immediately terminates the
process. In general, `throw` should be passed a string constant to
avoid allocating in perilous situations. By convention, additional
details are printed before `throw` using `print` or `println` and the
messages are prefixed with "runtime:".

For unrecoverable errors where user code is expected to be at fault for the
failure (such as racing map writes), use `fatal`.

For runtime error debugging, it may be useful to run with `GOTRACEBACK=system`
or `GOTRACEBACK=crash`. The output of `panic` and `fatal` is as described by
`GOTRACEBACK`. The output of `throw` always includes runtime frames, metadata
and all goroutines regardless of `GOTRACEBACK` (i.e., equivalent to
`GOTRACEBACK=system`). Whether `throw` crashes or not is still controlled by
`GOTRACEBACK`.

Synchronization
===============

The runtime has multiple synchronization mechanisms.
They differ in
semantics and, in particular, in whether they interact with the
goroutine scheduler or the OS scheduler.

The simplest is `mutex`, which is manipulated using `lock` and
`unlock`. This should be used to protect shared structures for short
periods. Blocking on a `mutex` directly blocks the M, without
interacting with the Go scheduler. This means it is safe to use from
the lowest levels of the runtime, but also prevents any associated G
and P from being rescheduled. `rwmutex` is similar.

For one-shot notifications, use `note`, which provides `notesleep` and
`notewakeup`. Unlike traditional UNIX `sleep`/`wakeup`, `note`s are
race-free, so `notesleep` returns immediately if the `notewakeup` has
already happened. A `note` can be reset after use with `noteclear`,
which must not race with a sleep or wakeup. Like `mutex`, blocking on
a `note` blocks the M. However, there are different ways to sleep on a
`note`: `notesleep` also prevents rescheduling of any associated G and
P, while `notetsleepg` acts like a blocking system call that allows
the P to be reused to run another G. This is still less efficient than
blocking the G directly since it consumes an M.

To interact directly with the goroutine scheduler, use `gopark` and
`goready`. `gopark` parks the current goroutine (putting it in the
"waiting" state and removing it from the scheduler's run queue) and
schedules another goroutine on the current M/P. `goready` puts a
parked goroutine back in the "runnable" state and adds it to the run
queue.
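
The one-shot, race-free semantics of `note` described above can be modeled
in ordinary Go. The following is a user-level sketch only, not how the
runtime implements `note`: the `note` type here and its `wakeup`, `sleep`,
and `clear` methods are hypothetical stand-ins built on a channel. One
important difference is that blocking on a channel parks the goroutine,
whereas the runtime's `note` blocks the M.

```go
package main

import "fmt"

// note is a hypothetical user-level model of a one-shot notification.
// It is not the runtime's note type.
type note struct {
	done chan struct{}
}

func newNote() *note { return &note{done: make(chan struct{})} }

// wakeup signals the note. It must be called at most once between clears.
func (n *note) wakeup() { close(n.done) }

// sleep blocks until wakeup has been called. If wakeup already
// happened, it returns immediately, so there is no lost-wakeup race.
func (n *note) sleep() { <-n.done }

// clear resets the note for reuse. As with noteclear, it must not
// race with a sleep or wakeup.
func (n *note) clear() { n.done = make(chan struct{}) }

func main() {
	n := newNote()

	// Wakeup before sleep: the sleep returns immediately.
	n.wakeup()
	n.sleep()
	fmt.Println("first sleep returned")

	// Reset, then wake up from another goroutine.
	n.clear()
	go n.wakeup()
	n.sleep()
	fmt.Println("second sleep returned")
}
```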

In summary,

<table>
<tr><th></th><th colspan="3">Blocks</th></tr>
<tr><th>Interface</th><th>G</th><th>M</th><th>P</th></tr>
<tr><td>(rw)mutex</td><td>Y</td><td>Y</td><td>Y</td></tr>
<tr><td>note</td><td>Y</td><td>Y</td><td>Y/N</td></tr>
<tr><td>park</td><td>Y</td><td>N</td><td>N</td></tr>
</table>

Atomics
=======

The runtime uses its own atomics package at `internal/runtime/atomic`.
This corresponds to `sync/atomic`, but functions have different names
for historical reasons and there are a few additional functions needed
by the runtime.

In general, we think hard about the uses of atomics in the runtime and
try to avoid unnecessary atomic operations. If access to a variable is
sometimes protected by another synchronization mechanism, the
already-protected accesses generally don't need to be atomic. There
are several reasons for this:

1. Using non-atomic or atomic access where appropriate makes the code
   more self-documenting. Atomic access to a variable implies there's
   somewhere else that may concurrently access the variable.

2. Non-atomic access allows for automatic race detection. The runtime
   doesn't currently have a race detector, but it may in the future.
   Atomic access defeats the race detector, while non-atomic access
   allows the race detector to check your assumptions.

3. Non-atomic access may improve performance.

Of course, any non-atomic access to a shared variable should be
documented to explain how that access is protected.

Some common patterns that mix atomic and non-atomic access are:

* Read-mostly variables where updates are protected by a lock. Within
  the locked region, reads do not need to be atomic, but the write
  does. Outside the locked region, reads need to be atomic.

* Reads that only happen during STW, where no writes can happen during
  STW, do not need to be atomic.
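
The read-mostly pattern from the list above can be sketched in ordinary Go.
This example uses the public `sync` and `sync/atomic` packages rather than
the runtime's internal atomics package, and the `settings` type with its
`bump` and `load` methods is hypothetical. Updates happen under a lock:
inside the lock the read is a plain read and the write is atomic, while
readers outside the lock use an atomic load.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// settings is a hypothetical read-mostly structure.
type settings struct {
	mu sync.Mutex // serializes all updates to limit

	// limit is written atomically with mu held.
	// It may be read non-atomically with mu held, or
	// atomically without mu.
	limit int32
}

// bump increments limit under the lock and returns the new value.
func (s *settings) bump() int32 {
	s.mu.Lock()
	defer s.mu.Unlock()
	old := s.limit                     // plain read: mu excludes every writer
	atomic.StoreInt32(&s.limit, old+1) // atomic write: lock-free readers may run concurrently
	return old + 1
}

// load reads limit without taking the lock, so it must be atomic.
func (s *settings) load() int32 {
	return atomic.LoadInt32(&s.limit)
}

func main() {
	var s settings
	var wg sync.WaitGroup
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				s.bump()
				_ = s.load() // concurrent lock-free read
			}
		}()
	}
	wg.Wait()
	fmt.Println("limit =", s.load()) // prints "limit = 4000"
}
```

The comment on the `limit` field documents how each kind of access is
protected, in the spirit of the advice above about documenting non-atomic
access to shared variables.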

That said, the advice from the Go memory model stands: "Don't be
[too] clever." The performance of the runtime matters, but its
robustness matters more.

Unmanaged memory
================

In general, the runtime tries to use regular heap allocation. However,
in some cases the runtime must allocate objects outside of the garbage
collected heap, in *unmanaged memory*. This is necessary if the
objects are part of the memory manager itself or if they must be
allocated in situations where the caller may not have a P.

There are three mechanisms for allocating unmanaged memory:

* sysAlloc obtains memory directly from the OS. This comes in whole
  multiples of the system page size, but it can be freed with sysFree.

* persistentalloc combines multiple smaller allocations into a single
  sysAlloc to avoid fragmentation. However, there is no way to free
  persistentalloced objects (hence the name).

* fixalloc is a SLAB-style allocator that allocates objects of a fixed
  size. fixalloced objects can be freed, but this memory can only be
  reused by the same fixalloc pool, so it can only be reused for
  objects of the same type.

In general, types that are allocated using any of these should be
marked as not in heap by embedding `runtime/internal/sys.NotInHeap`.

Objects that are allocated in unmanaged memory **must not** contain
heap pointers unless the following rules are also obeyed:

1. Any pointers from unmanaged memory to the heap must be garbage
   collection roots. More specifically, any pointer must either be
   accessible through a global variable or be added as an explicit
   garbage collection root in `runtime.markroot`.

2. If the memory is reused, the heap pointers must be zero-initialized
   before they become visible as GC roots. Otherwise, the GC may
   observe stale heap pointers. See "Zero-initialization versus
   zeroing".
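
The reuse discipline described above (freed objects are reused only by the
same pool, and stale pointers are cleared before a reused object becomes
visible again) can be illustrated with a user-level analogy. This is a
sketch under loud assumptions: `fixedPool`, `alloc`, `release`, and `span`
are hypothetical names, the pool lives on the ordinary garbage-collected
heap, and it is not the runtime's fixalloc.

```go
package main

import "fmt"

// fixedPool is a hypothetical, user-level model of a fixed-size
// allocator. Unlike the runtime's fixalloc, it uses the ordinary
// garbage-collected heap.
type fixedPool[T any] struct {
	free []*T // freed objects that may be reused, by this pool only
}

// alloc returns a zeroed *T, reusing a freed object when one exists.
func (p *fixedPool[T]) alloc() *T {
	if n := len(p.free); n > 0 {
		obj := p.free[n-1]
		p.free[n-1] = nil
		p.free = p.free[:n-1]
		var zero T
		*obj = zero // clear stale contents before the object is visible again
		return obj
	}
	return new(T)
}

// release hands obj back to this pool so a later alloc can reuse it.
func (p *fixedPool[T]) release(obj *T) {
	p.free = append(p.free, obj)
}

// span is a hypothetical object type that contains a pointer.
type span struct {
	next *span
	size int
}

func main() {
	var pool fixedPool[span]

	a := pool.alloc()
	a.size = 64
	a.next = &span{size: 1}
	pool.release(a)

	// The same memory comes back, with the stale pointer cleared.
	b := pool.alloc()
	fmt.Println(a == b, b.size, b.next == nil) // prints "true 0 true"
}
```

Because the pool is typed, a released object can only ever come back as
the same type, which mirrors the constraint stated for fixalloc.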

Zero-initialization versus zeroing
==================================

There are two types of zeroing in the runtime, depending on whether
the memory is already initialized to a type-safe state.

If memory is not in a type-safe state, meaning it potentially contains
"garbage" because it was just allocated and it is being initialized
for first use, then it must be *zero-initialized* using
`memclrNoHeapPointers` or non-pointer writes. This does not perform
write barriers.

If memory is already in a type-safe state and is simply being set to
the zero value, this must be done using regular writes, `typedmemclr`,
or `memclrHasPointers`. This performs write barriers.

Runtime-only compiler directives
================================

In addition to the "//go:" directives documented in "go doc compile",
the compiler supports additional directives only in the runtime.

go:systemstack
--------------

`go:systemstack` indicates that a function must run on the system
stack. This is checked dynamically by a special function prologue.

go:nowritebarrier
-----------------

`go:nowritebarrier` directs the compiler to emit an error if the
following function contains any write barriers. (It *does not*
suppress the generation of write barriers; it is simply an assertion.)

Usually you want `go:nowritebarrierrec`. `go:nowritebarrier` is
primarily useful in situations where it's "nice" not to have write
barriers, but not required for correctness.

go:nowritebarrierrec and go:yeswritebarrierrec
----------------------------------------------

`go:nowritebarrierrec` directs the compiler to emit an error if the
following function or any function it calls recursively, up to a
`go:yeswritebarrierrec`, contains a write barrier.

Logically, the compiler floods the call graph starting from each
`go:nowritebarrierrec` function and produces an error if it encounters
a function containing a write barrier. This flood stops at
`go:yeswritebarrierrec` functions.

`go:nowritebarrierrec` is used in the implementation of the write
barrier to prevent infinite loops.

Both directives are used in the scheduler. The write barrier requires
an active P (`getg().m.p != nil`) and scheduler code often runs
without an active P. In this case, `go:nowritebarrierrec` is used on
functions that release the P or may run without a P and
`go:yeswritebarrierrec` is used when code re-acquires an active P.
Since these are function-level annotations, code that releases or
acquires a P may need to be split across two functions.

go:uintptrkeepalive
-------------------

The //go:uintptrkeepalive directive must be followed by a function declaration.

It specifies that the function's uintptr arguments may be pointer values that
have been converted to uintptr and must be kept alive for the duration of the
call, even though from the types alone it would appear that the object is no
longer needed during the call.

This directive is similar to //go:uintptrescapes, but it does not force
arguments to escape. Since stack growth does not understand these arguments,
this directive must be used with //go:nosplit (in the marked function and all
transitive calls) to prevent stack growth.

The conversion from pointer to uintptr must appear in the argument list of any
call to this function. This directive is used for some low-level system call
implementations.