Chromium OS Embedded Controller runtime
=======================================

Design principles
-----------------

1. Never do at runtime what you can do at compile time.
   The goal is saving flash space and computations.
   Use compile-time configuration until you really need to switch at runtime.

2. Real-time: guarantee low latency (e.g. < 20 us).
   No interrupt disabling ...
   Bounded code in interrupt handlers.

3. Keep it simple: design for the subset of microcontrollers we use.
   Targeted at 32-bit single-core CPUs,
   for small systems: 4kB to 64kB data RAM, possibly execute-in-place from flash.

Execution contexts
------------------

This is a pre-emptible runtime with static tasks.
It has only 2 possible execution contexts:

- the regular [tasks](#tasks)
- the [interrupt handlers](#interrupts)

The initial startup is an exception as described in the
[dedicated paragraph](#Startup).

### tasks

The tasks are statically defined at compile-time.
They are described for each *board* in the
[board/$board/ec.tasklist](../board/host/ec.tasklist) file.

They also have a static fixed priority implicitly defined at compile-time by
their order in the [ec.tasklist](../board/host/ec.tasklist) file (the top-most
one being the lowest priority aka *task* *1*).
As a consequence, two different tasks cannot have the same priority.

In order to store its context, each task has its own stack whose (*small*) size
is defined at compile-time in the [ec.tasklist](../board/host/ec.tasklist) file.

A task can normally be preempted at any time by either interrupts or higher
priority tasks, see the [preemption section](#scheduling-and-preemption) for
details and the [locking section](#locking-and-atomicity) for the few cases
where you need to avoid it.
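
As an illustration of how a priority can follow from list order alone, here is a minimal sketch using an X-macro. The names (`DEMO_TASK_LIST`, `demo_task_id`, the task names and stack sizes) are hypothetical and do not reproduce the real `ec.tasklist` syntax; ID 0 is assumed to be reserved so that the top-most entry becomes task 1.

```c
#include <stddef.h>

/* Hypothetical task list: the top-most entry has the lowest priority. */
#define DEMO_TASK_LIST(X) \
    X(HOOKS, 1024)        \
    X(CONSOLE, 512)       \
    X(HOSTCMD, 512)

/* Task IDs are generated in list order; ID 0 is assumed to be reserved. */
#define DEMO_TASK_ID(name, stack) DEMO_TASK_ID_##name,
enum demo_task_id {
    DEMO_TASK_ID_RESERVED = 0,
    DEMO_TASK_LIST(DEMO_TASK_ID)
    DEMO_TASK_COUNT
};

/* Per-task stack sizes, fixed at compile time and indexed by task ID. */
#define DEMO_TASK_STACK(name, stack) [DEMO_TASK_ID_##name] = (stack),
static const size_t demo_stack_size[DEMO_TASK_COUNT] = {
    DEMO_TASK_LIST(DEMO_TASK_STACK)
};

/* A higher ID means a higher priority, so two tasks can never tie. */
static int demo_task_preempts(enum demo_task_id a, enum demo_task_id b)
{
    return a > b;
}
```

Because the IDs come from the position in the list, reordering the list is the only way to change a priority in this sketch, and no runtime table of priorities is needed.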

### interrupts

The hardware interrupt requests are connected to the interrupt handling
*C* routines declared by the `DECLARE_IRQ` macros, through some chip/core
specific mechanisms (e.g. depending on whether we have a vectored interrupt
controller, peripheral interrupt controllers...).

The interrupts can be nested (i.e. interrupted by a higher priority interrupt).
All the interrupt vectors are assigned a priority as defined in their
`DECLARE_IRQ` macro. The number of available priority levels is
architecture-specific (e.g. 4 on Cortex-M0, 8 on Cortex-M3/M4) and several
interrupt handlers can have the same priority. An interrupt handler can only be
interrupted by a handler having a priority **strictly** **greater** than
its own.

In most cases, the exceptions (e.g. data/prefetch aborts, software interrupt) can
be seen as interrupts with a priority strictly greater than all IRQ vectors.
So they can interrupt any IRQ handler using the same nesting mechanism.
All fatal exceptions should ultimately lead to a reboot.

### Events

Each task has a *pending* events bitmap[1] implemented as a 32-bit word.
Several events are pre-defined for all tasks, the most significant bits of the
32-bit bitmap are reserved for them: the timer pending event on bit 31
([see the corresponding section](#timer-event)), the requested task wake (bit 29),
the event to kick the waiters on a mutex (bit 30), along with a few hardware
specific events.
The 19 least significant bits are available for task-specific meanings.

Those event bits are used in the inter-task communication and scheduling mechanism:
other tasks **and** interrupt handlers can atomically set them to request
specific actions from the task. Therefore, the presence of pending events in a
task bitmap has an impact on its scheduling as described in the [scheduling
section](#scheduling-and-preemption).
These requests are done using the `task_set_event()` and `task_wake()`
primitives.

The two typical use-cases are:

- a task sends a message to another task (simply using some common memory
  structures, [see explanation](#single-address-space)) and wants it to process
  it now.
- a hardware IRQ occurred and we need to do some long processing to respond to
  it (e.g. an I2C transaction). The associated interrupt handler cannot do it
  (for latency reasons), so it will raise an event to ask a task to do it.

The task code chooses to consume them (or a subset of them) when it's running
through the `task_wait_event()` and `task_wait_event_mask()` primitives.

### Scheduling and preemption

The system has a global bitmap[1] called `tasks_ready` containing one bit
per task and indicating whether or not it is *ready* *to* *run*
(i.e. wants/needs to be scheduled).
A task's ready bit can only be cleared when the task itself calls one of the
functions explicitly triggering a re-scheduling (e.g. `task_wait_event()`
or `task_set_event()`) **and** it has no pending event.
The task ready bit is set by any task or interrupt handler setting an event
bit for the task (i.e. `task_set_event()`).

The scheduling is based on (and *only* on) the `tasks_ready` bitmap
(which is derived from all the events bitmaps of the tasks as explained above).

The scheduling policy is then simply to find the most significant bit set in
the `tasks_ready` bitmap and schedule the corresponding task.

Important note: the re-scheduling happens **only** when we are exiting the interrupt context.
It is done in a non-preemptible context (likely with the highest priority).
Indeed, a re-scheduling is actually needed only when the highest-priority ready task has changed.
There are 3 distinct cases where this can happen:

- an interrupt handler sets a new event for a task.
  In this case, `task_set_event` will detect that it is executed in interrupt
  context and record in the `need_resched_or_profiling` variable that it might
  need to re-schedule at interrupt return. When the current interrupt is going
  to return, it will see this bit and decide to take the slow path, making a new
  scheduling decision and eventually a context switch, instead of the fast path
  returning to the interrupted task.
- a task sets an event on another task.
  The runtime will trigger a software interrupt to force a re-scheduling at its
  exit.
- the running task voluntarily relinquishes its current execution rights by
  calling `task_wait_event()` or a similar function.
  This will call the software interrupt similarly to the previous case.

On the re-scheduling path, if the highest-priority ready task does not match
the currently running one, the runtime performs a context switch by saving all the
processor registers on the current task stack, switching the stack pointer to the
newly scheduled task, and restoring the registers from the previously saved
context there.

### hooks and deferred function

The lowest priority task (i.e. Task 1, aka TASK_ID_HOOKS) is reserved to execute
repetitive actions and future actions deferred in time without blocking the
current task or creating a dedicated task (whose stack memory allocation would
be wasting precious RAM).

The HOOKS task has a list of deferred functions and their next deadline.
Every time it is woken up, it runs through the list and calls the ones whose
deadline has expired. Before going back to sleep, it arms a timer to the closest
deadline.
The deferred functions can be created using the `DECLARE_DEFERRED()` macro.
Similarly the HOOK_SECOND and HOOK_TICK hooks are called periodically by the
HOOKS task loop (the *tick* duration is platform-defined and shorter than
the second).
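
The deferred-function pass described above can be sketched as follows. This is a simplified model under stated assumptions, not the runtime's actual code: `demo_deferred`, `demo_list`, `demo_run_deferred()` and the two example routines are hypothetical names, and the deadlines are plain microsecond counts.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NO_DEADLINE UINT64_MAX

/* Hypothetical deferred entry: a routine and its next deadline. */
struct demo_deferred {
    void (*routine)(void);
    uint64_t deadline_us; /* DEMO_NO_DEADLINE when not armed */
};

static int demo_calls;
static void demo_blink(void) { demo_calls += 1; }
static void demo_poll(void) { demo_calls += 10; }

/* Two entries, pre-armed at 100 us and 500 us for the example. */
static struct demo_deferred demo_list[] = {
    { demo_blink, 100 },
    { demo_poll, 500 },
};
#define DEMO_COUNT (sizeof(demo_list) / sizeof(demo_list[0]))

/*
 * One wake-up of the hooks loop: call every entry whose deadline has
 * expired, then return the closest remaining deadline so the caller
 * can arm its timer before going back to sleep.
 */
static uint64_t demo_run_deferred(uint64_t now_us)
{
    uint64_t next = DEMO_NO_DEADLINE;

    for (size_t i = 0; i < DEMO_COUNT; i++) {
        struct demo_deferred *d = &demo_list[i];

        if (d->deadline_us <= now_us) {
            /* Disarm first so the routine may re-arm itself. */
            d->deadline_us = DEMO_NO_DEADLINE;
            d->routine();
        }
        if (d->deadline_us < next)
            next = d->deadline_us;
    }
    return next;
}
```

All entries share the stack of the single task running the loop, which is the RAM saving the text refers to; the cost is that a slow routine delays every other entry.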

Note: be especially careful about priority inversions when accessing resources
protected by a mutex (e.g. a shared I2C controller) in a deferred function.
Indeed, being the lowest priority task, it might be de-scheduled for a long time
and starve higher priority tasks trying to access the resource, given there is
no priority boosting implemented for this case.
Also be careful about long delays (hundreds of microseconds or more) in hook or
deferred function handlers, since those will starve other hooks of execution
time. It is better to implement a state machine where you set up a subsequent
call to a deferred function than to have a long delay in your handler.

### watchdog

The system is always protected against misbehaving tasks and interrupt handlers
by a hardware watchdog rebooting the CPU when it is not attended.

The watchdog is petted in the HOOKS task, typically by declaring a HOOK_TICK
doing it at regular intervals. Given this is the lowest priority task,
this guarantees that all tasks are getting some run time during the watchdog
period.

Note: that's also why one should not sprinkle one's code with `watchdog_reload()`
to paper over long-running routine issues.

To help debugging bad sequences triggering watchdog reboots, most platforms
implement a warning mechanism defined under `CONFIG_WATCHDOG_HELP`.
It's a timer firing at the middle of the watchdog period if the watchdog hasn't
been petted by then, and dumping the current state of the execution on the
console, mainly to help find a stuck task or handler. Normal execution is
resumed after this alert.

### Startup

The startup sequence goes through the following steps:

- the assembly entry routine clears the .bss (uninitialized data),
  copies the initialized data (and optionally the code if we are not executing
  from flash), sets a stack pointer.
- we can jump to the `main()` C routine at this point.
- then we go through the hardware pre-init (before we have all the clocks to
  run the peripherals normally) and init routines, in this rough order:
  memory protection if any, gpios in their default state,
  prepare the interrupt controller, set the clocks, then timers,
  enable interrupts, init the debug UART and the watchdog.
- finally start tasks.

For the tasks startup, initially only the HOOKS task is marked as ready,
so it is the first to start and can call all the HOOK_INIT handlers performing
initializations before actually executing any real task code.
Then all tasks are marked as ready, and the highest priority one is given
the control.

During the whole startup sequence, until control is given to the first task,
we are using a special stack called 'system stack' which will later be re-used
as the interrupts and exception stack.

To prepare the first context switch, the code in `task_pre_init()` is stuffing
all the task stacks with a *fake* saved context whose program counter
contains the task start address and whose stack pointer points to its
reserved stack space.

### locking and atomicity

The two main concurrency primitives are lightweight atomic variables and
heavier mutexes.

The atomic variables are 32-bit integers (which can usually be loaded/stored
atomically on the architectures we are supporting). The `atomic.h` headers
include primitives to atomically do various bit and arithmetic operations
using either load-linked/load-exclusive, store-conditional/store-exclusive
or simpler mechanisms, depending on what is available.

The mutexes are actually statically allocated binary semaphores.
In case of contention, they will make the waiting task sleep
(removing its ready bit) and use the [event mechanism](#Events) to wake up
the other waiters on unlocking.
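
A binary semaphore with a waiters bitmap can be sketched as below. This is only an illustration: it uses the standard C11 `<stdatomic.h>` operations as a stand-in for the runtime's own `atomic.h` primitives, all `demo_` names are hypothetical, and the actual sleeping and event delivery are left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical statically allocated mutex (a binary semaphore). */
struct demo_mutex {
    atomic_uint lock;    /* 0 = free, 1 = held */
    atomic_uint waiters; /* one bit per task waiting for the lock */
};

static struct demo_mutex demo_i2c_lock; /* zero-initialized: free, no waiter */

/*
 * Try to take the mutex. On contention, record the caller in the
 * waiters bitmap and return 0; a real task would then sleep until
 * it receives the mutex event and retry.
 */
static int demo_mutex_try_lock(struct demo_mutex *m, unsigned task_id)
{
    unsigned expected = 0;

    assert(task_id < 32);
    if (atomic_compare_exchange_strong(&m->lock, &expected, 1)) {
        atomic_fetch_and(&m->waiters, ~(1u << task_id));
        return 1;
    }
    atomic_fetch_or(&m->waiters, 1u << task_id);
    return 0;
}

/*
 * Release the mutex and return the bitmap of tasks to wake up.
 * The lock is released before the waiters are collected, so a task
 * registering itself in between is either collected or finds the
 * lock free.
 */
static unsigned demo_mutex_unlock(struct demo_mutex *m)
{
    atomic_store(&m->lock, 0);
    return atomic_exchange(&m->waiters, 0);
}
```

In this sketch every waiter is woken and retries; the highest priority one runs first and therefore wins the lock, without any priority boosting of the previous owner.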

Note: the mutexes are NOT triggering any priority boosting to avoid the
priority inversion phenomenon.

Given the runtime is running on a single-core CPU, spinlocks would be equivalent
to masking interrupts with `interrupt_disable()`, but this is
strongly discouraged to avoid harming the real-time characteristics of the runtime.

Time
----

### time keeping

In the runtime, the time is accounted everywhere using a
**64-bit** **microsecond** count since the microcontroller **cold** **boot**.

Note: The runtime has no notion of wall-time/date, even though a few platforms have
an RTC inside the microcontroller.

These microsecond timestamps are implemented in the code using the `timestamp_t`
type and the current timestamp is returned by the `get_time()` function.

The time-keeping is preferably implemented using a 32-bit hardware
free running counter at 1 MHz plus a 32-bit word in memory keeping track of
the high word of the 64-bit absolute time. This word is incremented by the
32-bit timer rollover interrupt.

Note: as a consequence of this implementation, when the 64-bit timestamp is read
in interrupt context in a handler having a higher priority than the timer IRQ
(which is somewhat rare), the high 32-bit word might be incoherent (off by one).

### timer event

The runtime offers *one* (and only one) timer per task.
All the task timers are multiplexed on a single hardware timer
(which can be just a *match* *interrupt* on the free running counter mentioned in the
[previous paragraph](#time-keeping)).
Every time a timer is armed or expires, the runtime finds the task timer having
the closest deadline and programs it in the hardware to get an interrupt.
At the same time, it sets the TASK_EVENT_TIMER event in all tasks whose timer
deadline has expired.
The next deadline is computed in interrupt context.
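
The multiplexing of the per-task timers can be sketched as a single pass over a deadline table. The table layout and the `demo_` names are assumptions made for the example; only the use of bit 31 for the timer event comes from the Events section.

```c
#include <stdint.h>

#define DEMO_TASK_COUNT 4
#define DEMO_EVENT_TIMER (1u << 31) /* timer event lives on bit 31 */
#define DEMO_NO_DEADLINE UINT64_MAX

/* One deadline per task (hypothetical values, in microseconds). */
static uint64_t demo_deadline[DEMO_TASK_COUNT] = {
    DEMO_NO_DEADLINE, 300, 100, 200,
};

/* Pending events bitmap of each task. */
static uint32_t demo_events[DEMO_TASK_COUNT];

/*
 * Called when a timer is armed or when the hardware timer fires:
 * flag every task whose deadline has expired and return the closest
 * remaining deadline, which would be programmed in the hardware timer.
 */
static uint64_t demo_process_timers(uint64_t now_us)
{
    uint64_t next = DEMO_NO_DEADLINE;

    for (int id = 0; id < DEMO_TASK_COUNT; id++) {
        if (demo_deadline[id] <= now_us) {
            demo_deadline[id] = DEMO_NO_DEADLINE;
            demo_events[id] |= DEMO_EVENT_TIMER;
        } else if (demo_deadline[id] < next) {
            next = demo_deadline[id];
        }
    }
    return next;
}
```

The pass is linear in the number of tasks, which stays bounded and small since the task list is fixed at compile time; this matters because the computation runs in interrupt context.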

Note: given each task has a **single** timer which is also used to wake up the
task when `task_wait_event()` is called with a timeout, one needs to be careful
when directly using the `timer_arm()` function: if this timer is still running
on the next `task_wait_event()` call, that call will fail due to the lack of an
available timer.

Memory
------

### Single address space

There is no memory isolation between tasks (i.e. they all live in the same address
space). Some architectures implement a memory protection mechanism, albeit only to
differentiate executable areas (e.g. `.code`) from writable areas (e.g. `.bss` or
`.data`) as there is a **single** **privilege** level for all execution contexts.

As all the memory is implicitly shared between the tasks, the inter-task
communication can be done by simply writing the data structures in memory
and using events to wake the other task (provided we have properly thought
through the concurrent accesses to those structures).

### heap

Data structures should be statically allocated at compile time.

Note: there is no dynamic allocator available (e.g. `malloc()`), not because it
is impossible to create one but to avoid the negative side effects of
having one: i.e. poor/unpredictable real-time behavior and possible leaks
leading to a long-tail of failures.

- TODO: talk about shared memory
- TODO: where/how we store *panic* *memory* and *sysjump* *parameters*.

### stacks

Each task has its own stack; in addition there is a system stack used for
startup and interrupts/exceptions.

Note 1: Each task stack is relatively small (e.g. 512 bytes), so one needs to
be careful about stack usage when implementing features.

Note 2: At the same time, the total size of RAM used by stacks is a big chunk
of the total RAM consumption, so their sizes need to be carefully tuned.
(please refer to the [debugging paragraph](#debugging) for additional input on
this topic).

## Firmware code organization and multiple copies

- TODO: Detail the classical RO / RW partitions and how we sysjump.

power management
----------------

- TODO: talk about the idle task + WFI (note: interrupts are disabled!)
- TODO: more about low power idle and the sleep-disable bitmap
- TODO: adjusting the microsecond timer at wake-up

debugging
---------

- TODO: our main tool: serial console ...
  (but non-blocking / discard overflow, cflush DO/DONT)
- TODO: else JTAG stop and go: careful with watchdog and timer
- TODO: panics and software panics
- TODO: stack size tuning and canarying

- TODO: Address the rest of the comments from https://crrev.com/c/445941

[1]: bitmap: array of bits.