Chromium OS Embedded Controller runtime
=======================================

Design principles
-----------------

  1. Never do at runtime what you can do at compile time.
The goal is to save flash space and computation.
Use compile-time configuration until you really need to switch at runtime.

  2. Real-time: guarantee low latency (e.g. < 20 us):
no interrupt disabling ...
bounded code in interrupt handlers.

  3. Keep it simple: design for the subset of microcontrollers we use,
targeted at 32-bit single-core CPUs
for small systems: 4 kB to 64 kB of data RAM, possibly executing in place from
flash.

Execution contexts
------------------

This is a pre-emptible runtime with static tasks.
It has only 2 possible execution contexts:

- the regular [tasks](#tasks)
- the [interrupt handlers](#interrupts)

The initial startup is an exception as described in the
[dedicated paragraph](#Startup).

### tasks

The tasks are statically defined at compile-time.
They are described for each *board* in the
[board/$board/ec.tasklist](../board/host/ec.tasklist) file.

They also have a static fixed priority implicitly defined at compile-time by
their order in the [ec.tasklist](../board/host/ec.tasklist) file (the top-most
one being the lowest priority aka *task* *1*).
As a consequence, two different tasks cannot have the same priority.

In order to store its context, each task has its own stack whose (*small*) size
is defined at compile-time in the [ec.tasklist](../board/host/ec.tasklist) file.

A task can normally be preempted at any time by either interrupts or higher
priority tasks, see the [preemption section](#scheduling-and-preemption) for
details and the [locking section](#locking-and-atomicity) for the few cases
where you need to avoid it.

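As an illustration of a task list fixed at compile time, here is a minimal
sketch using the X-macro technique. It is not the actual `ec.tasklist` syntax:
`SKETCH_TASK_LIST`, the task names, the entry points and the stack sizes are
all hypothetical. It shows how a single list can yield the task IDs (and thus
the priorities, by order), the per-task stacks and a descriptor table without
any runtime registration. ID 0 is left to a hypothetical idle task so that the
first listed task is task 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task list: the first entry has the lowest priority. */
#define SKETCH_TASK_LIST(X)            \
	X(HOOKS, hook_task, 1024)      \
	X(CONSOLE, console_task, 512)  \
	X(KEYSCAN, keyscan_task, 512)

/* Empty entry points so that the sketch builds on its own. */
void hook_task(void) {}
void console_task(void) {}
void keyscan_task(void) {}

/* Task IDs double as priorities: a higher ID means a higher priority. */
#define X_ID(name, entry, size) TASK_ID_##name,
enum task_id { TASK_ID_IDLE, SKETCH_TASK_LIST(X_ID) TASK_ID_COUNT };

/* One statically sized stack per task, reserved at link time. */
#define X_STACK(name, entry, size) static uint8_t stack_##name[size];
SKETCH_TASK_LIST(X_STACK)

struct task_desc {
	void (*entry)(void);
	uint8_t *stack;
	size_t stack_size;
};

/* Descriptor table indexed by task ID, built entirely by the compiler. */
#define X_DESC(name, entry, size) \
	[TASK_ID_##name] = { entry, stack_##name, size },
static const struct task_desc task_table[] = { SKETCH_TASK_LIST(X_DESC) };

/* Compile-time check that the table and the ID enum stay in sync. */
static_assert(sizeof(task_table) / sizeof(task_table[0]) ==
		      (size_t)TASK_ID_COUNT,
	      "task table and task IDs are out of sync");

/* Look up the stack size of a listed task. */
size_t task_stack_size(enum task_id id)
{
	assert(id > TASK_ID_IDLE && id < TASK_ID_COUNT);
	return task_table[id].stack_size;
}
```

Adding a task in this sketch is a one-line change to the list; its priority
follows from its position, so two tasks can never share a priority.
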
### interrupts

The hardware interrupt requests are connected to the interrupt handling
*C* routines declared by the `DECLARE_IRQ` macros, through some chip/core
specific mechanisms (e.g. depending on whether we have a vectored interrupt
controller, peripheral interrupt controllers...)

The interrupts can be nested (i.e. interrupted by a higher priority interrupt).
All the interrupt vectors are assigned a priority as defined in their
`DECLARE_IRQ` macro. The number of available priority levels is
architecture-specific (e.g. 4 on Cortex-M0, 8 on Cortex-M3/M4) and several
interrupt handlers can have the same priority. An interrupt handler can only be
interrupted by a handler having a priority **strictly** **greater** than
its own.

In most cases, the exceptions (e.g. data/prefetch aborts, software interrupt)
can be seen as interrupts with a priority strictly greater than all IRQ vectors.
So they can interrupt any IRQ handler using the same nesting mechanism.
All fatal exceptions should ultimately lead to a reboot.

### Events

Each task has a *pending* events bitmap[1] implemented as a 32-bit word.
Several events are pre-defined for all tasks, and the most significant bits of
the 32-bit bitmap are reserved for them: the timer pending event on bit 31
([see the corresponding section](#timer-event)), the requested task wake
(bit 29), the event to kick the waiters on a mutex (bit 30), along with a few
hardware-specific events.
The 19 least significant bits are available for task-specific meanings.

Those event bits are used in the inter-task communication and scheduling
mechanisms: other tasks **and** interrupt handlers can atomically set them to
request specific actions from the task. Therefore, the presence of pending
events in a task bitmap has an impact on its scheduling as described in the
[scheduling section](#scheduling-and-preemption).
These requests are done using the `task_set_event()` and `task_wake()`
primitives.

The two typical use-cases are:

- a task sends a message to another task (simply using some common memory
  structures, [see explanation](#single-address-space)) and wants it to process
  it now.
- a hardware IRQ occurred and we need to do some long processing to respond to
  it (e.g. an I2C transaction). The associated interrupt handler cannot do it
  (for latency reasons), so it will raise an event to ask a task to do it.

The task code chooses to consume them (or a subset of them) when it's running
through the `task_wait_event()` and `task_wait_event_mask()` primitives.

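The following is a minimal sketch of the pending-events bitmap, assuming C11
`<stdatomic.h>` as a stand-in for the runtime's own atomic primitives. The
bit positions come from the text above; the `EVT_*` macros and the `sketch_*`
functions are hypothetical names and do not reproduce the real
`task_set_event()` or `task_wait_event_mask()` (in particular, nothing here
blocks or reschedules).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_TASK_COUNT 4

/* Bit layout as described in the text; macro names are hypothetical. */
#define EVT_TIMER (UINT32_C(1) << 31)
#define EVT_MUTEX (UINT32_C(1) << 30)
#define EVT_WAKE (UINT32_C(1) << 29)
#define EVT_CUSTOM(n) (UINT32_C(1) << (n)) /* n in 0..18 */

/* One 32-bit pending-events word per task. */
static _Atomic uint32_t pending_events[SKETCH_TASK_COUNT];

/*
 * Post events to a task. The atomic OR makes this usable from both task
 * and interrupt context. Returns the events that were already pending.
 */
uint32_t sketch_set_event(int task, uint32_t events)
{
	assert(task >= 0 && task < SKETCH_TASK_COUNT);
	return atomic_fetch_or(&pending_events[task], events);
}

/*
 * Consume the pending events selected by `mask` and return them.
 * Events outside the mask stay pending for a later call.
 */
uint32_t sketch_take_events(int task, uint32_t mask)
{
	assert(task >= 0 && task < SKETCH_TASK_COUNT);
	return atomic_fetch_and(&pending_events[task], ~mask) & mask;
}
```

An interrupt handler would call `sketch_set_event(task, EVT_CUSTOM(0))` and
return quickly; the task later picks the bit up with `sketch_take_events()`
and performs the long processing in task context.
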
### Scheduling and preemption

The system has a global bitmap[1] called `tasks_ready` containing one bit
per task and indicating whether or not it is *ready* *to* *run*
(i.e. wants/needs to be scheduled).
The ready bit of a task can only be cleared when the task itself calls one of
the functions explicitly triggering a re-scheduling (e.g. `task_wait_event()`
or `task_set_event()`) **and** it has no pending event.
The task ready bit is set by any task or interrupt handler setting an event
bit for the task (i.e. `task_set_event()`).

The scheduling is based on (and *only* on) the `tasks_ready` bitmap
(which is derived from all the events bitmaps of the tasks as explained above).

Then, the scheduling policy to find which task should run is just finding the
most significant bit set in the `tasks_ready` bitmap and scheduling the
corresponding task.

Important note: the re-scheduling happens **only** when we are exiting the
interrupt context.
It is done in a non-preemptible context (likely with the highest priority).
Indeed, a re-scheduling is actually needed only when the highest-priority ready
task has changed.
There are 3 distinct cases where this can happen:

- an interrupt handler sets a new event for a task.
  In this case, `task_set_event` will detect that it is executed in interrupt
  context and record in the `need_resched_or_profiling` variable that it might
  need to re-schedule at interrupt return. When the current interrupt is going
  to return, it will see this bit and decide to take the slow path making a new
  scheduling decision and eventually a context switch instead of the fast path
  returning to the interrupted task.
- a task sets an event on another task.
  The runtime will trigger a software interrupt to force a re-scheduling at its
  exit.
- the running task voluntarily relinquishes its current execution rights by
  calling `task_wait_event()` or a similar function.
  This will call the software interrupt similarly to the previous case.

On the re-scheduling path, if the highest-priority ready task does not match
the currently running one, it will perform a context-switch by saving all the
processor registers on the current task stack, switching the stack pointer to
the newly scheduled task, and restoring the registers from the previously saved
context from there.

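The scheduling policy described above can be sketched as pure functions over
the ready bitmap. This is only an illustration: the function names are
hypothetical, the bitmap is passed as a parameter rather than read from the
global `tasks_ready`, and a real port would likely use a count-leading-zeros
instruction instead of the portable loop shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the most significant set bit, or -1 when no bit is set. */
int highest_ready_task(uint32_t ready)
{
	int id = -1;

	while (ready) {
		ready >>= 1;
		id++;
	}
	return id;
}

/* Setting an event for a task also marks it ready (bit `task`). */
uint32_t mark_ready(uint32_t ready, int task)
{
	assert(task >= 0 && task < 32);
	return ready | (UINT32_C(1) << task);
}

/*
 * Scheduling decision taken when leaving interrupt context: returns the
 * task that should run next. A context switch is only needed when the
 * result differs from `current`.
 */
int pick_next_task(int current, uint32_t ready)
{
	int next = highest_ready_task(ready);

	assert(current >= 0 && current < 32);
	return next < 0 ? current : next;
}
```

With tasks 1 and 3 ready, `pick_next_task(1, ...)` selects task 3 and a
context switch follows; if task 3 is already running, the same call returns 3
and the fast path can be taken.
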
### hooks and deferred function

The lowest priority task (i.e. Task 1, aka TASK_ID_HOOKS) is reserved to execute
repetitive actions and future actions deferred in time without blocking the
current task or creating a dedicated task (whose stack memory allocation would
be wasting precious RAM).

The HOOKS task has a list of deferred functions and their next deadline.
Every time it is woken up, it runs through the list and calls the ones whose
deadline has expired. Before going back to sleep, it arms a timer to the closest
deadline.
The deferred functions can be created using the `DECLARE_DEFERRED()` macro.
Similarly the HOOK_SECOND and HOOK_TICK hooks are called periodically by the
HOOKS task loop (the *tick* duration is platform-defined and shorter than
the second).

Note: be especially careful about priority inversions when accessing resources
protected by a mutex (e.g. a shared I2C controller) in a deferred function.
Indeed, being the lowest priority task, it might be de-scheduled for a long time
and starve higher priority tasks trying to access the resource, given there is
no priority boosting implemented for this case.
Also be careful about long delays (hundreds of microseconds or more) in hook or
deferred function handlers, since those will starve other hooks of execution
time. It is better to implement a state machine where you set up a subsequent
call to a deferred function than to have a long delay in your handler.

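The deadline walk performed by the HOOKS task can be sketched as follows.
This is a simplified model under stated assumptions: the `deferred_entry`
structure, `run_deferred()` and the `NO_DEADLINE` marker are hypothetical
names, the list is a plain array, and the current time is passed in rather
than read from the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_DEADLINE UINT64_MAX /* entry is not scheduled */

struct deferred_entry {
	void (*routine)(void);
	uint64_t deadline_us;
};

/* Example deferred routine, counting how often it ran. */
int sketch_calls;
void sketch_routine(void)
{
	sketch_calls++;
}

/*
 * Call every entry whose deadline has expired, then return the closest
 * remaining deadline (NO_DEADLINE if nothing is pending) so that the
 * caller can arm its timer before going back to sleep.
 */
uint64_t run_deferred(struct deferred_entry *list, size_t count,
		      uint64_t now_us)
{
	uint64_t next = NO_DEADLINE;

	assert(list != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (list[i].deadline_us != NO_DEADLINE &&
		    list[i].deadline_us <= now_us) {
			/* Disarm first so the routine may re-arm itself. */
			list[i].deadline_us = NO_DEADLINE;
			list[i].routine();
		}
		if (list[i].deadline_us < next)
			next = list[i].deadline_us;
	}
	return next;
}
```

A handler that needs to wait can set its own entry's deadline again before
returning, which gives the state-machine style recommended above instead of a
long delay inside the handler.
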
### watchdog

The system is always protected against misbehaving tasks and interrupt handlers
by a hardware watchdog rebooting the CPU when it is not attended.

The watchdog is petted in the HOOKS task, typically by declaring a HOOK_TICK
doing it at regular intervals. Given this is the lowest priority task,
this guarantees that all tasks are getting some run time during the watchdog
period.

Note: that's also why one should not sprinkle one's code with
`watchdog_reload()` to paper over long-running routine issues.

To help debugging bad sequences triggering watchdog reboots, most platforms
implement a warning mechanism defined under `CONFIG_WATCHDOG_HELP`.
It's a timer firing at the middle of the watchdog period if the watchdog hasn't
been petted by then, and dumping on the console the current state of the
execution, mainly to help finding a stuck task or handler. The normal execution
is resumed after this alert, though.

### Startup

The startup sequence goes through the following steps:

- the assembly entry routine clears the .bss (uninitialized data),
  copies the initialized data (and optionally the code if we are not executing
  from flash), and sets a stack pointer.
- we can jump to the `main()` C routine at this point.
- then we go through the hardware pre-init (before we have all the clocks to
  run the peripherals normally) and init routines, in this rough order:
  memory protection if any, gpios in their default state,
  prepare the interrupt controller, set the clocks, then timers,
  enable interrupts, init the debug UART and the watchdog.
- finally start tasks.

For the tasks startup, initially only the HOOKS task is marked as ready,
so it is the first to start and can call all the HOOK_INIT handlers performing
initializations before actually executing any real task code.
Then all tasks are marked as ready, and the highest priority one is given
the control.

During all the startup sequence until the control is given to the first task,
we are using a special stack called 'system stack' which will be later re-used
as the interrupts and exception stack.

To prepare the first context switch, the code in `task_pre_init()` is stuffing
all the tasks stacks with a *fake* saved context whose program counter is
containing the task start address and the stack pointer is pointing to its
reserved stack space.

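To make the *fake* saved context more concrete, here is a minimal sketch of
stuffing one task stack. The frame layout is an assumption modeled loosely on
a Cortex-M style context (eight software-saved registers below an
eight-word exception frame); the `CTX_*` constants and `init_task_stack()`
are hypothetical and do not reproduce the actual `task_pre_init()` code,
whose layout is architecture specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed frame, lowest address first: r4-r11, r0-r3, r12, lr, pc, psr. */
#define CTX_WORDS 16
#define CTX_PC 14
#define CTX_PSR 15
#define PSR_THUMB (UINT32_C(1) << 24) /* Thumb state bit on Cortex-M */

/* Placeholder task entry point so that the sketch builds on its own. */
void sketch_task_entry(void) {}

/*
 * Write a fake saved context at the top of a task stack and return the
 * initial stack pointer. Restoring this context on the first context
 * switch makes the CPU "return" into the task entry point.
 */
uintptr_t *init_task_stack(uintptr_t *stack, size_t words,
			   void (*entry)(void))
{
	uintptr_t *sp;

	assert(stack != NULL && entry != NULL && words >= CTX_WORDS);
	sp = stack + words - CTX_WORDS; /* stacks grow downwards */
	for (size_t i = 0; i < CTX_WORDS; i++)
		sp[i] = 0;
	sp[CTX_PC] = (uintptr_t)entry;
	sp[CTX_PSR] = PSR_THUMB;
	return sp;
}
```

The returned pointer is what the scheduler would record as the saved stack
pointer of the task, so the very first switch to it looks like any other
context restore.
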
### locking and atomicity

The two main concurrency primitives are lightweight atomic variables and
heavier mutexes.

The atomic variables are 32-bit integers (which can usually be loaded/stored
atomically on the architectures we are supporting). The `atomic.h` headers
include primitives to atomically perform various bit and arithmetic operations
using either load-linked/load-exclusive, store-conditional/store-exclusive
or simpler means, depending on what is available.

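The retry pattern behind load-exclusive/store-exclusive primitives can be
sketched portably with a C11 compare-and-swap loop. This is an illustration
only: the `sketch_atomic_*` functions are hypothetical and are not the
functions provided by the runtime's `atomic.h`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Atomically set `bits` in *addr and return the previous value.
 * The loop retries when another context modified the word between the
 * load and the store, as an exclusive store failure would.
 */
uint32_t sketch_atomic_or(_Atomic uint32_t *addr, uint32_t bits)
{
	uint32_t old;

	assert(addr != NULL);
	old = atomic_load(addr);
	/* On failure, `old` is refreshed with the current value. */
	while (!atomic_compare_exchange_weak(addr, &old, old | bits))
		;
	return old;
}

/* Atomically clear `bits` in *addr and return the previous value. */
uint32_t sketch_atomic_clear_bits(_Atomic uint32_t *addr, uint32_t bits)
{
	uint32_t old;

	assert(addr != NULL);
	old = atomic_load(addr);
	while (!atomic_compare_exchange_weak(addr, &old, old & ~bits))
		;
	return old;
}
```
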
The mutexes are actually statically allocated binary semaphores.
In case of contention, they will make the waiting task sleep
(removing its ready bit) and use the [event mechanism](#Events) to wake up
the other waiters on unlocking.

Note: the mutexes are NOT triggering any priority boosting to avoid the
priority inversion phenomenon.

Given the runtime is running on a single-core CPU, spinlocks would be
equivalent to masking interrupts with `interrupt_disable()`, but doing so is
strongly discouraged to avoid harming the real-time characteristics of the
runtime.

Time
----

### time keeping

In the runtime, the time is accounted everywhere using a
**64-bit** **microsecond** count since the microcontroller **cold** **boot**.

Note: The runtime has no notion of wall-time/date, even though a few platforms
have an RTC inside the microcontroller.

These microsecond timestamps are implemented in the code using the `timestamp_t`
type and the current timestamp is returned by the `get_time()` function.

The time-keeping is preferably implemented using a 32-bit hardware
free-running counter at 1 MHz plus a 32-bit word in memory keeping track of
the high word of the 64-bit absolute time. This word is incremented by the
32-bit timer rollover interrupt.

Note: as a consequence of this implementation, when the 64-bit timestamp is read
in interrupt context in a handler having a higher priority than the timer IRQ
(which is somewhat rare), the high 32-bit word might be incoherent (off by one).

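A minimal sketch of this split 64-bit timestamp follows, assuming the
hardware counter and the high word can be modeled as two variables. The
`sketch_*` names are hypothetical and the real `get_time()` may differ; the
point is the re-read of the high word, which catches a rollover happening
between the two reads.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the hardware and runtime state (hypothetical names). */
volatile uint32_t sketch_hw_counter; /* free-running 1 MHz counter */
volatile uint32_t sketch_time_high; /* high word, bumped on rollover */

/* Rollover interrupt handler: the 32-bit counter wrapped around. */
void sketch_rollover_irq(void)
{
	sketch_time_high++;
}

/*
 * Combine both halves into a 64-bit microsecond timestamp. If the high
 * word changed while reading, a rollover was just handled and both
 * halves are read again.
 */
uint64_t sketch_get_time(void)
{
	uint32_t high, low;

	do {
		high = sketch_time_high;
		low = sketch_hw_counter;
	} while (high != sketch_time_high);

	return ((uint64_t)high << 32) | low;
}
```

The re-read only helps when the rollover interrupt is able to run. In a
handler with a higher priority than the timer IRQ the high word cannot be
updated in time, which is the off-by-one case mentioned in the note above.
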
### timer event

The runtime offers *one* (and only one) timer per task.
All the task timers are multiplexed on a single hardware timer
(which can be just a *match* *interrupt* on the free-running counter mentioned
in the [previous paragraph](#time-keeping)).
Every time a timer is armed or expires, the runtime finds the task timer having
the closest deadline and programs it in the hardware to get an interrupt.
At the same time, it sets the TASK_EVENT_TIMER event in all tasks whose timer
deadline has expired.
The next deadline is computed in interrupt context.

Note: each task has a **single** timer, which is also used to wake up the
task when `task_wait_event()` is called with a timeout. One therefore needs to
be careful when using the `timer_arm()` function directly: if that timer is
still running at the next `task_wait_event()` call, the call will fail due to
the lack of an available timer.

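The multiplexing of per-task timers and the single-timer limitation from the
note above can be modeled as follows. This is a simplified sketch:
`task_timer_state`, `sketch_timer_arm()` and `sketch_process_timers()` are
hypothetical names, the error value is arbitrary, and programming the
hardware match interrupt is reduced to returning the next deadline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EVT_TIMER (UINT32_C(1) << 31) /* timer pending event, bit 31 */
#define NO_DEADLINE UINT64_MAX /* the task timer is free */

struct task_timer_state {
	uint64_t deadline_us;
	uint32_t events;
};

/* Arm the single timer of a task; fails if it is already running. */
int sketch_timer_arm(struct task_timer_state *t, uint64_t deadline_us)
{
	assert(t != NULL && deadline_us != NO_DEADLINE);
	if (t->deadline_us != NO_DEADLINE)
		return -1; /* only one timer per task */
	t->deadline_us = deadline_us;
	return 0;
}

/*
 * Flag every task whose timer has expired and return the closest
 * remaining deadline, i.e. the value to program in the hardware timer.
 */
uint64_t sketch_process_timers(struct task_timer_state *tasks, size_t count,
			       uint64_t now_us)
{
	uint64_t next = NO_DEADLINE;

	assert(tasks != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (tasks[i].deadline_us == NO_DEADLINE)
			continue;
		if (tasks[i].deadline_us <= now_us) {
			tasks[i].events |= EVT_TIMER;
			tasks[i].deadline_us = NO_DEADLINE;
		} else if (tasks[i].deadline_us < next) {
			next = tasks[i].deadline_us;
		}
	}
	return next;
}
```
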
Memory
------

### Single address space

There is no memory isolation between tasks (i.e. they all live in the same
address space). Some architectures implement a memory protection mechanism,
albeit only to differentiate executable areas (e.g. `.code`) from writable
areas (e.g. `.bss` or `.data`) as there is a **single** **privilege** level for
all execution contexts.

As all the memory is implicitly shared between the tasks, the inter-task
communication can be done by simply writing the data structures in memory
and using events to wake the other task (provided we have properly thought
through the concurrent accesses to those structures).

### heap

Data structures should be statically allocated at compile time.

Note: there is no dynamic allocator available (e.g. `malloc()`), not due to
impossibility to create one but to avoid the negative side effects of
having one: i.e. poor/unpredictable real-time behavior and possible leaks
leading to a long-tail of failures.

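As an example of replacing dynamic allocation with static storage, here is a
minimal sketch of a fixed-capacity byte queue whose memory is reserved at link
time. All names (`msg_queue`, `MSG_QUEUE_DEPTH`, ...) are hypothetical, and
the sketch deliberately ignores concurrent access, which real shared
structures must handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_QUEUE_DEPTH 8

/* The index arithmetic below relies on a power-of-two depth. */
static_assert((MSG_QUEUE_DEPTH & (MSG_QUEUE_DEPTH - 1)) == 0,
	      "MSG_QUEUE_DEPTH must be a power of two");

struct msg_queue {
	uint8_t buf[MSG_QUEUE_DEPTH];
	size_t head; /* total number of bytes written */
	size_t tail; /* total number of bytes read */
};

/* Storage lives in .bss: its size is known before the code ever runs. */
struct msg_queue sketch_queue;

/* Append one byte; returns false when the queue is full. */
bool msg_queue_push(struct msg_queue *q, uint8_t byte)
{
	assert(q != NULL);
	if (q->head - q->tail == MSG_QUEUE_DEPTH)
		return false;
	q->buf[q->head++ % MSG_QUEUE_DEPTH] = byte;
	return true;
}

/* Remove the oldest byte; returns false when the queue is empty. */
bool msg_queue_pop(struct msg_queue *q, uint8_t *byte)
{
	assert(q != NULL && byte != NULL);
	if (q->head == q->tail)
		return false;
	*byte = q->buf[q->tail++ % MSG_QUEUE_DEPTH];
	return true;
}
```

Running out of space shows up as an explicit `false` at a known call site,
rather than as fragmentation or a leak surfacing much later.
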
- TODO: talk about shared memory
- TODO: where/how we store *panic* *memory* and *sysjump* *parameters*.

### stacks

Each task has its own stack; in addition, there is a system stack used for
startup and interrupts/exceptions.

Note 1: Each task stack is relatively small (e.g. 512 bytes), so one needs to
be careful about stack usage when implementing features.

Note 2: At the same time, the total size of RAM used by stacks is a big chunk
of the total RAM consumption, so their sizes need to be carefully tuned
(please refer to the [debugging paragraph](#debugging) for additional input on
this topic).

## Firmware code organization and multiple copies

- TODO: Detail the classical RO / RW partitions and how we sysjump.

power management
----------------

- TODO: talk about the idle task + WFI (note: interrupts are disabled!)
- TODO: more about low power idle and the sleep-disable bitmap
- TODO: adjusting the microsecond timer at wake-up

debugging
---------

- TODO: our main tool: serial console ...
(but non-blocking / discard overflow, cflush DO/DONT)
- TODO: else JTAG stop and go: careful with watchdog and timer
- TODO: panics and software panics
- TODO: stack size tuning and canarying


- TODO: Address the rest of the comments from https://crrev.com/c/445941

[1]: bitmap: array of bits.

346