Lines Matching +full:kernel +full:- +full:policy
1 .. SPDX-License-Identifier: GPL-2.0
20 Operating Performance Points or P-states (in ACPI terminology). As a rule,
24 time (or the more power is drawn) by the CPU in the given P-state. Therefore
29 as possible and then there is no reason to use any P-states different from the
30 highest one (i.e. the highest-performance frequency/voltage configuration
38 put into different P-states.
41 capacity, so as to decide which P-states to put the CPUs into. Of course, since
51 The Linux kernel supports CPU performance scaling by means of the ``CPUFreq``
64 information on the available P-states (or P-state ranges in some cases) and
65 access platform-specific hardware interfaces to change CPU P-states as requested
70 performance scaling algorithms for P-state selection can be represented in a
71 platform-independent form in the majority of cases, so it should be possible
80 platform-independent way. For this reason, ``CPUFreq`` allows scaling drivers
85 ``CPUFreq`` Policy Objects
88 In some cases the hardware interface for P-state control is shared by multiple
90 control the P-state of multiple CPUs at the same time and writing to it affects
93 Sets of CPUs sharing hardware P-state control interfaces are represented by
100 CPUs share the same hardware P-state control interface, all of the pointers
104 of its user space interface is based on the policy concept.
123 logical CPU may be a physical single-core processor, or a single core in a
129 Once invoked, the ``CPUFreq`` core checks if the policy pointer is already set
130 for the given CPU and if so, it skips the policy object creation. Otherwise,
131 a new policy object is created and initialized, which involves the creation of
132 a new policy directory in ``sysfs``, and the policy pointer corresponding to
133 the given CPU is set to the new policy object's address in memory.
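The policy lookup described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: the ``Policy`` and ``ToyCore`` names, the ``shared_with`` map, and the directory-style naming of policies after the first CPU that created them are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical stand-in for a CPUFreq policy object."""
    name: str
    related_cpus: set = field(default_factory=set)


class ToyCore:
    """Toy model of the per-CPU policy pointer bookkeeping (illustrative only)."""

    def __init__(self, shared_with):
        # shared_with: hypothetical map from a CPU number to the set of CPUs
        # sharing its hardware P-state control interface (itself included).
        self.shared_with = shared_with
        self.policy_ptr = {}  # per-CPU policy pointers
        self.created = []     # every policy object created so far

    def register_cpu(self, cpu):
        policy = self.policy_ptr.get(cpu)
        if policy is not None:
            # Pointer already set for this CPU: skip policy object creation.
            return policy
        # Otherwise create a new policy object; the name is an assumption.
        policy = Policy(name=f"policy{cpu}",
                        related_cpus=set(self.shared_with[cpu]))
        self.created.append(policy)
        # Populate the pointers of all CPUs covered by the policy's mask.
        for related in policy.related_cpus:
            self.policy_ptr[related] = policy
        return policy
```

Registering two CPUs that share one control interface creates a single policy object, and both per-CPU pointers refer to it.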
135 Next, the scaling driver's ``->init()`` callback is invoked with the policy
139 to, represented by its policy object) and, if the policy object it has been
140 called for is new, to set parameters of the policy, like the minimum and maximum
142 the set of supported P-states is not a continuous range), and the mask of CPUs
143 that belong to the same policy (including both online and offline CPUs). That
144 mask is then used by the core to populate the policy pointers for all of the
147 The next major initialization step for a new policy object is to attach a
149 determined by the kernel command line or configuration, but it may be changed
150 later via ``sysfs``). First, a pointer to the new policy object is passed to
151 the governor's ``->init()`` callback which is expected to initialize all of the
152 data structures necessary to handle the given policy and, possibly, to add
154 invoking its ``->start()`` callback.
156 That callback is expected to register per-CPU utilization update callbacks for
157 all of the online CPUs belonging to the given policy with the CPU scheduler.
162 to determine the P-state to use for the given policy going forward and to
164 the P-state selection. The scaling driver may be invoked directly from
165 scheduler context or asynchronously, via a kernel thread or workqueue, depending
168 Similar steps are taken for policy objects that are not new, but were "inactive"
171 to use the scaling governor previously used with the policy that became
172 "inactive" (and is re-initialized now) instead of the default governor.
175 other CPUs sharing the policy object with it are online already, there is no
176 need to re-initialize the policy object at all. In that case, it is only
178 into account. That is achieved by invoking the governor's ``->stop()`` and
179 ``->start()`` callbacks, in this order, for the entire policy.
182 governor layer of ``CPUFreq`` and provides its own P-state selection algorithms.
184 new policy objects. Instead, the driver's ``->setpolicy()`` callback is invoked
185 to register per-CPU utilization update callbacks for each policy. These
187 governors, but in the |intel_pstate| case they both determine the P-state to
191 The policy objects created during CPU initialization and other data structures
193 (which happens when the kernel module containing it is unloaded, for example) or
194 when the last CPU belonging to the given policy is unregistered.
197 Policy Interface in ``sysfs``
200 During the initialization of the kernel, the ``CPUFreq`` core creates a
205 integer number) for every policy object maintained by the ``CPUFreq`` core.
209 associated with (or belonging to) the given policy. The ``policyX`` directories
210 in :file:`/sys/devices/system/cpu/cpufreq` each contain policy-specific
211 attributes (files) to control ``CPUFreq`` behavior for the corresponding policy
216 and what scaling governor is attached to the given policy. Some scaling drivers
217 also add driver-specific attributes to the policy directories in ``sysfs`` to
218 control policy-specific aspects of driver behavior.
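As a rough illustration of how the policy directories can be inspected from user space, the following sketch walks the ``policyX`` directories and reads a few attributes. The ``read_policies()`` helper and its ``root`` and ``attrs`` parameters are hypothetical names. The attribute files are the ones named in this document, and any of them may be absent on a given system, in which case ``None`` is reported.

```python
from pathlib import Path

# Location of the policy directories as given in this document.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")


def read_policies(root=CPUFREQ_ROOT,
                  attrs=("scaling_governor",
                         "scaling_min_freq",
                         "scaling_max_freq")):
    """Return {policy directory name: {attribute: value or None}}."""
    result = {}
    root = Path(root)
    if not root.is_dir():
        # No CPUFreq directory here (for example, no scaling driver loaded).
        return result
    for policy_dir in sorted(root.glob("policy[0-9]*")):
        if not policy_dir.is_dir():
            continue
        values = {}
        for attr in attrs:
            try:
                values[attr] = (policy_dir / attr).read_text().strip()
            except OSError:
                # Attribute missing or unreadable for this policy.
                values[attr] = None
        result[policy_dir.name] = values
    return result


if __name__ == "__main__":
    for name, values in read_policies().items():
        print(name, values)
```

Passing a different ``root`` makes it possible to exercise the helper against a copy of the directory tree instead of the live ``sysfs``.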
224 List of online CPUs belonging to this policy (i.e. sharing the hardware
225 performance scaling interface represented by the ``policyX`` policy
235 BIOS/HW-based mechanisms.
244 Current frequency of the CPUs belonging to this policy as obtained from
252 Maximum possible operating frequency the CPUs belonging to this policy
256 Minimum possible operating frequency the CPUs belonging to this policy
260 The time it takes to switch the CPUs belonging to this policy from one
261 P-state to another, in nanoseconds.
264 work with the `ondemand`_ governor, -1 (:c:macro:`CPUFREQ_ETERNAL`)
268 List of all (online and offline) CPUs belonging to this policy.
271 List of available frequencies of the CPUs belonging to this policy
275 List of ``CPUFreq`` scaling governors present in the kernel that can
276 be attached to this policy or (if the |intel_pstate| scaling driver is
278 applied to this policy.
281 kernel module for the governor held by it to become available and be
285 Current frequency of all of the CPUs belonging to this policy (in kHz).
287 In the majority of cases, this is the frequency of the last P-state
302 The scaling governor currently attached to this policy or (if the
304 provided by the driver that is currently applied to this policy.
306 This attribute is read-write and writing to it will cause a new scaling
307 governor to be attached to this policy or a new scaling algorithm
314 Maximum frequency the CPUs belonging to this policy are allowed to be
317 This attribute is read-write and writing a string representing an
322 Minimum frequency the CPUs belonging to this policy are allowed to be
325 This attribute is read-write and writing a string representing a
326 non-negative integer to it will cause a new limit to be set (it must not
331 is attached to the given policy.
334 be written to in order to set a new frequency for the policy.
344 Scaling governors are attached to policy objects and different policy objects
348 The scaling governor for a given policy object can be changed at any time with
349 the help of the ``scaling_governor`` policy attribute in ``sysfs``.
351 Some governors expose ``sysfs`` attributes to control or fine-tune the scaling
353 tunables, can be either global (system-wide) or per-policy, depending on the
355 per-policy, they are located in a subdirectory of each policy directory.
362 ---------------
364 When attached to a policy object, this governor causes the highest frequency,
365 within the ``scaling_max_freq`` policy limit, to be requested for that policy.
367 The request is made once at the time the governor for the policy is set to
369 policy limits change after that.
372 -------------
374 When attached to a policy object, this governor causes the lowest frequency,
375 within the ``scaling_min_freq`` policy limit, to be requested for that policy.
377 The request is made once at the time the governor for the policy is set to
379 policy limits change after that.
382 -------------
385 to set the CPU frequency for the policy it is attached to by writing to the
386 ``scaling_setspeed`` attribute of that policy.
389 -------------
397 should be changed for a given policy (that depends on whether or not the driver
403 the allowed maximum (that is, the ``scaling_max_freq`` policy limit). In turn,
405 Per-Entity Load Tracking (PELT) metric for the root control group of the
406 given CPU as the CPU utilization estimate (see the *Per-entity load tracking*
414 policy (if the PELT number is frequency-invariant), or the current CPU frequency
419 "IO-wait boosting". That happens when the :c:macro:`SCHED_CPUFREQ_IOWAIT` flag
442 ------------
448 time in which the given CPU was not idle. The ratio of the non-idle (active)
451 If this governor is attached to a policy shared by multiple CPUs, the load is
453 for the entire policy.
456 invoked asynchronously (via a workqueue) and CPU P-states are updated from
459 relatively often and the CPU P-state updates triggered by it can be relatively
465 the value of the ``cpuinfo_max_freq`` policy attribute corresponds to the load of
466 1 (or 100%), and the value of the ``cpuinfo_min_freq`` policy attribute
469 it is allowed to use (the ``scaling_max_freq`` policy limit).
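The proportional mapping described above can be sketched as a linear interpolation. This is an illustration under stated assumptions rather than the governor's actual code: the ``proportional_freq()`` helper is hypothetical, a load of 0 is assumed to map to ``cpuinfo_min_freq``, and the ``up_threshold`` default of 0.8 is an illustrative value not taken from this document.

```python
def proportional_freq(load, cpuinfo_min_freq, cpuinfo_max_freq,
                      scaling_max_freq, up_threshold=0.8):
    """Map a load in [0, 1] to a frequency in kHz (illustrative sketch)."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    if load > up_threshold:
        # Above the threshold: go straight to the highest allowed frequency.
        return scaling_max_freq
    # Linear interpolation: load 1 corresponds to cpuinfo_max_freq and
    # (by assumption) load 0 corresponds to cpuinfo_min_freq.
    target = cpuinfo_min_freq + load * (cpuinfo_max_freq - cpuinfo_min_freq)
    # Never exceed the scaling_max_freq policy limit.
    return min(int(target), scaling_max_freq)
```

For instance, with a 1000000 to 3000000 kHz hardware range and no tighter policy limit, a load of 0.5 maps to 2000000 kHz in this model.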
479 to ``cpuinfo_transition_latency`` on each policy this governor is
483 If this tunable is per-policy, the following shell command sets the time
491 will set the frequency to the maximum value allowed for the policy.
528 f * (1 - ``powersave_bias`` / 1000)
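As a worked example of the expression above, the sketch below applies the reduction factor using integer arithmetic. The ``biased_freq()`` helper is a hypothetical name, and the 0 to 1000 range for ``powersave_bias`` is an assumption based on the division by 1000 in the formula.

```python
def biased_freq(f_khz, powersave_bias):
    """Return f * (1 - powersave_bias / 1000), rounded down to whole kHz."""
    if not 0 <= powersave_bias <= 1000:
        raise ValueError("powersave_bias is assumed to be within 0..1000")
    # Same expression as f * (1 - bias / 1000), kept in integers.
    return f_khz * (1000 - powersave_bias) // 1000
```

With an original target of 2000000 kHz and a ``powersave_bias`` of 100, the reduced target is 1800000 kHz, and a bias of 0 leaves the target unchanged.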
542 The performance of a workload with the sensitivity of 0 (memory-bound or
543 IO-bound) is not expected to increase at all as a result of increasing
545 (CPU-bound) are expected to perform much better if the CPU frequency is
551 target, so as to avoid over-provisioning workloads that will not benefit
555 ----------------
564 battery-powered). To achieve that, it changes the frequency in relatively
565 small steps, one step at a time, up or down, depending on whether or not a
572 allowed to set (the ``scaling_max_freq`` policy limit), between 0 and
579 ``scaling_max_freq`` policy limits.
602 ----------
611 "Turbo-Core" or (in technical documentation) "Core Performance Boost" and so on.
616 The frequency boost mechanism may be either hardware-based or software-based.
617 If it is hardware-based (e.g. on x86), the decision to trigger the boosting is
620 limits). If it is software-based (e.g. on ARM), the scaling driver decides
624 -------------------------------
629 but provides a driver-specific interface for controlling it, like
634 trigger boosting (in the hardware-based case), or the software is allowed to
635 trigger boosting (in the software-based case). It does not mean that boosting
646 --------------------------------
676 single-thread performance may vary because of it, which may lead to
682 -----------------------
684 The AMD powernow-k8 scaling driver supports a ``sysfs`` knob very similar to
688 If present, that knob is located in every ``CPUFreq`` policy directory in
691 implementation, however, works on a system-wide basis and setting that knob
692 for one policy causes the same value of it to be set for all of the other
696 hardware feature, but it may be configured out of the kernel (via the
711 .. [1] Jonathan Corbet, *Per-entity load tracking*,