Lines Matching +full:system +full:- +full:management

2 PCI Power Management
8 management. Based on previous work by Patrick Mochel
11 This document only covers the aspects of power management specific to PCI
13 power management refer to Documentation/driver-api/pm/devices.rst and
18 1. Hardware and Platform Support for PCI Power Management
19 2. PCI Subsystem and Device Power Management
20 3. PCI Device Drivers and Power Management
24 1. Hardware and Platform Support for PCI Power Management
27 1.1. Native and Platform-Based Power Management
28 -----------------------------------------------
30 In general, power management is a feature allowing one to save energy by putting
31 devices into states in which they draw less power (low-power states) at the
34 Usually, a device is put into a low-power state when it is underutilized or
36 again, it has to be put back into the "fully functional" state (full-power
41 PCI devices may be put into low-power states in two ways, by using the device
42 capabilities introduced by the PCI Bus Power Management Interface Specification,
44 approach, which is referred to as native PCI power management (native PCI PM)
51 Power Management Events (PMEs) to let the kernel know about external events
53 to put the device that sent it into the full-power state. However, the PCI Bus
54 Power Management Interface Specification doesn't define any standard method of
55 delivering the PME from the device to the CPU and the operating system kernel.
68 Thus in many situations both the native and the platform-based power management
71 1.2. Native PCI Power Management
72 --------------------------------
74 The PCI Bus Power Management Interface Specification (PCI PM Spec) was
77 management.
81 Spec, it has an 8-byte power management capability field in its PCI
83 features related to the native PCI power management.
85 The PCI PM Spec defines 4 operating states for devices (D0-D3) and for buses
86 (B0-B3). The higher the number, the less power is drawn by the device or bus
88 the device or bus to return to the full-power state (D0 or B0, respectively).
98 PCI bus power management, however, is not supported by the Linux kernel at the
101 Note that every PCI device can be in the full-power state (D0) or in D3cold,
107 supported low-power states (except for D3cold). While in D1-D3hot the
112 forth between D0 and the supported low-power states (except for D3cold) and the
115 +----------------------------+
117 +----------------------------+
119 +----------------------------+
121 +----------------------------+
123 +----------------------------+
125 +----------------------------+
129 a full power-on reset sequence and the power-on defaults are restored to the
133 while in any power state (D0-D3), but they are not required to be capable
139 1.3. ACPI Device Power Management
140 ---------------------------------
142 The platform firmware support for the power management of PCI devices is
143 system-specific. However, if the system in question is compliant with the
145 majority of x86-based systems, it is supposed to implement device power
146 management interfaces defined by the ACPI standard.
150 putting a device into a low-power state. These control methods are encoded
151 using a special byte-code language called the ACPI Machine Language (AML) and
156 on the system design in a system-specific fashion.
163 ACPI methods used for device power management fall into that category.
167 D0-D3 states (although the difference between D3hot and D3cold is not taken
178 is going to be put into a low-power state (D1-D3) and is supposed to generate
187 system-wide transition into a sleep state or back into the working state. ACPI
188 defines four system sleep states, S1, S2, S3, and S4, and denotes the system
189 working state as S0. In general, the target system sleep (or working) state
193 If the device is required to wake up the system from the target sleep state, the
195 target state of the system. The kernel is then supposed to use the device's
201 ---------------------
205 putting the device into a low-power state, have to be caught and handled as
206 appropriate. If they are sent while the system is in the working state
208 put the devices generating them into the full-power state and take care of the
209 events that triggered them. In turn, if they are sent while the system is
210 sleeping, they should cause the system's core logic to trigger wakeup.
212 On ACPI-based systems wakeup signals sent by conventional PCI devices are
213 converted into ACPI General-Purpose Events (GPEs) which are hardware signals
214 from the system core logic generated in response to various events that need to
218 and event sources is recorded in the system's ACPI BIOS from where it can be
221 If a PCI device known to the system's ACPI BIOS signals wakeup, the GPE
225 example, native PCI PMEs from devices unknown to the system's ACPI BIOS may be
228 A GPE may be triggered when the system is sleeping (i.e. when it is in one of
229 the ACPI S1-S4 states), in which case system wakeup is started by its core logic
230 (the device that was the source of the signal causing the system wakeup to occur
234 Usually, however, GPEs are also triggered when the system is in the working
235 state (ACPI S0) and in that case the system's core logic generates a System
240 events occurring while the system is in the working state are referred to as
244 conventional PCI devices on systems that are not ACPI-based, but there is one
247 root ports. For conventional PCI devices native PMEs are out-of-band, so they
249 may be routed directly to the system's core logic), but for PCI Express devices
250 they are in-band messages that have to pass through the PCI Express hierarchy,
261 In principle the native PCI Express PME signaling may also be used on ACPI-based
262 systems along with the GPEs, but to use it the kernel has to ask the system's
269 2. PCI Subsystem and Device Power Management
272 2.1. Device Power Management Callbacks
273 --------------------------------------
275 The PCI Subsystem participates in the power management of PCI devices in a
277 the device power management core (PM core) and PCI device drivers.
280 pointers to several device power management callbacks::
303 device power management and they, in turn, execute power management callbacks
304 provided by PCI device drivers. They also perform power management operations
323 unsigned int d3hot_delay; /* D3hot->D0 transition time in ms */
331 --------------------------
333 The PCI subsystem's first task related to device power management is to
334 prepare the device for power management and initialize the fields of struct
339 and if that's the case the offset of its power management capability structure
341 pci_dev object. Next, the function checks which PCI low-power states are
342 supported by the device and from which low-power states the device can generate
343 native PCI PMEs. The power management fields of the device's struct pci_dev and
350 device's struct pci_dev and uses the firmware-provided method to prevent the
353 At this point the device is ready for power management. For driverless devices,
355 during system-wide transitions to a sleep state and back to the working state.
357 2.3. Runtime Device Power Management
358 ------------------------------------
360 The PCI subsystem plays a vital role in the runtime power management of PCI
361 devices. For this purpose it uses the general runtime power management
363 Namely, it provides subsystem-level callbacks::
371 in low-power states, which at the time of this writing works for both the native
372 PCI Express PME signaling and the ACPI GPE-based wakeup signaling described in
375 First, a PCI device is put into a low-power state, or suspended, with the help
378 driver has to provide a pm->runtime_suspend() callback (see below), which is
382 the target low-power state.
384 The low-power state to put the device into is the lowest-power (highest number)
386 system-dependent and is determined by the PCI subsystem on the basis of the
388 device for signaling wakeup and put it into the selected low-power state, the
392 It is expected that the device driver's pm->runtime_suspend() callback will
394 low-power state. The driver ought to leave these tasks to the PCI subsystem
400 driver provides a pm->runtime_resume() callback (see below). However, before
402 back into the full-power state, prevents it from signaling wakeup while in that
404 callback need not worry about the PCI-specific aspects of the device resume.
416 and pm_request_idle(), executes the device driver's pm->runtime_idle()
424 pm->runtime_idle() callback.
426 2.4. System-Wide Power Transitions
427 ----------------------------------
428 There are a few different types of system-wide power transitions, described in
429 Documentation/driver-api/pm/devices.rst. Each of them requires devices to be
430 handled in a specific way and the PM core executes subsystem-level power
431 management callbacks for this purpose. They are executed in phases such that
432 each phase involves executing the same subsystem-level callback for every device
436 2.4.1. System Suspend
439 When the system is going into a sleep state in which the contents of memory will
440 be preserved, such as one of the ACPI sleep states S1-S3, the phases are:
452 driver's pm->prepare() callback if defined (i.e. if the driver's struct
462 bridges are ignored by this routine). Next, the device driver's pm->suspend()
480 returns success. Otherwise the device driver's pm->suspend_noirq() callback is
485 a low-power state.
487 The low-power state to put the device into is the lowest-power (highest number)
488 state from which it can signal wakeup while the system is in the target sleep
490 signaling wakeup is system-dependent and determined by the PCI subsystem, which
491 is also responsible for preparing the device to signal wakeup from the system's
494 PCI device drivers (that don't implement legacy power management callbacks) are
496 into low-power states. However, if one of the driver's suspend callbacks
497 (pm->suspend() or pm->suspend_noirq()) saves the device's standard configuration
499 to signal wakeup and put into a low-power state by the driver (the driver is
504 2.4.2. System Resume
507 When the system is undergoing a transition from a sleep state in which the
509 S1-S3, into the working state (ACPI S0), the phases are:
520 The pci_pm_resume_noirq() routine first puts the device into the full-power
524 legacy PCI power management callbacks (this way all PCI devices are in the
525 full-power state and their standard configuration registers have been restored
528 by drivers whose devices are still suspended). If legacy PCI power management
531 device driver's pm->resume_noirq() callback is executed, if defined, and its
538 device's driver implements legacy PCI power management callbacks (see
541 its driver's pm->resume() callback is executed, if defined (the callback's
549 The pci_pm_complete() routine only executes the device driver's pm->complete()
552 2.4.3. System Hibernation
555 System hibernation is more complicated than system suspend, because it requires
556 a system image to be created and written into a persistent storage medium. The
561 the time of this writing the image creation requires at least 50% of system RAM
572 This means that the prepare phase is exactly the same as for system suspend.
576 the device driver's pm->freeze() callback, if defined, instead of pm->suspend(),
577 and it doesn't apply the suspend-related hardware quirks. It is executed
582 pci_pm_suspend_noirq(), but it calls the device driver's pm->freeze_noirq()
583 routine instead of pm->suspend_noirq(). It also doesn't attempt to prepare the
584 device for signaling wakeup and put it into a low-power state. Still, it saves
605 configuration registers. It also executes the device driver's pm->thaw_noirq()
606 callback, if defined, instead of pm->resume_noirq().
609 driver's pm->thaw() callback instead of pm->resume(). It is executed
613 The complete phase is the same as for system resume.
615 After saving the image, devices need to be powered down before the system can
616 enter the target sleep state (ACPI S4 for ACPI-based systems). This is done in
621 where the prepare phase is exactly the same as for system suspend. The other
623 The PCI subsystem-level callbacks they correspond to::
632 2.4.4. System Restore
635 System restore requires a hibernation image to be loaded into memory and the
636 pre-hibernation memory contents to be restored before the pre-hibernation system
639 As described in Documentation/driver-api/pm/devices.rst, the hibernation image
653 Should the restoration of the pre-hibernation memory contents fail, the boot
658 If the pre-hibernation memory contents are restored successfully, which is the
660 responsible for bringing the system back to the working state. To achieve this,
661 it must restore the devices' pre-hibernation functionality, which is done much
675 respectively, but they execute the device driver's pm->restore_noirq() and
676 pm->restore() callbacks, if available.
678 The complete phase is carried out in exactly the same way as during system
682 3. PCI Device Drivers and Power Management
685 3.1. Power Management Callbacks
686 -------------------------------
688 PCI device drivers participate in power management by providing callbacks to be
689 executed by the PCI subsystem's power management routines described above and by
690 controlling the runtime power management of their devices.
692 At the time of this writing there are two ways to define power management
694 dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and
697 runtime power management callbacks and is not really suitable for any new
702 containing pointers to power management (PM) callbacks that will be executed by
717 The prepare() callback is executed during system suspend, during hibernation
718 (when a hibernation image is about to be created), during power-off after
719 saving a hibernation image and during system restore, when a hibernation image
731 in Documentation/driver-api/pm/notifiers.rst).
736 The suspend() callback is only executed during system suspend, after prepare()
737 callbacks have been executed for all devices in the system.
740 low-power state by the PCI subsystem. It is not required (in fact it is even
742 configuration registers of the device, prepare it for waking up the system, or
743 put it into a low-power state. All of these operations can very well be taken
749 registers, to prepare it for system wakeup (if necessary), and to put it into a
750 low-power state, respectively. Moreover, if the driver calls pci_save_state(),
756 can be invoked to handle an interrupt from the device, so all suspend-related
763 The suspend_noirq() callback is only executed during system suspend, after
764 suspend() callbacks have been executed for all devices in the system and
775 The freeze() callback is hibernation-specific and is executed in two situations,
777 in preparation for the creation of a system image, and during restore,
778 after a system image has been loaded into memory from persistent storage and the
783 the driver takes the responsibility for putting the device into a low-power
786 In that case the freeze() callback should not prepare the device for system wakeup
787 or put it into a low-power state. Still, either it or freeze_noirq() should
793 The freeze_noirq() callback is hibernation-specific. It is executed during
795 devices in preparation for the creation of a system image, and during restore,
796 after a system image has been loaded into memory and after prepare() and
810 The poweroff() callback is hibernation-specific. It is executed when the system
818 into a low-power state itself instead of allowing the PCI subsystem to do that,
820 pci_set_power_state() to prepare the device for system wakeup and to put it
821 into a low-power state, respectively, but it need not save the device's standard
827 The poweroff_noirq() callback is hibernation-specific. It is executed after
828 poweroff() callbacks have been executed for all devices in the system.
840 The resume_noirq() callback is only executed during system resume, after the
841 PM core has enabled the non-boot CPUs. The driver's interrupt handler will not
846 state in the resume_noirq phase of system resume and restores their standard
854 The resume() callback is only executed during system resume, after
855 resume_noirq() callbacks have been executed for all devices in the system and
858 This callback is responsible for restoring the pre-suspend configuration of the
865 The thaw_noirq() callback is hibernation-specific. It is executed after a
866 system image has been created and the non-boot CPUs have been enabled by the PM
868 loading of a hibernation image fails during system restore (it is then executed
869 after enabling the non-boot CPUs). The driver's interrupt handler will not be
880 The thaw() callback is hibernation-specific. It is executed after thaw_noirq()
881 callbacks have been executed for all devices in the system and after device
884 This callback is responsible for restoring the pre-freeze configuration of
890 The restore_noirq() callback is hibernation-specific. It is executed in the
892 the image kernel and the non-boot CPUs have been enabled by the image kernel's
898 suspend-resume cycle.
906 The restore() callback is hibernation-specific. It is executed after
907 restore_noirq() callbacks have been executed for all devices in the system and
922 - during system resume, after resume() callbacks have been executed for all
924 - during hibernation, before saving the system image, after thaw() callbacks
926 - during system restore, when the system is going back to its pre-hibernation
939 The runtime_suspend() callback is specific to device runtime power management
941 device is about to be suspended (i.e. quiesced and put into a low-power state)
945 put into a low-power state, but it must allow the PCI subsystem to perform all
946 of the PCI-specific actions necessary for suspending the device.
953 (i.e. put into the full-power state and programmed to process I/O normally) at
957 device after it has been put into the full-power state by the PCI subsystem.
997 3.1.19. Driver Flags for Power Management
1001 power management for the devices by the core itself and by middle layer code
1007 direct-complete mechanism allowing device suspend/resume callbacks to be skipped
1008 if the device is in runtime suspend when the system suspend starts. That also
1013 value from pci_pm_prepare() only if the ->prepare callback provided by the
1015 out from using the direct-complete mechanism dynamically (whereas setting
1016 DPM_FLAG_NO_DIRECT_COMPLETE means permanent opt-out).
1019 perspective the device can be safely left in runtime suspend during system
1021 to avoid resuming the device from runtime suspend unless there are PCI-specific
1024 suspend during the "late" phase of the system-wide transition under way.
1031 in suspend after a system-wide transition into the working state. This flag is
1040 3.2. Device Runtime Power Management
1041 ------------------------------------
1043 In addition to providing device power management callbacks PCI device drivers
1044 are responsible for controlling the runtime power management (runtime PM) of
1057 device should really be suspended and return -EAGAIN if that is not the case).
1070 zero for the device and it will never be runtime-suspended. The simplest
1084 should let user space or some platform-specific code do that (user space can
1102 by work items put into the power management workqueue, pm_wq. Although there
1103 are a few situations in which power management requests are automatically
1106 idle), device drivers are generally responsible for queuing power management
1124 PCI Bus Power Management Interface Specification, Rev. 1.2
1130 Documentation/driver-api/pm/devices.rst