Commit ef4ad74

feat(linux): Add s2idle docs
Document all about s2idle and PSCI and how the whole stack helps us in selecting between low power modes on the TI AM62L

Signed-off-by: Dhruva Gole <d-gole@ti.com>
1 parent ea4cabe commit ef4ad74

3 files changed

+299
-0
lines changed

configs/AM62LX/AM62LX_linux_toc.txt

Lines changed: 1 addition & 0 deletions
@@ -76,6 +76,7 @@ linux/Foundational_Components_Power_Management
 linux/Foundational_Components/Power_Management/pm_overview
 linux/Foundational_Components/Power_Management/pm_cpuidle
 linux/Foundational_Components/Power_Management/pm_am62lx_low_power_modes
+linux/Foundational_Components/Power_Management/pm_psci_s2idle
 linux/Foundational_Components/Power_Management/pm_wakeup_sources

 #linux/Foundational_Components/System_Security/SELinux

Lines changed: 297 additions & 0 deletions
@@ -0,0 +1,297 @@
.. _pm_s2idle_psci:

#############################################
Suspend-to-Idle (S2Idle) and PSCI Integration
#############################################

**********************************
Suspend-to-Idle (S2Idle) Overview
**********************************

Suspend-to-Idle (s2idle), also known as "freeze," is a generic, pure-software, lightweight variant of system suspend.
In this state, the Linux kernel freezes user space tasks, suspends devices, and then puts all CPUs into their deepest available idle state.

**********************************
S2Idle vs Deep Sleep (mem)
**********************************

On ARM64 platforms, both ``s2idle`` and ``deep`` states can achieve similar power savings (e.g., suspending to RAM / DDR Self-Refresh).
The primary differences lie in the software execution flow, specifically how CPUs are managed and which PSCI APIs are invoked.

.. list-table:: S2Idle vs Deep Sleep (ARM64)
   :widths: 20 40 40
   :header-rows: 1

   * - Feature
     - s2idle (Suspend-to-Idle)
     - deep (Suspend-to-RAM)

   * - **Kernel String**
     - ``s2idle`` or ``freeze``
     - ``deep`` or ``mem``

   * - **Non-boot CPUs**
     - **Kept Online**: Non-boot CPUs are put into a deep idle state but remain logically online.
     - **Offlined**: Non-boot CPUs are hot-unplugged (removed) from the system via ``CPU_OFF``.

   * - **Entry Path**
     - **cpuidle**: Uses the standard CPU idle framework and governance. It runtime-suspends each driver to make sure it is idle.
     - **suspend_ops**: Uses platform-specific suspend operations, such as each driver's suspend ops, with no governor involvement.

   * - **PSCI Call**
     - ``CPU_SUSPEND``: Invoked for every core (Last Man Standing logic coordinates the cluster/system depth).
     - ``SYSTEM_SUSPEND``: Typically invoked by the last active CPU after others are offlined.

   * - **Resume Flow**
     - **Fast**: CPUs exit the idle loop immediately upon interrupt. Context is preserved.
     - **Slow**: Kernel must serially bring secondary CPUs back online (**Hotplug**). This involves recreating
       kernel threads, re-enabling interrupts, resuming each driver, and restoring per-CPU state for every non-boot core.

   * - **Latency**
     - Lower
     - Higher, primarily due to the overhead of **CPU Hotplug** for non-boot CPUs

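As a brief illustration of the kernel strings in the table, the sketch below parses a string in the style of ``/sys/power/mem_sleep``, where the kernel lists the supported suspend variants and marks the active one with brackets. The helper name ``parse_mem_sleep`` and the sample string are illustrative assumptions, not part of the SDK.

```python
def parse_mem_sleep(text):
    """Split a mem_sleep-style string into (active, supported) variants."""
    active = None
    supported = []
    for token in text.split():
        # The kernel marks the currently selected variant with brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        supported.append(token)
    return active, supported


# Sample string only; on a target this would be read from /sys/power/mem_sleep.
active, supported = parse_mem_sleep("s2idle [deep]")
print(f"active={active} supported={supported}")
```

On a running system, the same helper could be fed the file contents to check which variant a write of ``mem`` to ``/sys/power/state`` would select.
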
**********************
PSCI as the Enabler
**********************

The Power State Coordination Interface (PSCI) is an ARM-defined standard that acts as the fundamental
enabler for s2idle on ARM platforms. PSCI defines a standardized firmware interface that allows the
Operating System (OS) to request power states without needing intimate knowledge of the underlying
SoC.

**s2idle Call Flow:**

.. code-block:: text

   Linux Kernel                        PSCI Firmware (TF-A)
   ============                        ====================

   1. Freeze tasks
          |
          v
   2. Suspend devices
          |
          v
   3. cpuidle driver  ----------->  CPU_SUSPEND (SMC/HVC)
      (per CPU)                            |
          |                                v
          |                         Coordinate power
          |                         state requests
          |                                |
          |                                v
          |                         Enter low-power
          |                         hardware state
          |
          |<--------- Resume ---------
          |
          v
   4. Resume devices
          |
          v
   5. Thaw tasks

The ``cpuidle`` driver calls the PSCI ``CPU_SUSPEND`` API to transition the CPU (and potentially
higher-level topology nodes like clusters) into a low-power state. The effectiveness of s2idle
depends heavily on the PSCI implementation's ability to coordinate these requests and enter
the deepest possible hardware state.

**************************
OS Initiated (OSI) Mode
**************************

PSCI 1.0 introduced **OS Initiated (OSI)** mode, which shifts the responsibility of power state coordination from the platform firmware to the Operating System.

In the default **Platform Coordinated (PC)** mode, the OS independently requests a state for each core. The firmware then aggregates these requests (voting) to
determine if a cluster or the system can be powered down.

In **OS Initiated (OSI)** mode, the OS explicitly manages the hierarchy. The OS determines when the last core in a power domain (e.g., a cluster) is going idle
and explicitly requests the power-down of that domain.

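The voting performed in Platform Coordinated mode can be pictured with a deliberately simplified model: each core votes for the deepest power level it is willing to see powered down, and the firmware can go no deeper than the shallowest vote. The level constants and the function name below are assumptions made for illustration; real firmware tracks this per power domain.

```python
RUNNING = -1                      # an active core votes against any power-down
CORE, CLUSTER, SYSTEM = 0, 1, 2   # power levels used in this toy model


def platform_coordinated_level(votes):
    """Return the deepest level every core has agreed to, or RUNNING."""
    if not votes:
        raise ValueError("at least one core vote is required")
    # The shallowest vote wins: one busy or shallow core holds back the rest.
    return min(votes)


# Core 0 allows a cluster power-down, core 1 only a core power-down.
print(platform_coordinated_level([CLUSTER, CORE]))
```

In OSI mode this aggregation moves into the OS, which is why the firmware only needs to check the final request for consistency.
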
Why OSI?
========

OSI mode allows the OS to make better power decisions because it has visibility into:

* **Task Scheduling:** The OS knows when other cores will wake up.
* **Wakeup Latencies:** The OS can respect Quality of Service (QoS) latency constraints more accurately.
* **Usage Patterns:** The OS can predict idle duration better than firmware.

OSI Sequence
============

The coordination in OSI mode follows a specific "Last Man Standing" sequence. The OS tracks the state of all cores in a topology node (e.g., a cluster).

.. code-block:: text

   OSI "Last Man Standing" Flow

   Cluster with 2 Cores          OS Action                     PSCI Request
   ====================          =========                     =============

   1. Core 0,1: ACTIVE
          |
          | Core 0 becomes idle
          v
   2. Core 0: IDLE         -->   OS requests local       -->   CPU_SUSPEND
      Core 1: ACTIVE             Core Power Down               (Core PD only)
                                 Cluster stays ON
          |
          | Core 1 (LAST) becomes idle
          v
   3. Core 0,1: IDLE       -->   OS recognizes           -->   CPU_SUSPEND
                                 "Last Man" scenario           (Composite State)
                                 Requests Composite:
                                 - Core 1: PD                  Core: PD
                                 - Cluster: PD                 Cluster: PD
                                 - System: PD                  System: PD
          |
          v
   4. Firmware Verification -->  PSCI firmware checks
      & System Power Down        all cores/clusters idle
                                 If verified: Power down
                                 entire system
                                 If not: Deny request
                                 (race condition)

**Detailed Steps:**

1. **First Core Idle:** When the first core in a cluster goes idle, the OS requests a local idle state
   for that core (e.g., Core Power Down) but keeps the cluster running.

2. **Last Core Idle:** When the *last* active core in the cluster is ready to go idle, the OS recognizes
   that the entire cluster, and potentially the system, can now be powered down.

3. **Composite Request:** The last core issues a ``CPU_SUSPEND`` call that requests a **composite state**:

   * **Core State:** Power Down
   * **Cluster State:** Power Down
   * **System State:** Power Down (as demonstrated in the diagram)

4. **Firmware Enforcement:** The PSCI firmware verifies that all other cores and clusters in the requested node are indeed idle.
   If they are not, the request is denied (to prevent race conditions).

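The sequence above can be sketched as a toy model with an OS-side tracker that decides when a core is the "last man" and a firmware-side check that rejects inconsistent requests. The class names, the two-level hierarchy, and the use of ``-3`` as the denial code are assumptions for illustration; this is not the kernel's or TF-A's implementation.

```python
PSCI_SUCCESS = 0
PSCI_DENIED = -3      # assumed return value for a rejected request

CORE, CLUSTER = 0, 1  # power levels used in this toy model


class Firmware:
    """Firmware-side view of a single cluster."""

    def __init__(self, num_cores):
        self.idle = [False] * num_cores

    def cpu_suspend(self, core, level):
        others_idle = all(
            state for i, state in enumerate(self.idle) if i != core
        )
        if level >= CLUSTER and not others_idle:
            return PSCI_DENIED  # OS view was stale: another core is active
        self.idle[core] = True
        return PSCI_SUCCESS


class OsCoordinator:
    """OS-side last-man tracking for the same cluster."""

    def __init__(self, num_cores):
        self.num_cores = num_cores
        self.idle = set()

    def level_for(self, core):
        others = self.idle - {core}
        last_man = len(others) == self.num_cores - 1
        return CLUSTER if last_man else CORE

    def mark_idle(self, core):
        self.idle.add(core)


fw = Firmware(2)
osc = OsCoordinator(2)

level0 = osc.level_for(0)             # first core: core-level request only
result0 = fw.cpu_suspend(0, level0)
osc.mark_idle(0)

level1 = osc.level_for(1)             # last core: composite (cluster) request
result1 = fw.cpu_suspend(1, level1)

# Race: a cluster-level request while core 0 is still active is denied.
raced = Firmware(2).cpu_suspend(1, CLUSTER)
print(level0, result0, level1, result1, raced)
```
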
*************************************
Understanding the Suspend Parameter
*************************************

The ``power_state`` parameter passed to ``CPU_SUSPEND`` is the key to requesting these states.
In OSI mode, this parameter must encode the intent for the entire hierarchy.

Power State Parameter Encoding
================================

The ``power_state`` is a 32-bit parameter defined by the ARM PSCI specification (ARM DEN0022C).
It has two encoding formats, controlled by the platform's build configuration.

Standard Format
===============

This is the default format used by most platforms:

.. code-block:: text

    31           26 25  24 23            17  16  15                   0
   +---------------+------+----------------+----+----------------------+
   |   Reserved    | Pwr  |    Reserved    | ST |       State ID       |
   |  (must be 0)  | Level|  (must be 0)   |    |  (platform-defined)  |
   +---------------+------+----------------+----+----------------------+

.. list-table:: Standard Format Bit Fields
   :widths: 20 80
   :header-rows: 1

   * - Bit Field
     - Description

   * - **[31:26]**
     - **Reserved**: Must be zero.

   * - **[25:24]**
     - **Power Level**: Indicates the deepest power domain level that can be powered down.

       * ``0``: CPU/Core level
       * ``1``: Cluster level
       * ``2``: System level
       * ``3``: Higher levels (platform-specific)

   * - **[23:17]**
     - **Reserved**: Must be zero.

   * - **[16]**
     - **State Type (ST)**: Type of power state.

       * ``0``: Standby or Retention (low latency, context preserved)
       * ``1``: Power Down (higher latency, may lose context)

   * - **[15:0]**
     - **State ID**: Platform-specific identifier for the requested power state. The OS and
       platform firmware must agree on the meaning of these values, typically defined through
       device tree bindings.

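The bit fields above can be packed with a few shifts. The following is a minimal sketch of the standard format only; the function name ``encode_power_state`` is an assumption, and the State ID in the usage line is a made-up value.

```python
def encode_power_state(power_level, state_type, state_id):
    """Pack the standard-format fields into a 32-bit power_state value."""
    if not 0 <= power_level <= 3:
        raise ValueError("power_level must fit in bits [25:24]")
    if state_type not in (0, 1):
        raise ValueError("state_type is a single bit (bit 16)")
    if not 0 <= state_id <= 0xFFFF:
        raise ValueError("state_id must fit in bits [15:0]")
    # Reserved bits [31:26] and [23:17] are left as zero.
    return (power_level << 24) | (state_type << 16) | state_id


# Hypothetical cluster-level power-down request with State ID 0x0011.
print(hex(encode_power_state(1, 1, 0x0011)))
```
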
**OSI Mode Consideration:**

In OSI mode, the OS is responsible for tracking which cores are idle. When the last core
in a cluster issues this ``CPU_SUSPEND`` call with Power Level = 1, the PSCI firmware:

1. Verifies that all other cores in the cluster are already in a low-power state
2. If verified, powers down the entire cluster
3. If not verified (race condition), denies the request with an error code

The State ID field is platform-defined and typically documented in the device tree
``idle-state`` nodes using the ``arm,psci-suspend-param`` property. This mechanism,
leveraging ``cpuidle`` and ``s2idle``, allows the kernel to abstract complex platform-specific
low-power modes into a generic framework. The ``idle-state`` nodes in the Device Tree define these power states,
including their entry/exit latencies and target power consumption, enabling the ``cpuidle`` governor to make informed
decisions about which idle state to enter based on system load and predicted idle duration.

The ``arm,psci-suspend-param`` property then directly maps these idle states to the corresponding PSCI ``power_state`` parameter values that the firmware understands.

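To illustrate how idle-state descriptions feed a selection decision, the sketch below models a few states as Python dictionaries whose keys mirror the ``arm,psci-suspend-param``, ``exit-latency-us``, and ``min-residency-us`` device tree properties. All state names and numbers are hypothetical, and the selection rule is a simplification rather than the behavior of any real ``cpuidle`` governor.

```python
# Hypothetical idle states, ordered from shallowest to deepest.
IDLE_STATES = [
    {"name": "cpu-standby", "psci_param": 0x00000001,
     "exit_latency_us": 50, "min_residency_us": 100},
    {"name": "cpu-powerdown", "psci_param": 0x00010002,
     "exit_latency_us": 500, "min_residency_us": 2000},
    {"name": "system-suspend", "psci_param": 0x02012234,
     "exit_latency_us": 5000, "min_residency_us": 50000},
]


def pick_state(states, predicted_idle_us, latency_limit_us):
    """Return the deepest state that fits the idle time and latency limit."""
    best = None
    for state in states:
        fits_idle = state["min_residency_us"] <= predicted_idle_us
        fits_latency = state["exit_latency_us"] <= latency_limit_us
        if fits_idle and fits_latency:
            best = state  # later entries are deeper, so keep the last match
    return best


choice = pick_state(IDLE_STATES, predicted_idle_us=3000, latency_limit_us=1000)
print(choice["name"], hex(choice["psci_param"]))
```

The ``psci_param`` of the chosen entry is the value that would be handed to ``CPU_SUSPEND`` as ``power_state``.
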
Example: System Suspend (Standard Format)
=========================================

When the OS targets a system-wide suspend state (e.g., Suspend-to-RAM), the ``power_state`` parameter is constructed to target the highest power level.
Consider the example value **0x02012234**:

.. list-table:: Power State Parameter Breakdown (0x02012234)
   :widths: 20 20 20 40
   :header-rows: 1

   * - Field
     - Bits
     - Value
     - Meaning

   * - Reserved
     - [31:26]
     - 0
     - Must be zero

   * - Power Level
     - [25:24]
     - 2
     - System level

   * - Reserved
     - [23:17]
     - 0
     - Must be zero

   * - State Type
     - [16]
     - 1
     - Power Down

   * - State ID
     - [15:0]
     - 0x2234
     - Platform-specific (e.g., "S2RAM")

**Interpretation:**

* **Power Level = 2** tells the firmware that a system-level transition is requested.
* **State Type = 1** indicates a power-down state.
* **State ID = 0x2234** is the platform-specific identifier for this system state.

In the context of **s2idle**, if the OS determines that all constraints are met for system suspension,
the last active CPU (Last Man) will invoke ``CPU_SUSPEND`` with this parameter. The PSCI firmware then
coordinates the final steps to suspend the system (e.g., placing DDR in self-refresh and powering down the SoC).
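
The breakdown of **0x02012234** can be checked mechanically. The sketch below decodes a standard-format value and rejects one with reserved bits set; the function name ``decode_power_state`` and the returned dictionary layout are assumptions for illustration.

```python
RESERVED_MASK = 0xFCFE0000  # bits [31:26] and [23:17]


def decode_power_state(value):
    """Split a standard-format power_state value into its fields."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("power_state is a 32-bit value")
    if value & RESERVED_MASK:
        raise ValueError("reserved bits must be zero")
    return {
        "power_level": (value >> 24) & 0x3,   # bits [25:24]
        "state_type": (value >> 16) & 0x1,    # bit 16
        "state_id": value & 0xFFFF,           # bits [15:0]
    }


fields = decode_power_state(0x02012234)
print(fields["power_level"], fields["state_type"], hex(fields["state_id"]))
```
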

source/linux/Foundational_Components_Power_Management.rst

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ Power Management
 Foundational_Components/Power_Management/pm_rtc_ddr
 Foundational_Components/Power_Management/pm_low_power_modes
 Foundational_Components/Power_Management/pm_am62lx_low_power_modes
+Foundational_Components/Power_Management/pm_psci_s2idle
 Foundational_Components/Power_Management/pm_wakeup_sources
 Foundational_Components/Power_Management/pm_sw_arch
 Foundational_Components/Power_Management/pm_debug
