dp: Fix DP scheduler locking #10327

serhiy-katsyuba-intel · 2025-10-24T15:40:39Z

When at least two DP modules are running, each on a separate core, using irq_lock() may lead to interrupts being disabled for a very long time. When module_is_ready_to_process() always returns true, the DP task runs in a loop continuously, except while it is preempted by higher-priority threads. irq_lock() disables interrupts globally, so using it from multiple cores can produce unbalanced double locks with no unlock in between.

Consider this case: core 1 calls irq_lock(); this does not prevent core 2 from also calling flags = irq_lock(). Core 2's flags now contain the "interrupts disabled" state, because interrupts were already globally disabled by core 1. Then core 1 calls irq_unlock() and interrupts are re-enabled; then core 2 calls irq_unlock(flags) to restore its saved state, which actually disables interrupts again. On the next loop iteration, core 1 calls flags = irq_lock() and also saves the "disabled" state, so from then on interrupts may stay disabled indefinitely while only the two DP threads keep running.
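The interleaving above can be replayed in a small single-threaded model. This is only a sketch under stated assumptions: `irq_enabled`, `model_irq_lock()`, `model_irq_unlock()` and `replay_race()` are hypothetical names invented for illustration, not Zephyr or SOF APIs, and the single shared flag stands in for the globally shared interrupt state the description refers to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one interrupt-enable bit shared by all cores. */
static bool irq_enabled = true;

/* Model of a lock call: return the previous state, then disable. */
static bool model_irq_lock(void)
{
	bool key = irq_enabled;

	irq_enabled = false;
	return key;
}

/* Model of an unlock call: restore the state saved by the matching lock. */
static void model_irq_unlock(bool key)
{
	irq_enabled = key;
}

/* Replays the described interleaving; returns the final interrupt state. */
static bool replay_race(void)
{
	bool key1, key2;

	irq_enabled = true;        /* start from a clean state */

	key1 = model_irq_lock();   /* core 1: saves "enabled" */
	key2 = model_irq_lock();   /* core 2: saves "disabled" */
	model_irq_unlock(key1);    /* core 1: interrupts are enabled again */
	assert(irq_enabled);
	model_irq_unlock(key2);    /* core 2: restores "disabled" */

	key1 = model_irq_lock();   /* core 1, next iteration: saves "disabled" */
	model_irq_unlock(key1);    /* core 1: restores "disabled" as well */

	return irq_enabled;
}
```

In this model `replay_race()` returns false: once both cores hold "disabled" as their saved state, no later lock/unlock pair on either core can turn interrupts back on.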

This fixes a regression in multicore DP tests. The issue is triggered by commit 4225c27, which allows the DP task to run for all of the available time without being triggered by LL on every cycle.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>

softwarecki

I'm glad my solution helped resolve the issue :)

lgirdwood · 2025-10-27T16:58:05Z

@serhiy-katsyuba-intel can you check the internal CI. Thanks !

serhiy-katsyuba-intel · 2025-10-27T17:09:59Z

> @serhiy-katsyuba-intel can you check the internal CI. Thanks !

The CI fails on the test_00_11_enter_d3_with_topology_stress test on NVL FPGA. That test does not use DP, so it should not be directly affected by the changes in this PR. I checked a few neighboring PRs: they all fail the same test. cc: @lrudyX, @tmleman.

lrudyX · 2025-10-28T08:36:58Z

> test_00_11_enter_d3_with_topology_stress

The failure is related to a DUT issue. We are working on solving this problem.

serhiy-katsyuba-intel · 2025-10-30T11:04:36Z

Internal Intel CI is now working and has passed.

serhiy-katsyuba-intel requested review from LaurentiuM1234, dbaluta, marcinszkudlinski and pblaszko as code owners October 24, 2025 15:40

serhiy-katsyuba-intel requested review from abonislawski, softwarecki and tmleman October 24, 2025 15:50

lyakh approved these changes Oct 27, 2025

softwarecki approved these changes Oct 27, 2025

tmleman approved these changes Oct 27, 2025

lgirdwood approved these changes Oct 27, 2025

abonislawski approved these changes Oct 28, 2025

tmleman mentioned this pull request Oct 28, 2025

lib: cpu: Check return value from platform_boot_complete #10326

Merged

abonislawski merged commit fe861a6 into thesofproject:main Oct 30, 2025
39 of 45 checks passed
