Skip to content

IDC: fix a race#10655

Merged
lgirdwood merged 2 commits intothesofproject:mainfrom
lyakh:idc
Mar 27, 2026
Merged

IDC: fix a race#10655
lgirdwood merged 2 commits intothesofproject:mainfrom
lyakh:idc

Conversation

@lyakh
Copy link
Collaborator

@lyakh lyakh commented Mar 26, 2026

fix an IDC race when using P4WQ

Copilot AI review requested due to automatic review settings March 26, 2026 14:43
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes an IDC submission race in the Zephyr/P4WQ-based IDC path by changing how work items are allocated/submitted per target core, and aligns a notifier call site with the codebase’s “local core mask” convention.

Changes:

  • Expand IDC per-target work storage to two slots and add a mutex intended to serialize submissions.
  • Adjust IDC work selection logic to avoid re-submitting a work item that hasn’t been fully released by P4WQ yet.
  • Replace a direct core-bitmask expression with NOTIFIER_TARGET_CORE_LOCAL in the DAI trigger notifier.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/idc/zephyr_idc.c Attempts to prevent P4WQ re-submit race by adding 2 work items/core and a mutex around selection/submission.
src/audio/dai-zephyr.c Uses NOTIFIER_TARGET_CORE_LOCAL instead of BIT(cpu_get_id()) when firing local-core notifier events.
Comments suppressed due to low confidence (1)

src/idc/zephyr_idc.c:156

  • k_mutex_lock(&idc_mutex, ...) is not released on the error path when the target core is down. The early return -EACCES; will leave the mutex locked and can deadlock all subsequent IDC sends. Ensure the mutex is unlocked before returning (or check cpu_is_core_enabled() before taking the mutex).
	k_mutex_lock(&idc_mutex, K_FOREVER);

	if (work->thread)
		/* See comment above the idc_work[] array. */
		work++;

	/* Same priority as the IPC thread which is an EDF task and under Zephyr */
	work->priority = CONFIG_EDF_THREAD_PRIORITY;
	work->deadline = 0;
	work->handler = idc_handler;
	work->sync = mode == IDC_BLOCKING;

	if (!cpu_is_core_enabled(target_cpu)) {
		tr_err(&zephyr_idc_tr, "Core %u is down, cannot sent IDC message", target_cpu);
		return -EACCES;
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

lyakh added 2 commits March 26, 2026 15:59
Under Zephyr SOF IDC uses Zephyr's P4WQ for message passing. In it
k_p4wq_submit() checks that the new work item isn't already on the
queue, and it does that by checking its .thread member. That pointer
is set to NULL in p4wq_loop() after calling the work handler with no
locks held. So, the work handler could've completed handling an IPC,
a response could have been sent and a new IPC could arrive, while
p4wq_loop() still hasn't cleared the .thread pointer. To fix this
use a mutex in idc_send_msg() to protect the work items.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Use NOTIFIER_TARGET_CORE_LOCAL instead of open coding it.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice fix. I have a faint memory we've been looking at this problem as well, but can't find any PR now (I recall there was an attempt to have a pool of Idc_work items. But this seems simpler and does the job.

Copy link
Contributor

@jsarha jsarha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, and also "fixes" the same assert when I try to use my "user" IPC thread to play audio on pipelines that are not on core 0 (it still does not work, as is expected, but does not hit an assert either).

@lgirdwood lgirdwood merged commit a1c65ef into thesofproject:main Mar 27, 2026
42 of 52 checks passed
@lyakh lyakh deleted the idc branch March 27, 2026 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants