ASoC: SOF: ipc4-pcm: Workaround for crashed firmware on system suspend #4780
Most if not all CI PR tests hit a DSP panic on TGL machines with audio+suspend, similar to thesofproject/sof#8721. In all cases we cannot recover after that DSP crash, since the kernel is left in a broken state and the next audio use will fail with firmware errors. I know this is not elegant, but on the cleanup path this is the only place where a DSP panic can break the execution of the cleanup, and we have not once seen similar issues.
sound/soc/sof/ipc4-pcm.c
Outdated
	 * widgets will be correct for the next boot.
	 */
	if (sdev->fw_state != SOF_FW_CRASHED ||
	    !(cmd == SNDRV_PCM_TRIGGER_STOP && state == SOF_IPC4_PIPE_RESET))
Checking only for state == SOF_IPC4_PIPE_RESET is enough for the workaround.
I'll update the patch and the commit message with more details I have gathered since yesterday.
@ujfalusi can we do something like if (state == RESET && ret != -ETIMEDOUT) instead?
If the firmware crashed earlier (during audio playback/capture), then the error is not going to be a timeout but -ENODEV, and we need to ignore that as well.
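To make the point of this reply concrete, here is a minimal standalone sketch (not driver code) that compares two error-handling policies for a reset. All names are hypothetical, and the timeout-only policy is only one possible reading of the suggestion above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pipeline states; not the driver's enum. */
enum sketch_state { SKETCH_RESET, SKETCH_PAUSED, SKETCH_RUNNING };

/* Timeout-only policy: on reset, ignore the error only if the IPC timed out. */
bool sketch_ignore_timeout_only(enum sketch_state state, int ret)
{
	return state == SKETCH_RESET && ret == -ETIMEDOUT;
}

/* Policy described in the reply: on reset, ignore any IPC error. */
bool sketch_ignore_any_on_reset(enum sketch_state state, int ret)
{
	return state == SKETCH_RESET && ret < 0;
}
```

Under the timeout-only policy, a firmware that had already crashed during playback (reported as -ENODEV) would still abort the reset, which is the case the reply says must be covered too.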
|
@ranj063, others could we trigger a DSP panic using some fancy new EDIT, this is what made me think about this: |
e2b5ccd to
34a57bd
Compare
When the system is suspended while audio is active, sof_ipc4_pcm_hw_free() is invoked to reset the pipelines: the DSP is turned off during suspend, and streams will be restarted after resume.

If the firmware crashes while audio is running (or when we reset the stream before suspend), then sof_ipc4_set_multi_pipeline_state() will fail with an IPC error and the state change is interrupted. This causes a misalignment between the kernel and firmware state on the next DSP boot, resulting in errors returned by the firmware for IPC messages and eventually failing the audio resume.

On stream close the errors are ignored, so the kernel state will be corrected on the next DSP boot, that is, the second boot after the DSP panic.

The state parameter of sof_ipc4_trigger_pipelines() is SOF_IPC4_PIPE_RESET if, and only if, it is called from sof_ipc4_pcm_hw_free().

Treat a forced pipeline reset similarly to how we treat a pcm_free: ignore errors when sending the state, so that the kernel's state is consistent with the state the firmware will have after the next boot.

Link: thesofproject/sof#8721
Signed-off-by: Peter Ujfalusi <[email protected]>
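The state misalignment described in the commit message can be modelled with a small standalone sketch. This is not the driver's code: the struct, enum, and function names are hypothetical, and the simulated IPC simply returns -ENODEV once the firmware is marked as crashed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical pipeline states for this model only. */
enum model_state { MODEL_RESET, MODEL_PAUSED, MODEL_RUNNING };

struct model_pipeline {
	enum model_state kernel_state; /* what the kernel believes */
};

/* Simulated IPC: fails with -ENODEV once the firmware has crashed. */
int model_send_state(bool fw_crashed)
{
	return fw_crashed ? -ENODEV : 0;
}

/*
 * Request a state change. With the workaround enabled, an IPC error on a
 * forced reset is ignored so the kernel bookkeeping is still updated.
 */
int model_set_state(struct model_pipeline *p, enum model_state state,
		    bool fw_crashed, bool workaround)
{
	int ret = model_send_state(fw_crashed);

	if (ret < 0 && !(workaround && state == MODEL_RESET))
		return ret; /* state change interrupted, bookkeeping unchanged */

	p->kernel_state = state;
	return 0;
}
```

Without the workaround, a crashed firmware leaves kernel_state at its old value, while a freshly booted firmware would start from a reset state; with the workaround, the model's bookkeeping ends up at MODEL_RESET.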
|
Changes since v1:
|
Just start audio playback ;)
I'd rather not spend time on introducing such an interface, which can be abused on real systems. You can already send arbitrary messages to the DSP which might crash it, so...
That PR was trying to introduce a completely different interface, and if that is 'alive' then the driver is effectively disabled; you cannot test how the driver would cope with a DSP panic at a given time.
Once you have any sort of debug interface I'm pretty sure all security bets are already off.
Fuzzing is supposed to catch most of that. Do you have a specific example? cc: @andyross
Forgot that, sorry.