ASoC: SOF: debug: Introduce debug option to test DSP ops#4726
ASoC: SOF: debug: Introduce debug option to test DSP ops#4726ranj063 wants to merge 1 commit intothesofproject:topic/sof-devfrom
Conversation
Introduce a new debug option to allow testing DSP operations such a boot firware, set power state etc. Add new new kconfig to allow this option for developers. For now only 2 ops are supported for booting a new firmware and setting the DSP power state to D3. To allow passing a new firmware file, modify the signature of the load_firmware op to take the firmware file name as an argument. When the kconfig for DSP op testing is enabled, runtime PM is disabled to allow the DSP state to be solely managed from the userspace and no audio card is registered. Signed-off-by: Ranjani Sridharan <[email protected]>
plbossart
left a comment
There was a problem hiding this comment.
Not following the general direction if I am honest.
| help | ||
| This option is used for testing DSP operations such as load/unload FW, | ||
| set power state, enable/disable core etc. from userspace. | ||
| Say Y if you want to retain DSP context for FW exceptions. |
There was a problem hiding this comment.
that line looks like a copy-paste from above?
There was a problem hiding this comment.
I would mention that this will disable the audio support completely
| /* hereafter all FW boot flows are for PM reasons */ | ||
| sdev->first_boot = false; | ||
|
|
||
| #if !IS_ENABLED(CONFIG_SND_SOC_SOF_DEBUG_DSP_OPS_TEST) |
There was a problem hiding this comment.
can we use if (!IS_ENABLED) instead of #if
|
|
||
| /* | ||
| * test firmware boot by passing the firmware file as the argument: | ||
| * ex: echo boot_firmware,intel/avs/tgl/community/dsp_basefw.bin > dsp_test_op |
There was a problem hiding this comment.
why combine the two parts?
echo intel/avs/tgl/community/dsp_basefw.bin > filename
echo 1 > boot
| return ret; | ||
|
|
||
| /* resume DMA trace */ | ||
| return sof_fw_trace_resume(sdev); |
There was a problem hiding this comment.
this looks like code duplication, I don't follow what it's supposed to do
Do we really want to do this or run something like the python stuff on top of a simplified SOF driver.
We have to be really careful not to open a pandora box of new stuff that we'll have to maintain forever twice.
There was a problem hiding this comment.
@plbossart the idea is to be able to run the python scripts on Linux with minimal effort by using the debugfs option. This is in essence a simplified SOF driver isnt it?
There was a problem hiding this comment.
So is this basically an alternative to https://github.com/thesofproject/sof-diagnostic-driver that don't have to load/unload?
Isn't the sof-diagnostic-driver approach more generic/flexible?
There was a problem hiding this comment.
Well, the question is really "what is the partitioning between scripts and driver"? Do we want to reuse snd_sof_run_firmware and hard-code it in a sequence used by a script, or do we want the scripts to do something that runs the firmware?
If you really want to use scripts, then the driver part should be rather minimal and probably a pass-through to access registers.
There was a problem hiding this comment.
but @plbossart if we want to write scripts to do all the register reads/writes, it would essentially be rewriting the SOF driver. Isn't this a saner approach?
There was a problem hiding this comment.
but @plbossart if we want to write scripts to do all the register reads/writes, it would essentially be rewriting the SOF driver. Isn't this a saner approach?
the question is whether
a) you want to reuse the existing Python scripts used by FW validation or
b) you want to develop NEW python scripts based on a TBD API
It's not clear from this PR what the direction is.
a) is complicated because I don't know what the interface requirements are, but
b) doesn't seem very appealing to me since it'd be a 3rd way of interacting with the DSP.
There was a problem hiding this comment.
@marc-hb this isnt a replacement for the diagnostic driver. This is a way to enable us to run the python scripts for FW testing on Linux with minimal effort.
What other purpose does the sof-diagnostic-driver have?
There was a problem hiding this comment.
@ranj063, it depends which python script we are talking about... One set I have seen is implementing IPC message crafting, sending (yep, with register poking and then polling for completion).
It might be hard to get it to download the firmware because of the DMA configuring, that is true.
There was a problem hiding this comment.
@plbossart @ujfalusi @andyross @marc-hb this PR is NOT about IPC ABI validation but a feature to support debug and compliance/stress testing of kernel + SOF + Zephyr functional IP programming flows with userspace scripting (not userspace IP programming).
We need to test the kernel functional IP code alongside the associated Zephyr IP code together (with the same timing and sequencing) which is not possible using Python to flip bits via mmap(). We already have a Python mmap() CI tool today that verifies IPC ABI compliance only. i.e. not useful to certify IP programming compliance.
Yes we can do a lot of test with aplay/amixer, but these tools are end user applications not designed to strongly validate/stress the kernel + SOF + Zephyr combination.
e.g. we need to be able to stress (yes, low hanging fruit will come first)
- D3 -> D0 and vice versa
- DSP core off/on
- S0ix entry/exit
- FW boot
- IMR save/restore
- topology load/unload
There was a problem hiding this comment.
I will assert that you can do all of the above with sof-test. Just run thousands of cycles of suspend-resume, module load/unload, etc.
| int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev); | ||
| int snd_sof_load_firmware_memcpy(struct snd_sof_dev *sdev); | ||
| int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev, const char *fw_filename); | ||
| int snd_sof_load_firmware_memcpy(struct snd_sof_dev *sdev, const char *fw_filename); |
There was a problem hiding this comment.
can this go to a separate patch?
|
https://github.com/zephyrproject-rtos/zephyr/ already has a pure-Python driver from @andyross for testing purposes: zephyrproject-rtos/zephyr@9073db4 This was the first, very old commit, newer version is now here: |
|
@marc-hb To be fair: cavstool isn't PM-aware, so you generally have to unload the kernel driver in order to use it. But it's been very successful as a paradigm. I even banged out a broad equivalent for bringing up Zephyr on the mt8195 DSP (which has a MUCH simpler host control interface, again to be fair). I don't personally see any reason not to have this in the kernel, it seems like an obvious thing for the driver to provide. I will say though that I'd prefer to see it start from the perspective of "loading arbitrary firmware to the device and getting output" instead of just assuming everything is an audio blob that handles ALSA streams. The SOF unit testing story for DSP code has always been kinda weak, and Zephyr has had to pick up a lot of that, historically. |
ujfalusi
left a comment
There was a problem hiding this comment.
@ranj063, imho if this test mode is activated then we should not boot the 'default' firmware on module load but wait for the firmware path to be loaded and the instruction to boot the DSP with it.
If this mode is selected we should be printing a warning that only firmware testing is allowed via some tbd interface and the kconfig should be more 'alarming' to deter users from selecting it. Currently it is 'why not, it should not do much harm' category.
Note: We sort of have an interface to do simple things via the message injector with the default firmware, we need to prevent rpm to keep the DSP up, but we can send arbitrary messages all while we have the sound card..
| help | ||
| This option is used for testing DSP operations such as load/unload FW, | ||
| set power state, enable/disable core etc. from userspace. | ||
| Say Y if you want to retain DSP context for FW exceptions. |
There was a problem hiding this comment.
I would mention that this will disable the audio support completely
| return ret; | ||
|
|
||
| /* resume DMA trace */ | ||
| return sof_fw_trace_resume(sdev); |
There was a problem hiding this comment.
@ranj063, it depends which python script we are talking about... One set I have seen is implementing IPC message crafting, sending (yep, with register poking and then polling for completion).
It might be hard to get it to download the firmware because of the DMA configuring, that is true.
| } | ||
|
|
||
| /* ops are executed as "op_name,argument1,argument2...". For example, to set the DSP power state | ||
| * to D3: echo "load_firmware,<PATH>/sof-tgl.ri" > dsp_test_op" |
There was a problem hiding this comment.
why not separate files for the firmware path, set_power_state, etc?
How one would guess the valid op_names and what they are set at the moment?
| dfse->type = SOF_DFSENTRY_TYPE_BUF; | ||
| dfse->sdev = sdev; | ||
|
|
||
| debugfs_create_file("dsp_test_op", 0222, sdev->debugfs_root, dfse, &sof_dsp_ops_tester_fops); |
There was a problem hiding this comment.
dsp_test or rather separate files under dsp_test/ directory?
| plat_data->fw_filename); | ||
| if (!fw_filename) | ||
| return -ENOMEM; | ||
| } |
There was a problem hiding this comment.
Hm, freeing up the passed fw_filename is OK?
| return ret; | ||
| } | ||
|
|
||
| #endif |
There was a problem hiding this comment.
Why are we skipping this part?
| #include "ops.h" | ||
|
|
||
| int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev) | ||
| int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev, const char *fw_filename) |
There was a problem hiding this comment.
you should not neglect the snd_sof_load_firmware_memcpy()
| If unsure, select "N". | ||
|
|
||
| config SND_SOC_SOF_DEBUG_DSP_OPS_TEST | ||
| bool "SOF enable DSP ops testing" |
There was a problem hiding this comment.
What does constitute a "DSP ops"? Isn't this something like "DSP test mode"? Should we have a sof_debug flag for this to be enabled/disabled instead of re-building the kernel?
| } | ||
| #endif | ||
|
|
||
| ret = sof_register_clients(sdev); |
There was a problem hiding this comment.
Why are we loading clients in DSP test mode? They might interfere with the testing, no?
|
|
||
| fw_trace_err: | ||
| #endif | ||
| sof_fw_trace_free(sdev); |
There was a problem hiding this comment.
We also enable ftrace? That will send messages, again, might interfere with testing?
|
The ability to make the DSP panic after some time parameter would also be useful to test recovery features like these: |
The --skip-to-first-trace option was added as a precaution to exclude
log pollution from before the start of a test but how much pollution
it excludes was never assessed; we just don't know.
Hopefully not much? Replace, see below.
mtrace timestamps used to return to zero on every DSP boot, which --skip-to-first-trace
depends on. After SOF commit 76b0979c6453 ("sync time between host and fw for logging"),
mtrace timestamps still return to zero (briefly) but only on "cold" DSP boots,
not on when booting from IMR. Booting from IMR has now been enabled by default
in most configurations. That combination breaks --skip-to-first-trace.
The only practical short-term solution is to turn off --skip-to-first-trace and take
the risk of old logs polluting the test results, which is what this commit does.
If results pollution proves to be a problem, we could in the longer term try to disable
IMR boot just before running this test using sof_debug as discussed in thesofproject#1153
This would also measure from a possibly cleaner DSP state. New, DSP "debug ops"
being proposed in thesofproject/linux#4726 might also help.
Fixes: thesofproject#1153
Signed-off-by: Baofeng Tian <[email protected]>
Signed-off-by: Chao Song <[email protected]>
The --skip-to-first-trace option was added as a precaution to exclude
log pollution from before the start of a test, normally the residue traces are
related to trigger stop IPC, dma put and component reset (as you can see).
There could be 10 ~ 20 lines.
mtrace timestamps used to return to zero on every DSP boot, which --skip-to-first-trace
depends on. After SOF commit 76b0979c6453 ("sync time between host and fw for logging"),
mtrace timestamps still return to zero (briefly) but only on "cold" DSP boots,
not on when booting from IMR. Booting from IMR has now been enabled by default
in most configurations. That combination breaks --skip-to-first-trace.
The only practical short-term solution is to turn off --skip-to-first-trace and take
the risk of old logs polluting the test results, which is what this commit does.
If results pollution proves to be a problem, we could in the longer term try to disable
IMR boot just before running this test using sof_debug as discussed in thesofproject#1153
This would also measure from a possibly cleaner DSP state. New, DSP "debug ops"
being proposed in thesofproject/linux#4726 might also help.
Fixes: thesofproject#1153
Signed-off-by: Baofeng Tian <[email protected]>
Signed-off-by: Chao Song <[email protected]>
It is not a scope of this PR. |
Understood but would it be a useful stepping stone? Looks like one. |
To be honest I would rather not do such a thing. |
The --skip-to-first-trace option was added as a precaution to exclude
log pollution from before the start of a test, normally the residue traces are
related to trigger stop IPC, dma put and component reset (as you can see).
There could be 10 ~ 20 lines.
mtrace timestamps used to return to zero on every DSP boot, which --skip-to-first-trace
depends on. After SOF commit 76b0979c6453 ("sync time between host and fw for logging"),
mtrace timestamps still return to zero (briefly) but only on "cold" DSP boots,
not on when booting from IMR. Booting from IMR has now been enabled by default
in most configurations. That combination breaks --skip-to-first-trace.
The only practical short-term solution is to turn off --skip-to-first-trace and take
the risk of old logs polluting the test results, which is what this commit does.
If results pollution proves to be a problem, we could in the longer term try to disable
IMR boot just before running this test using sof_debug as discussed in #1153
This would also measure from a possibly cleaner DSP state. New, DSP "debug ops"
being proposed in thesofproject/linux#4726 might also help.
Fixes: #1153
Signed-off-by: Baofeng Tian <[email protected]>
Signed-off-by: Chao Song <[email protected]>
|
I'll close this for now since it's not an active development. We can always re-open this later. |
Introduce a new debug option to allow testing DSP operations such a boot firware, set power state etc. Add new new kconfig to allow this option for developers. For now only 2 ops are supported for booting a new firmware and setting the DSP power state to D3.
To allow passing a new firmware file, modify the signature of the load_firmware op to take the firmware file name as an argument. When the kconfig for DSP op testing is enabled, runtime PM is disabled to allow the DSP state to be solely managed from the userspace and no audio card is registered.